HTTP/3 negative cache never expires: one transient QUIC failure permanently disables HTTP/3 for a resolver until restart #3258
Closed
Omoeba started this conversation in Potential issues
Should be fixed. Thanks to the two of you!
I was running dnscrypt-proxy 2.1.16 with `http3 = true` and noticed that after a while, the server would always fall back to h2. I noticed the issue did not occur in dnscrypt-proxy 2.1.8 (the version in Debian trixie's apt repo), and did some digging to find the cause.

When an HTTP/3 request to a DoH server fails, the server is added to the Alt-Svc negative cache with port 0 and is never retried over HTTP/3 again for the lifetime of the process. A single transient QUIC failure (packet loss, brief UDP/443 interference, a momentary blip) permanently downgrades that resolver to HTTP/2 until dnscrypt-proxy is restarted. SIGHUP / hot reload does not clear it.
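The behavior described above can be modeled in a few lines of Go. This is only a toy sketch for illustration, not the dnscrypt-proxy source: the type `altSupportCache` and the methods `recordAltSvc`, `recordH3Failure`, and `useH3` are hypothetical names, and `doh.example` is a placeholder host. The only details taken from the report are that a failure stores port 0, that entries never expire, and that an existing entry blocks re-parsing of Alt-Svc.

```go
package main

import "fmt"

// altSupportCache is a hypothetical stand-in for the Alt-Svc cache.
// Port 0 is the negative marker; a positive value is an advertised h3 port.
// Entries carry no timestamp, so nothing can ever expire.
type altSupportCache struct {
	ports map[string]uint16
}

func newAltSupportCache() *altSupportCache {
	return &altSupportCache{ports: make(map[string]uint16)}
}

// recordAltSvc stores an advertised port only when the host has no entry yet,
// modeling a re-parse that is skipped once any entry exists.
func (c *altSupportCache) recordAltSvc(host string, port uint16) {
	if _, known := c.ports[host]; known {
		return
	}
	c.ports[host] = port
}

// recordH3Failure pins the host to port 0 after a failed HTTP/3 request.
func (c *altSupportCache) recordH3Failure(host string) {
	c.ports[host] = 0
}

// useH3 models a "port > 0" guard on the HTTP/3 path.
func (c *altSupportCache) useH3(host string) bool {
	port, known := c.ports[host]
	return known && port > 0
}

func main() {
	cache := newAltSupportCache()
	host := "doh.example" // placeholder resolver name

	cache.recordAltSvc(host, 443)
	fmt.Println(cache.useH3(host)) // true

	cache.recordH3Failure(host)   // one transient QUIC failure
	cache.recordAltSvc(host, 443) // ignored: an entry already exists
	fmt.Println(cache.useH3(host)) // false
}
```

In this model, nothing short of rebuilding the map brings the host back to HTTP/3, which matches the "until restart" symptom.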
Origin
Introduced with `http3_probe` in commit 5d41d10, first shipped in 2.1.9 and present unchanged through 2.1.16. 2.1.8 and earlier are unaffected, as they had no failure-triggered negative cache.

Mechanism
- On an HTTP/3 request failure, `altSupport.cache[host]` is set to 0. This write is not gated on `http3_probe`; only the surrounding debug-log wording is.
- `altSupport` entries have no TTL and no eviction. The map is created once in `NewXTransport()` at startup and is never reset.
- HTTP/3 is attempted only when the cached port is nonzero (`altPort > 0` guard), so a pinned host is never tried over h3 again.
- For a pinned host, `hasAltSupport` is true, so the post-response Alt-Svc re-parse block (guarded by `!hasAltSupport`) is never re-entered for that host. There is no recovery path.
- A SIGHUP / hot reload keeps the existing `xTransport`, so a reload does not clear the cache. Only a full restart does.

Scope
This affects the default Alt-Svc path (`http3 = true`, `http3_probe = false`), not just `http3_probe`, and has since 2.1.9. A server that legitimately supports HTTP/3 is abandoned permanently after one transient failure, because the cache cannot distinguish a transient failure from genuine lack of support.

Reproduction
Set `http3 = true` and `log_level = 0`. Use a DoH resolver that advertises h3 via Alt-Svc.