RFC 7871 – Client Subnet in DNS Queries – defines a mechanism for recursive resolvers like Google Public DNS to send partial client IP address information to authoritative DNS name servers. Content Delivery Networks (CDNs) and latency-sensitive services use this to give accurate geo-located responses when responding to name lookups coming through public DNS resolvers.
The RFC describes ECS features that authoritative name servers must implement, but implementers don’t always follow those requirements. There are also ECS operational and deployment issues that the RFC does not address. These can cause problems for resolvers like Google Public DNS that auto-detect ECS support in authoritative name servers, as well as for resolvers that require ECS whitelisting, like OpenDNS.
These guidelines are intended to help authoritative DNS implementers and operators avoid many common mistakes that can cause problems for ECS.
Definitions of Terms
We use the following terms to describe ECS operations:
A name server implements (or supports) ECS if it replies to ECS queries with ECS responses that have matching ECS options (even if the ECS options always have a global /0 scope prefix length).
A zone is ECS-enabled if ECS queries to its name servers sent with a non-zero source prefix receive ECS responses with a non-zero scope.
Guidelines for Authoritative Name Servers
All authoritative name servers for an ECS-enabled zone must enable ECS for the zone.
- Even if only one name server does not implement ECS or enable it for the zone, it quickly becomes the source of most cached data. Because its responses have global scope they are used (until their TTL expires) as the response to all queries for the same name (regardless of client subnet). Responses from servers that do implement ECS and enable it are only used for queries from clients within the specific scope, so they are much less likely to be used than the global scope responses.
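The effect described above can be illustrated with a toy cache. This is a minimal sketch, not how any real resolver is implemented: `EcsCache`, `simulate`, `ecs_server`, and `plain_server` are hypothetical names, and the resolver is assumed to rotate through the zone's name servers on each cache miss.

```python
import ipaddress


class EcsCache:
    """Toy ECS-aware cache for a single name: a list of (scope network, answer)."""

    def __init__(self):
        self.entries = []

    def lookup(self, client_ip):
        addr = ipaddress.ip_address(client_ip)
        for network, answer in self.entries:
            if addr in network:
                return answer
        return None

    def store(self, client_ip, scope_len, answer):
        # A /0 scope covers every client; a longer scope covers one subnet only.
        network = ipaddress.ip_network(f"{client_ip}/{scope_len}", strict=False)
        self.entries.append((network, answer))


def ecs_server(client_ip):
    """Hypothetical server with ECS enabled: tailors answers per /24."""
    net = ipaddress.ip_network(f"{client_ip}/24", strict=False)
    return 24, f"tailored for {net}"


def plain_server(client_ip):
    """Hypothetical server without ECS enabled: every answer has global scope."""
    return 0, "global"


def simulate(clients, servers):
    """Resolve each client in turn, rotating through servers on cache misses."""
    cache = EcsCache()
    results = []
    misses = 0
    for client in clients:
        answer = cache.lookup(client)
        if answer is None:
            scope_len, answer = servers[misses % len(servers)](client)
            misses += 1
            cache.store(client, scope_len, answer)
        results.append(answer)
    return results


clients = ["192.0.2.1", "198.51.100.1", "203.0.113.1", "192.0.2.9"]
for client, answer in zip(clients, simulate(clients, [ecs_server, plain_server])):
    print(client, "->", answer)
```

In this run, the single global-scope answer is reused for every later client outside the one cached /24, even though the other server would have tailored a response for them.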
Authoritative name servers that implement ECS MUST [2] send ECS responses to ECS queries for all zones served from an IP address or NS hostname, even for zones that are not ECS-enabled.
- Google Public DNS auto-detects ECS support by IP address rather than name server hostname or DNS zone because the number of addresses is smaller than the number of name server hostnames and much smaller than the number of DNS zones. If an authoritative name server does not always send ECS responses to ECS queries (even for zones that are not ECS-enabled), Google Public DNS may stop sending it ECS queries.
Authoritative name servers that implement ECS must respond to all ECS queries with ECS responses, including negative and referral responses.
The same issues about auto-detection of ECS support apply here too.
Negative responses (NXDOMAIN and NODATA) SHOULD [3] use global /0 scope for better caching and compatibility with RFC 7871.
Besides NXDOMAIN and NODATA (NOERROR with empty answer section), other error responses to ECS queries (particularly SERVFAIL and REFUSED) should include a matching ECS option with global /0 scope.
If an authoritative name server is attempting to shed load from a DoS attack, it can return a SERVFAIL without ECS data; doing this repeatedly causes Google Public DNS to stop sending queries with ECS (which may reduce the number of legitimate queries it sends, but would not affect random subdomain attack queries). Reducing legitimate query load during a DoS attack may or may not improve the success rate for legitimate queries (although responses can be served from cache for all clients).
A more effective load-shedding approach is to send all responses with global /0 scope so that Google Public DNS continues to send ECS queries. This lets Google Public DNS return geo-located responses much sooner after the attack stops, as it does not need to re-detect ECS support, just to re-query once the global scope response TTLs expire.
Referral (delegation) responses must also have matching ECS data and SHOULD [4] use a global /0 scope. Note that delegation responses are never forwarded to the clients whose addresses appear in ECS data, so any geo-located NS, A, or AAAA records should be selected by the resolver's client IP address, not ECS data.
Authoritative name servers that implement ECS must include a matching ECS option in responses to all query types received with an ECS option. It is not enough to include ECS data only in responses to IPv4 address (A) queries; responses to A, AAAA, PTR, MX, or any other query type must have matching ECS data, or resolvers may drop the response as possibly forged and Google Public DNS may stop sending queries with ECS data.
In particular, ECS responses to SOA, NS, and DS queries should always use global /0 scope for better caching and a consistent view of delegation (geo-located responses to A/AAAA queries for name server hostnames are OK). Responses to any query type (e.g., TXT or PTR) that do not change based on ECS data should not use a scope equal to the source prefix length; they should use a global /0 scope for better caching and reduced query load.
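The scope recommendations in this section can be collected into one decision function. The following is a minimal sketch: `response_scope` and its parameters are hypothetical names, and `tailored_len` is assumed to be the prefix length the server actually used to choose the answer (`None` when the answer does not depend on the client subnet).

```python
# Query types whose responses should always carry a global /0 scope.
GLOBAL_SCOPE_QTYPES = {"SOA", "NS", "DS"}

# Negative and error outcomes that should carry a global /0 scope.
GLOBAL_SCOPE_OUTCOMES = {"NXDOMAIN", "NODATA", "SERVFAIL", "REFUSED"}


def response_scope(qtype, outcome="NOERROR", tailored_len=None, is_referral=False):
    """Return the SCOPE PREFIX-LENGTH to place in the ECS response option."""
    if is_referral or outcome in GLOBAL_SCOPE_OUTCOMES:
        return 0
    if qtype in GLOBAL_SCOPE_QTYPES:
        return 0
    if tailored_len is None:
        # The answer is the same for every client, so let resolvers cache it globally.
        return 0
    return tailored_len


print(response_scope("A", tailored_len=24))                   # geo-located answer
print(response_scope("TXT"))                                  # not tailored
print(response_scope("A", outcome="NXDOMAIN", tailored_len=24))
```

In every branch the server still sends an ECS option; only the scope value changes.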
Authoritative name servers returning ECS-enabled CNAME responses SHOULD [5] only include the first CNAME in the chain, and the final target of the CNAME chain should be ECS-enabled to the same scope prefix length. Because of ambiguity in the ECS specification, some recursive resolvers (notably Unbound [6]) may return a response with the scope of the final non-CNAME domain (/0 if it is not ECS-enabled).
ECS data may contain IPv6 addresses even for IPv4-only name servers (and vice-versa, although IPv6-only name servers are rare).
Name servers need to respond with valid ECS option data (/0 scope is OK, but source address and prefix length must match).
ECS for a zone can be enabled separately for IPv4 and IPv6 addresses.
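To make "valid ECS option data" concrete, the sketch below encodes and compares ECS options using the RFC 7871 wire layout (option code 8, then FAMILY, SOURCE PREFIX-LENGTH, SCOPE PREFIX-LENGTH, and the address truncated to whole bytes). The function names `encode_ecs`, `decode_ecs`, and `ecs_matches` are illustrative, not part of any library, and the code handles only a single already-extracted option.

```python
import ipaddress
import struct

ECS_OPTION_CODE = 8                 # EDNS0 option code for Client Subnet
FAMILY_BY_VERSION = {4: 1, 6: 2}    # address family numbers: 1 = IPv4, 2 = IPv6


def encode_ecs(subnet, scope_len=0):
    """Encode an ECS option for `subnet` (e.g. "192.0.2.0/24")."""
    net = ipaddress.ip_network(subnet, strict=False)
    source_len = net.prefixlen
    # Only the bytes needed to cover the source prefix are sent.
    address = net.network_address.packed[: (source_len + 7) // 8]
    body = struct.pack("!HBB", FAMILY_BY_VERSION[net.version], source_len, scope_len)
    body += address
    return struct.pack("!HH", ECS_OPTION_CODE, len(body)) + body


def decode_ecs(option):
    """Return (family, source_len, scope_len, address_bytes)."""
    if len(option) < 8:
        raise ValueError("ECS option too short")
    code, length = struct.unpack("!HH", option[:4])
    if code != ECS_OPTION_CODE or length != len(option) - 4:
        raise ValueError("not a well-formed ECS option")
    family, source_len, scope_len = struct.unpack("!HBB", option[4:8])
    return family, source_len, scope_len, option[8:]


def ecs_matches(query_option, response_option):
    """True if FAMILY, SOURCE PREFIX-LENGTH, and ADDRESS are echoed unchanged."""
    qf, qs, _, qa = decode_ecs(query_option)
    rf, rs, _, ra = decode_ecs(response_option)
    return (qf, qs, qa) == (rf, rs, ra)


# An IPv4-only server can still receive, and must echo, an IPv6 client subnet.
query = encode_ecs("2001:db8:1200::/56")
response = encode_ecs("2001:db8:1200::/56", scope_len=0)
print(ecs_matches(query, response))
```

The scope is deliberately excluded from the comparison: a /0 scope in the response is valid as long as the other three fields match the query.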
Authoritative name servers returning ECS-enabled responses MUST NOT [7] overlap scope prefixes in their answers. An example of overlapping scope prefixes would be the following:
- Query with source prefix 198.18.0.0/15: response A with scope prefix
- Query with source prefix 198.51.100/24: response B with scope prefix
If a client queries an ECS-enabled recursive resolver in the order above, both queries may get response A, because the scope of the cached response A includes the second query’s prefix scope. Even if the client queries are made in the opposite order, and both queries are forwarded to authoritative name servers, cached responses may expire at different times; subsequent queries to the recursive resolver in the overlapping prefix 198.51.100/24 could get either response A or B.
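An operator could check a zone's configured scope prefixes for overlaps before loading them. This is a minimal sketch using Python's standard `ipaddress` module; `find_overlaps` is a hypothetical helper, and the prefixes are the ones from the RFC 7871 deaggregation example quoted in the footnotes.

```python
import ipaddress
from itertools import combinations


def find_overlaps(scopes):
    """Return every pair of scope prefixes where one contains the other."""
    nets = [ipaddress.ip_network(s) for s in scopes]
    return [
        (str(a), str(b))
        for a, b in combinations(nets, 2)
        # IPv4 and IPv6 prefixes never overlap, so compare within a family only.
        if a.version == b.version and a.overlaps(b)
    ]


# A broad /20 with a /24 exception inside it overlaps:
print(find_overlaps(["1.2.0.0/20", "1.2.3.0/24"]))

# Disjoint prefixes do not:
print(find_overlaps(["1.2.0.0/23", "1.2.2.0/24", "1.2.3.0/24"]))
```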
When implementing ECS support for the first time on name servers, use new IP addresses for name servers serving these ECS enabled zones.
When authoritative name servers that implemented ECS but returned global scope results start returning ECS enabled answers for a zone, Google Public DNS starts returning geo-located responses to queries as soon as the TTLs of previous global scope responses expire.
Google Public DNS auto-detection of ECS support very rarely tries ECS queries for an IP address (or name server hostname) when it has auto-detected lack of ECS support (timeouts, returning FORMERR, BADVERS, or sending non-ECS responses). New ECS implementations on those IP addresses (or NS hostnames) are auto-detected very slowly, or not at all.
Make sure that network connections are reliable and that any response rate limiting is set sufficiently high that name servers do not drop queries (or worse, respond with errors lacking a matching ECS option).
- For name servers implementing response rate limiting on ECS queries, the best response is NODATA with the truncation (TC) flag set, containing only a matching question section and a matching ECS option.
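As an illustration of the rate-limited response shape, the sketch below builds only the 12-byte DNS header for such a reply. The flag bit positions follow the standard DNS header layout; `truncated_header` is a hypothetical helper, and the question section and the OPT record carrying the echoed ECS option are assumed to be appended by the caller.

```python
import struct

# DNS header flag bits.
QR = 0x8000  # this message is a response
TC = 0x0200  # truncated: the resolver should retry over TCP
RD = 0x0100  # recursion desired, copied from the query


def truncated_header(query_id, query_flags):
    """Header for a NOERROR/NODATA reply with TC set and an OPT record attached."""
    flags = QR | TC | (query_flags & RD)  # RCODE bits stay 0 (NOERROR)
    # Counts: 1 question echoed, 0 answers, 0 authority records,
    # 1 additional record (the OPT record holding the matching ECS option).
    return struct.pack("!6H", query_id, flags, 1, 0, 0, 1)


header = truncated_header(0x1234, 0x0100)
print(header.hex())
```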
Send timely responses to all queries (ideally within 1 second).
- Using online Geo-IP lookup services for ECS queries won't work reliably, as the cumulative latency of the DNS query and online Geo-IP service is unlikely to be within one second. Google Public DNS auto-detection of ECS support considers delayed responses an indication of poor or incomplete ECS support, and reduces the likelihood that future queries are sent with ECS. If enough responses are delayed, it stops sending ECS queries.
RFC 7871 references and other footnotes
[1] FAMILY, SOURCE PREFIX-LENGTH, and ADDRESS in the response MUST match those in the query. Echoing back these values helps to mitigate certain attack vectors, as described in
[2] An Authoritative Nameserver that implements this protocol and receives an ECS option MUST include an ECS option in its response to indicate that it SHOULD be cached accordingly, regardless of whether the client information was needed to formulate an answer.
[3] It is RECOMMENDED that no specific behavior regarding negative answers be relied upon, but that Authoritative Nameservers should conservatively expect that Intermediate Nameservers will treat all negative answers as /0; therefore, they SHOULD set SCOPE PREFIX-LENGTH accordingly.
[4] The delegations case is a bit easier to tease out. In operational practice, if an authoritative server is using address information to provide customized delegations, it is the resolver that will be using the answer for its next iterative query. Addresses in the Additional section SHOULD therefore ignore ECS data, and the Authoritative Nameserver SHOULD return a zero SCOPE PREFIX-LENGTH on delegations.
[5] For the specific case of a Canonical Name (CNAME) chain, the Authoritative Nameserver SHOULD only place the initial CNAME record in the Answer section, to have it cached unambiguously and appropriately. Most modern Recursive Resolvers restart the query with the CNAME, so the remainder of the chain is typically ignored anyway.
[6] Using the scope of the final domain in a CNAME chain is harmless in Unbound, since it is usually deployed as a local stub or forwarding resolver, where all clients are in the same subnet and would get the same response.
[7] Authoritative Nameservers might have situations where one Tailored Response is appropriate for a relatively broad address range, such as an IPv4 /20, except for some exceptions, such as a few /24 ranges within that /20. Because it can't be guaranteed that queries for all longer prefix lengths would arrive before one that would be answered by the shorter prefix length, an Authoritative Nameserver MUST NOT overlap prefixes.
When the Authoritative Nameserver has a longer prefix length Tailored Response within a shorter prefix length Tailored Response, then implementations can either:
- Deaggregate the shorter prefix response into multiple longer prefix responses, or
- Alert the operator that the order of queries will determine which answers get cached, and either warn and continue or treat this as an error and refuse to load the configuration.
When deaggregating to correct the overlap, prefix lengths should be optimized to use the minimum necessary to cover the address space, in order to reduce the overhead that results from having multiple copies of the same answer. As a trivial example, if the Tailored Response for 1.2.0/20 is A but there is one exception of 1.2.3/24 for B, then the Authoritative Nameserver would need to provide Tailored Responses for 1.2.0/23, 1.2.2/24, 1.2.4/22, and 1.2.8/21 all pointing to A, and 1.2.3/24 to B.
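The deaggregation in this example can be reproduced with Python's standard `ipaddress` module, whose `address_exclude` method splits a network into the minimal set of subnets that leave out a given subnet. The `deaggregate` helper is a hypothetical sketch that assumes each exception lies within the broad prefix and that exceptions do not overlap one another.

```python
import ipaddress


def deaggregate(broad, exceptions):
    """Split `broad` into the fewest prefixes that avoid every exception."""
    remaining = [ipaddress.ip_network(broad)]
    for exc in map(ipaddress.ip_network, exceptions):
        next_remaining = []
        for net in remaining:
            if exc.subnet_of(net):
                # Replace this prefix with the pieces that do not contain exc.
                next_remaining.extend(net.address_exclude(exc))
            else:
                next_remaining.append(net)
        remaining = next_remaining
    return sorted(remaining)


for net in deaggregate("1.2.0.0/20", ["1.2.3.0/24"]):
    print(net, "-> A")
print("1.2.3.0/24 -> B")
```

For the /20 with a single /24 exception, this yields 1.2.0.0/23, 1.2.2.0/24, 1.2.4.0/22, and 1.2.8.0/21 for response A, matching the prefixes listed above.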