Network Working Group                                           A. Kumar
Request for Comments: 1536                                     J. Postel
Category: Informational                                        C. Neuman

         Common DNS Implementation Errors and Suggested Fixes

Status of this Memo

   This memo provides information for the Internet community.  It does
   not specify an Internet standard.  Distribution of this memo is
   unlimited.

Abstract

   This memo describes common errors seen in DNS implementations and
   suggests some fixes.  Where applicable, violations of recommendations
   from STD 13, RFC 1034 and STD 13, RFC 1035 are mentioned.  The memo
   also describes, where relevant, the algorithms followed in BIND
   (versions 4.8.3 and 4.9, which the authors referred to) to serve as
   an example.

Introduction

   The last few years have seen a virtual explosion of DNS traffic on
   the NSFnet backbone.  Various DNS implementations and various
   versions of these implementations interact with each other,
   producing huge amounts of unnecessary traffic.  Researchers all over
   the Internet are attempting to document the nature of these
   interactions and the symptomatic traffic patterns, and to devise
   remedies for the sick pieces of software.

   This draft is an attempt to document fixes for known DNS problems so
   people know what problems to watch out for and how to repair broken
   software.

1. Fast Retransmissions

   DNS implements the classic request-response scheme of client-server
   interaction.  UDP is, therefore, the chosen protocol for
   communication, though TCP is used for zone transfers.  The onus of
   requerying, in case no response is seen in a "reasonable" period of
   time, lies with the client.  Although RFC 1034 and 1035 do not
   recommend any

Kumar, Postel, Neuman, Danzig & Miller                          [Page 1]

RFC 1536        Common DNS Implementation Errors            October 1993

   retransmission policy, RFC 1035 does recommend that resolvers cycle
   through a list of servers.  Both name servers and stub resolvers
   should, therefore, implement some kind of retransmission policy
   based on round-trip time estimates of the name servers.  The client
   should back off exponentially, probably to a maximum timeout value.

   However, clients might not implement either of the two.  They might
   not wait a sufficient amount of time before retransmitting, or they
   might not back off their inter-query times sufficiently.

   Thus, what the server sees is a series of queries from the same
   querying entity, spaced very close together.  Of course, a correctly
   implemented server discards all duplicate queries, but the queries
   contribute to wide-area traffic nevertheless.

   We classify a retransmission of a query as a pure Fast retry timeout
   problem when a series of query packets meets the following
   conditions.

      a. Query packets are seen within a time less than a "reasonable
         waiting period" of each other.

      b. No response to the original query was seen, i.e., we see two
         or more queries, back to back.

      c. The query packets share the same query identifier.

      d. The server eventually responds to the query.
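
   The four conditions above can be checked mechanically against a
   packet trace.  The following is a minimal sketch, not taken from any
   implementation: the Packet record, the function name and the
   4-second "reasonable waiting period" are assumptions made for
   illustration, and the trace is assumed to hold only the packets
   exchanged between one client and one server for one question.

```python
from dataclasses import dataclass


@dataclass
class Packet:
    time: float        # seconds since the start of the trace
    query_id: int      # identifier from the DNS header
    is_response: bool  # False for a query, True for a response


def is_fast_retry(packets, reasonable_wait=4.0):
    """Return True if the trace matches conditions a-d for a pure
    fast retry problem. reasonable_wait is an assumed threshold."""
    # Collect the queries sent before the first response appears.
    queries = []
    for packet in packets:
        if packet.is_response:
            break
        queries.append(packet)

    saw_response = len(queries) < len(packets)
    # b: two or more queries back to back; d: the server does respond.
    if len(queries) < 2 or not saw_response:
        return False
    # c: every retransmission carries the same query identifier.
    if len({q.query_id for q in queries}) != 1:
        return False
    # a: consecutive queries are closer together than the waiting period.
    gaps = [b.time - a.time for a, b in zip(queries, queries[1:])]
    return all(gap < reasonable_wait for gap in gaps)
```

   A trace with three queries half a second apart, all with the same
   identifier and followed by a response, is classified as a fast
   retry; the same trace with the queries five seconds apart is not.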

A GOOD IMPLEMENTATION:

   BIND (we looked at versions 4.8.3 and 4.9) implements a good
   retransmission algorithm which solves or limits all of these
   problems.  The Berkeley stub-resolver queries servers at an interval
   that starts at the greater of 4 seconds and 5 seconds divided by the
   number of servers the resolver queries.  The resolver cycles through
   servers and, at the end of a cycle, backs off the time-out
   exponentially.
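
   The stub-resolver schedule described above might be sketched as
   follows.  This is an illustration only and is not BIND code: the
   function name is hypothetical, and the doubling factor is an
   assumption, since the text says only that the time-out is backed off
   exponentially.

```python
def stub_query_plan(servers, cycles, base=5.0, floor=4.0, factor=2.0):
    """Return the (server, timeout) pairs a stub resolver would use.

    The first cycle uses the greater of `floor` seconds and `base`
    seconds divided by the number of servers. After each full cycle
    through the server list the timeout is multiplied by `factor`
    (assumed to be 2 here).
    """
    timeout = max(floor, base / len(servers))
    plan = []
    for _ in range(cycles):
        for server in servers:
            plan.append((server, timeout))
        timeout *= factor  # back off once per cycle, not per query
    return plan
```

   With two servers, 5 seconds divided by 2 is below the 4-second
   floor, so the first cycle uses 4 seconds per query and the second
   cycle uses 8 seconds.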

   The Berkeley full-service resolver (built into the program "named")
   starts with a time-out equal to the greater of 4 seconds and two
   times the round-trip time estimate of the server.  The time-out is
   backed off exponentially with each cycle, up to a ceiling value.

FIXES:

      a. Estimate round-trip times or set a reasonably high initial
         time-out.

      b. Back-off timeout periods exponentially.

      c. Yet another fundamental though difficult fix is to send the
         client an acknowledgement of a query, with a round-trip time
         estimate.

   Since UDP is used, the client expects no response until the query is
   complete.  Thus, it is less likely to have information about
   previous packets on which to base its back-off time, unless it
   maintains state across queries so that subsequent queries to the
   same server use information from previous queries.  Unfortunately,
   such estimates are likely to be inaccurate for chained requests,
   since the variance is likely to be high.

   The fix chosen in the ARDP library used by Prospero is that the
   server sends an initial acknowledgement to the client in those
   cases where the server expects the query to take a long time (as
   might be the case for chained queries).  This initial
   acknowledgement can include an expected time to wait before
   retrying.

   This fix is more difficult since it requires that the client
   software also be trained to expect the acknowledgement packet.
   This, in an internet of millions of hosts, is at best a hard
   problem.

2. Recursion Bugs

   When a server receives a client request, it first looks up its zone
   data and the cache to check if the query can be answered.  If the
   answer is unavailable in either place, the server seeks names of
   servers that are more likely to have the information, in its cache
   or zone data.  It then does one of two things.  If the client
   desires the server to recurse and the server architecture allows
   recursion, the server chains this request to these known servers
   closest to the queried name.  If the client doesn't seek recursion
   or if the server cannot handle recursion, it returns the list of
   name servers to the client, assuming the client knows what to do
   with these records.

   The client queries this new list of name servers to get either the
   answer, or names of another set of name servers to query.  This
   process repeats until the client is satisfied.  Servers might also
   go through this chaining process if the server returns a CNAME
   record for the queried name.  Some servers reprocess this name to
   try and get the desired record type.

   However, in certain cases, this chain of events may not be good.
   For example, a broken or malicious name server might list itself as
   one of the name servers to query again.  The unsuspecting client
   resends the same query to the same server.

   In another situation, more difficult to detect, a set of servers
   might form a loop wherein A refers to B and B refers to A.  This
   loop might involve more than two servers.

   Yet another error is where the client does not know how to process
   the list of name servers returned, and requeries the same server
   since that is one of the few servers it knows.

   We, therefore, classify recursion bugs into three distinct
   categories:

      a. Ignored referral: Client did not know how to handle NS records
         in the AUTHORITY section.

      b. Too many referrals: Client called on a server too many times,
         beyond a "reasonable" number, with the same query.  This is
         different from a Fast retransmission problem and a Server
         Failure detection problem in that a response is seen for
         every query.  Also, the identifiers are always different.  It
         implies the client is in a loop and should have detected that
         and broken it.  (RFC 1035 mentions that a client should not
         recurse beyond a certain depth.)

      c. Malicious Server: a server refers to itself in the authority
         section.  If a server does not have an answer now, it is very
         unlikely it will be any better the next time you query it,
         especially when it claims to be authoritative over a domain.

   RFC 1034 warns against such situations, on page 35.

      "Bound the amount of work (packets sent, parallel processes
      started) so that a request can't get into an infinite loop or
      start off a chain reaction of requests or queries with other
      implementations EVEN IF SOMEONE HAS INCORRECTLY CONFIGURED
      SOME DATA."

A GOOD IMPLEMENTATION:

   BIND fixes at least one of these problems.  It places an upper limit
   on the number of recursive queries it will make to answer a
   question.  It chases a maximum of 20 referral links and 8 canonical
   name translations.

FIXES:

      a. Set an upper limit on the number of referral links and CNAME
         links you are willing to chase.

         Note that this is not guaranteed to break only recursion
         loops.  It could, in a rare case, prune off a very long search
         path prematurely.  We know, however, with high probability,
         that if the number of links crosses a certain metric (two
         times the depth of the DNS tree), it is a recursion problem.

      b. Watch out for self-referring servers.  Avoid them whenever
         possible.

      c. Make sure you never pass off an authority NS record with your
         own name on it!

      d. Fix clients to accept iterative answers from servers not built
         to provide recursion.  Such clients should either be happy
         with the non-authoritative answer or be willing to chase the
         referral links themselves.
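
   Fixes a and b can be sketched as a bounded resolution loop.  The
   sketch below is an assumption-laden illustration rather than BIND's
   algorithm: the lookup callable and its ("answer" | "cname" |
   "referral", data) return shape are hypothetical, and only the limits
   of 20 referral links and 8 CNAME translations are taken from the
   text.

```python
MAX_REFERRALS = 20  # referral links chased, per the BIND description
MAX_CNAMES = 8      # canonical name translations, per the same text


class LoopSuspected(Exception):
    """Raised when the work bound is exceeded or a server refers
    only to itself."""


def resolve(name, server, lookup):
    """Chase referrals and CNAMEs with an upper bound on the work.

    `lookup(server, name)` is a hypothetical callable returning one of
    ("answer", value), ("cname", new_name) or ("referral", servers).
    """
    referrals = cnames = 0
    while True:
        kind, data = lookup(server, name)
        if kind == "answer":
            return data
        if kind == "cname":
            cnames += 1
            if cnames > MAX_CNAMES:
                raise LoopSuspected("too many CNAME links")
            name = data
        elif kind == "referral":
            referrals += 1
            if referrals > MAX_REFERRALS:
                raise LoopSuspected("too many referrals")
            # Ignore a server that lists itself in its own referral.
            candidates = [s for s in data if s != server]
            if not candidates:
                raise LoopSuspected("server refers only to itself")
            server = candidates[0]
        else:
            raise ValueError("unknown response kind: %r" % (kind,))
```

   A pair of servers that refer to each other is cut off after 20
   referrals instead of being queried forever, and a self-referring
   server is rejected on its first response.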

3. Zero Answer Bugs:

   Name servers sometimes return an authoritative NOERROR with no
   ANSWER, AUTHORITY or ADDITIONAL records.  This happens when the
   queried name is valid but does not have a record of the desired
   type.  Of course, the server has authority over the domain.

   However, once again, some implementations of resolvers do not
   interpret this kind of response reasonably.  They always expect an
   answer record when they see an authoritative NOERROR.  These
   entities continue to resend their queries, possibly endlessly.

A GOOD IMPLEMENTATION:

   BIND resolver code does not query a server more than 3 times.  If it
   is unable to get an answer from 4 servers, querying them three times
   each, it returns an error.

   Of course, it treats a zero-answer response the way it should be
   treated: with respect!

FIXES:

      a. Set an upper limit on the number of retransmissions for a given
         query, at the very least.

      b. Fix resolvers to interpret such a response as an authoritative
         statement of non-existence of the record type for the given
         name.
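
   A resolver that follows these two fixes might classify responses
   along the following lines.  This is a minimal sketch under stated
   assumptions: the Response record, the category names and the
   function names are hypothetical, the RCODE values are those defined
   in RFC 1035, and the limit of 3 attempts mirrors the BIND behavior
   described above.

```python
from dataclasses import dataclass, field

NOERROR = 0   # RCODE values as defined in RFC 1035
SERVFAIL = 2


@dataclass
class Response:
    rcode: int
    authoritative: bool
    answers: list = field(default_factory=list)
    authority: list = field(default_factory=list)


def classify(resp):
    """Map a response to 'answer', 'no-data', 'referral' or 'retry'."""
    if resp.rcode == NOERROR and resp.answers:
        return "answer"
    if resp.rcode == NOERROR and resp.authoritative:
        # Authoritative NOERROR with no answer: the name exists but
        # has no record of the requested type. This is final.
        return "no-data"
    if resp.rcode == NOERROR and resp.authority:
        return "referral"
    return "retry"


def should_requery(resp, attempts, max_attempts=3):
    """Requery only on a retryable response, and only up to a limit."""
    return classify(resp) == "retry" and attempts < max_attempts
```

   The important case is the second one: an empty authoritative NOERROR
   ends the lookup rather than triggering another transmission.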

4. Inability to detect server failure:

   Servers in the internet are not very reliable (they go down every
   once in a while) and resolvers are expected to adapt to the changed
   scenario by not querying the server for a while.  Thus, when a
   server does not respond to a query, resolvers should try another
   server.  Also, non-stub resolvers should update their round-trip
   time estimate for the server to a large value so that the server is
   not tried again before other, faster servers.

   Stub resolvers, however, cycle through a fixed set of servers, and
   if, unfortunately, a server is down while others do not respond for
   other reasons (high load, recursive resolution of the query taking
   more time than the resolver's time-out, ...), the resolver queries
   the dead server again!  In fact, some resolvers might not set an
   upper limit on the number of query retransmissions they will send
   and continue to query dead servers indefinitely.

   Name servers running system or chained queries might also suffer
   from the same problem.  They store names of servers they should
   query for a given domain.  They cycle through these names and, in
   case none of them answers, hit each one more than once.  It is,
   once again, important that there be an upper limit on the number of
   retransmissions, to prevent network overload.

   This behavior is clearly in violation of the dictum in RFC 1035:

      "If a resolver gets a server error or other bizarre response
      from a name server, it should remove it from SLIST, and may
      wish to schedule an immediate transmission to the next
      candidate server address."

   Removal from SLIST implies that the server is not queried again for
   some time.

   Correctly implemented full-service resolvers should, as pointed out
   before, update round-trip time values for servers that do not
   respond and query them only after other, good servers.
   Full-service resolvers might, however, not follow any of these
   common sense directives.  They query dead servers, and they query
   them endlessly.

A GOOD IMPLEMENTATION:

   BIND places an upper limit on the number of times it queries a
   server.  Both the stub-resolver and the full-service resolver code
   do this.  Also, since the full-service resolver estimates round-trip
   times and sorts name server addresses by these estimates, it does
   not query a dead server again until and unless all the other
   servers in the list are dead too!  Further, BIND implements
   exponential back-off.

FIXES:

      a. Set an upper limit on the number of retransmissions.

      b. Measure round-trip time from servers (some estimate is better
         than none).  Treat no response as a "very large" round-trip
         time.

      c. Maintain a weighted rtt estimate and decay the "large" value
         slowly, with time, so that the server is eventually tested
         again, but not after an indefinitely long period.

      d. Follow an exponential back-off scheme so that even if you do
         not restrict the number of queries, you do not overload the
         net excessively.
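
   Fixes b and c can be illustrated with a small table of round-trip
   time estimates.  The sketch below is hypothetical and not drawn from
   BIND: the class name, the 60-second penalty, the sample weight of
   0.3 and the decay factor of 0.9 are all assumed values chosen only
   to show the mechanism.

```python
class ServerTable:
    """Per-server smoothed round-trip time (RTT) estimates, in seconds."""

    def __init__(self, penalty=60.0, alpha=0.3, decay=0.9):
        self.penalty = penalty  # "very large" RTT charged on a timeout
        self.alpha = alpha      # weight given to each new RTT sample
        self.decay = decay      # decay applied to servers not chosen
        self.rtt = {}

    def add(self, server, initial_rtt):
        self.rtt[server] = initial_rtt

    def on_response(self, server, sample):
        # Weighted average of the old estimate and the new sample.
        old = self.rtt[server]
        self.rtt[server] = (1 - self.alpha) * old + self.alpha * sample

    def on_timeout(self, server):
        # No response is treated as a very large round-trip time.
        self.rtt[server] = self.penalty

    def choose(self):
        # Pick the server with the lowest estimate, then decay the
        # others so that a penalised server is eventually retried.
        best = min(self.rtt, key=self.rtt.get)
        for server in self.rtt:
            if server != best:
                self.rtt[server] *= self.decay
        return best
```

   After a timeout the dead server drops to the back of the ordering,
   but its estimate shrinks a little each time another server is
   chosen, so it is tested again after a bounded number of queries.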

5. Cache Leaks:

   Every resource record returned by a server is cached for TTL
   seconds, where the TTL value is returned with the RR.  Full-service
   (or stub) resolvers cache the RR and answer any future queries from
   this cached information until the TTL expires.  After that, one
   more query to the wide-area network gets the RR in cache again.

   Full-service resolvers might not implement this caching mechanism
   well.  They might impose a limit on the cache size or might not
   interpret the TTL value correctly.  In either case, queries
   repeated within a TTL period of an RR constitute a cache leak.

A GOOD/BAD IMPLEMENTATION:

   BIND has no restriction on the cache size; the size is governed by
   the limits on the virtual address space of the machine it is running
   on.  BIND caches RRs for the duration of the TTL returned with each
   record.

   It does not, however, follow the RFCs with respect to interpretation
   of a 0 TTL value.  If a record has a TTL value of 0 seconds, BIND
   uses
   the minimum TTL value for that zone, from the SOA record, and caches
   it for that duration.  This, though it saves some traffic on the
   wide-area network, is not correct behavior.

FIXES:

      a. Look over your caching mechanism to ensure TTLs are
         interpreted correctly.

      b. Do not restrict cache sizes (come on, memory is cheap!).
         Expired entries are reclaimed periodically, anyway.  Of
         course, the cache size is bound to have some physical limit.
         But, when possible, this limit should be large (run your name
         server on a machine with a large amount of physical memory).

      c. Possibly, a mechanism is needed to flush the cache, when it is
         known or even suspected that the information has changed.
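
   A cache that honors TTLs, declines to store zero-TTL records, and
   offers a flush operation could look roughly like this.  The class
   and method names are hypothetical; the zero-TTL rule follows RFC
   1035, which says such records are for the transaction in progress
   only and should not be cached.  The injectable clock is an
   assumption made so the sketch can be exercised without waiting.

```python
import time


class RRCache:
    """Cache resource records for exactly their TTL."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._store = {}  # (name, rtype) -> (expiry time, rdata)

    def put(self, name, rtype, rdata, ttl):
        if ttl <= 0:
            return  # zero TTL: use for this transaction, never cache
        self._store[(name, rtype)] = (self._clock() + ttl, rdata)

    def get(self, name, rtype):
        entry = self._store.get((name, rtype))
        if entry is None:
            return None
        expiry, rdata = entry
        if self._clock() >= expiry:
            del self._store[(name, rtype)]  # reclaim the expired entry
            return None
        return rdata

    def flush(self, name=None):
        """Drop everything, or only the records for one name."""
        if name is None:
            self._store.clear()
        else:
            for key in [k for k in self._store if k[0] == name]:
                del self._store[key]
```

   No size limit is imposed here, in line with fix b; a production
   cache would still be bounded by available memory.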

6. Name Error Bugs:

   This bug is very similar to the Zero Answer bug.  A server returns
   an authoritative NXDOMAIN when the queried name is known to be bad,
   by the server authoritative for the domain, in the absence of
   negative caching.  This authoritative NXDOMAIN response is usually
   accompanied by the SOA record for the domain, in the authority
   section.

   Resolvers should recognize that the name they queried for was a bad
   name and should stop querying further.

   Some resolvers might, however, not interpret this correctly and
   continue to query servers, expecting an answer record.

   Some applications, in fact, prompt NXDOMAIN answers!  When given a
   perfectly good name to resolve, they append the local domain to it,
   e.g., an application in the domain "foo.bar.com", when trying to
   resolve the name "usc.edu", first tries "usc.edu.foo.bar.com", then
   "usc.edu.bar.com" and finally the good name "usc.edu".  This causes
   at least two queries that return NXDOMAIN for every good query.
   The problem is aggravated since the negative answers from the
   previous queries are not cached.  When the same name is sought
   again, the process repeats.
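
   The query order in this example can be reproduced with a few lines
   of code.  The function below is a hypothetical illustration of the
   problematic behavior, not a recommended algorithm: its name is
   invented, and the rule that the search stops once the remaining
   parent domain has fewer than two labels is an assumption chosen to
   match the "foo.bar.com" example.

```python
def implicit_search_order(name, local_domain, min_labels=2):
    """List the names tried, in order, by a resolver that appends the
    local domain and its parents before trying the name as given."""
    if name.endswith("."):
        # A trailing period marks the name as fully qualified.
        return [name.rstrip(".")]
    labels = local_domain.split(".")
    candidates = []
    while len(labels) >= min_labels:
        candidates.append(name + "." + ".".join(labels))
        labels = labels[1:]  # move up to the parent domain
    candidates.append(name)  # the good name is tried last
    return candidates
```

   For "usc.edu" looked up from "foo.bar.com" the list has three
   entries, so every successful lookup is preceded by two queries that
   can only return NXDOMAIN.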

   Some DNS resolver implementations suffer from this problem, too.
   They append successive sub-parts of the local domain using an
   implicit searchlist mechanism when certain conditions are
   satisfied, and try the original name only when this first set of
   iterations fails.  This behavior recently caused pandemonium in the
   Internet when the domain "edu.com" was registered and a wildcard
   "CNAME" record placed at the
   top level.  All machines from "com" domains trying to connect to
   hosts in the "edu" domain ended up with connections to the local
   machine in the "edu.com" domain!
a3a11c4f3fc9ba972802b811c4d95a9884d6ff4aMichael SawyerGOOD/BAD IMPLEMENTATIONS:
5d98cf67b32d785aca1a72ea1dc4d559fab39208Mark Andrews Some local versions of BIND already implement negative caching. They
9ee5efde7df57cbe70fb9b32c9d898e8ef7eca1eBob Halley typically cache negative answers with a very small TTL, sufficient to
5d98cf67b32d785aca1a72ea1dc4d559fab39208Mark Andrews answer a burst of queries spaced close together, as is typically
5d98cf67b32d785aca1a72ea1dc4d559fab39208Mark Andrews The next official public release of BIND (4.9.2) will have negative
c9e698df1b2f3731577eaf9598ed3845eac67e1bBrian Wellington caching as an ifdef'd feature.
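
   The short-TTL negative caching described above might look roughly
   like the following. This is a minimal sketch, not BIND's code: the
   class name NegativeCache, the ttl_seconds default, and the
   injectable clock are all assumptions made for illustration.

```python
import time


class NegativeCache:
    """Remembers names that recently produced a name error (NXDOMAIN)."""

    def __init__(self, ttl_seconds=10.0, clock=time.monotonic):
        # ttl_seconds is an assumed, deliberately small value: long enough
        # to absorb a burst of repeats, short enough to forget quickly.
        self.ttl = ttl_seconds
        self.clock = clock
        self._expires = {}  # lowercased name -> time at which entry expires

    def remember(self, name):
        """Record that a query for this name just returned a name error."""
        self._expires[name.lower()] = self.clock() + self.ttl

    def is_known_bad(self, name):
        """True if a negative answer for this name is still cached."""
        key = name.lower()
        expiry = self._expires.get(key)
        if expiry is None:
            return False
        if self.clock() >= expiry:
            del self._expires[key]  # aged out; the next query goes upstream
            return False
        return True
```

   A server would consult is_known_bad() before forwarding a query and
   call remember() whenever an authoritative name error comes back.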

   The BIND resolver appends the local domain to the given name when one
   of two conditions is met:

      i.  The name has no periods and the flag RES_DEFNAME is set.
      ii. There is no trailing period and the flag RES_DNSRCH is set.

   The flags RES_DEFNAME and RES_DNSRCH are default resolver options in
   BIND, but can be changed at compile time.

   Only if the name so generated returns an NXDOMAIN is the original
   name tried as a Fully Qualified Domain Name, and then only if it
   contains at least one period.
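
   The lookup order this produces can be sketched as below. The sketch
   is an illustration of the behavior described above, not the BIND
   source. The function name and its parameters are hypothetical, and
   how far up the local domain the implicit search walks is an
   assumption chosen to match the "edu.com" incident.

```python
def bind_style_candidates(name, local_domain, defname=True, dnsrch=True):
    """Return the names a resolver would try, in order, stopping at the
    first one that does not yield a name error."""
    if name.endswith("."):
        return [name]  # trailing period: taken as fully qualified
    labels = local_domain.strip(".").split(".")
    has_dot = "." in name
    candidates = []
    if dnsrch:
        # Implicit search list: the local domain, then each parent of it.
        # Walking all the way up to the top-level label is an assumption.
        for i in range(len(labels)):
            candidates.append(name + "." + ".".join(labels[i:]) + ".")
    elif defname and not has_dot:
        candidates.append(name + "." + ".".join(labels) + ".")
    # The name as given is tried last, and only if it contains a period.
    # Falling back to it when nothing was appended is an assumption.
    if has_dot or not candidates:
        candidates.append(name + ".")
    return candidates
```

   For a host in "dept.example.com" looking up "host.univ.edu", the
   third candidate is "host.univ.edu.com.", which is exactly the name
   a wildcard under "edu.com" would match.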

FIXES:

      a. Fix the resolver code.

      b. Negative Caching. Negative caching servers will restrict the
         traffic seen on the wide-area network, even if they do not
         curb it altogether.

      c. Applications and resolvers should not append the local domain
         to names they seek to resolve, as far as possible. Names
         interspersed with periods should be treated as Fully Qualified
         Domain Names.

         In other words, use searchlists only when explicitly
         specified. No implicit searchlists should be used. A name that
         contains any dots should first be tried as a FQDN and, if that
         fails, with the local domain name (or searchlist, if
         specified) appended. A name containing no dots can have the
         searchlist appended right away but, once again, no implicit
         searchlists should be used.
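
   The ordering recommended in fix (c) could be expressed as follows.
   This is a sketch under stated assumptions: recommended_candidates
   and its parameters are hypothetical names, and search_list stands
   for a list the user configured explicitly.

```python
def recommended_candidates(name, local_domain, search_list=None):
    """Return the names to try, in order, using no implicit search list."""
    if name.endswith("."):
        return [name]  # already fully qualified
    # Only an explicitly configured search list is honored; otherwise the
    # local domain alone is used, never its parent domains.
    suffixes = search_list if search_list else [local_domain]
    qualified = [name + "." + s.strip(".") + "." for s in suffixes]
    if "." in name:
        # A dotted name is tried as an FQDN first, then with suffixes.
        return [name + "."] + qualified
    # An undotted name goes straight to the suffixes.
    return qualified
```

   With this ordering a lookup of "host.univ.edu" reaches the real
   "edu" domain first, so a wildcard placed elsewhere cannot capture
   it.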

   Associated with the name error bug is another problem, where a server
   might return an authoritative NXDOMAIN although the name is valid. A
   secondary server, on start-up, reads the zone information from the
   primary through a zone transfer. While it is in the process of
   loading the zones, it does not have information about them, although
   it is authoritative for them. Thus, any query for a name in that
   domain is answered with an NXDOMAIN response code. This problem might
   not be disastrous were it not for negative caching servers that cache
   this answer and so propagate incorrect information over the Internet.

BAD IMPLEMENTATION:

   BIND apparently suffers from this problem.

   Also, a new name added to the primary database will take a while to
   propagate to the secondaries. Until that time, they will return
   NXDOMAIN answers for a good name. Negative caching servers store this
   answer, too, and aggravate this problem further. This is probably a
   more general DNS problem but is apparently more harmful in this
   context.

FIXES:

      a. Servers should start answering only after loading all the zone
         data. A failed server is better than a server handing out
         incorrect information.

      b. Cache negative answers for a very short time, sufficient only
         to ward off a burst of requests for the same bad name. This
         could be related to the round-trip time of the server from
         which the negative answer was received. Alternatively, a
         statistical measure of the amount of time for which queries
         for such names are received could be used. The minimum TTL
         value from the SOA record is not advisable, since it tends to
         be pretty large.

      c. A "PUSH" (or, at least, a "NOTIFY") mechanism should be allowed
         and implemented, to allow the primary server to inform
         secondaries that the database has been modified since it last
         transferred zone data. To alleviate the problem of "too many
         zone transfers" that this might cause, Incremental Zone
         Transfers should also be part of DNS. Also, the primary should
         not NOTIFY/PUSH with every update but bunch a good number
         together.
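
   Fix (a) amounts to a guard in front of the answering path. The
   sketch below shows one way to express it. The class SecondaryZone
   and its methods are hypothetical, and returning None is used here
   to mean "send no response at all", which is an assumption about how
   the surrounding server would treat it.

```python
class SecondaryZone:
    """A zone held by a secondary that stays silent until fully loaded."""

    def __init__(self, origin):
        self.origin = origin.lower().rstrip(".")
        self.records = {}
        self.loaded = False

    def finish_transfer(self, records):
        """Install the data from a completed zone transfer."""
        self.records = {k.lower().rstrip("."): v for k, v in records.items()}
        self.loaded = True

    def answer(self, name):
        if not self.loaded:
            # No data yet: say nothing rather than claim the name does
            # not exist. The querier times out and tries another server.
            return None
        key = name.lower().rstrip(".")
        if key in self.records:
            return ("NOERROR", self.records[key])
        return ("NXDOMAIN", None)
```

   Only after finish_transfer() has run can the zone produce an
   NXDOMAIN, so a negative caching server never stores a name error
   that merely reflects an unfinished load.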

7. Format Errors:

   Some resolvers issue query packets that do not necessarily conform to
   standards as laid out in the relevant RFCs. This unnecessarily
   increases net traffic and wastes server time.

FIXES:

      a. Fix resolvers.

      b. Each resolver should verify the format of packets before
         sending them out, using a mechanism outside of the resolver.
         This is, obviously, needed only if step a cannot be followed.
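
   An external check of the kind fix (b) asks for might start from the
   message layout in RFC 1035. The following is a sketch, not a
   complete validator. The function name is hypothetical, and the
   policy choices (exactly one question, no answer records, no
   compression pointer in the question name) are assumptions about
   what an ordinary outgoing query should look like.

```python
import struct


def query_format_errors(packet: bytes):
    """Return a list of problems found in an outgoing query (empty if none)."""
    if len(packet) < 12:
        return ["shorter than the 12-octet header"]
    _id, flags, qdcount, ancount, _ns, _ar = struct.unpack("!6H", packet[:12])
    errors = []
    if flags & 0x8000:
        errors.append("QR bit set: this is a response, not a query")
    if qdcount != 1:
        errors.append("QDCOUNT is not 1")
    if ancount != 0:
        errors.append("ANCOUNT is not 0")
    # Walk the length-prefixed labels of the first question name.
    pos, name_len = 12, 0
    while True:
        if pos >= len(packet):
            errors.append("QNAME runs past the end of the packet")
            return errors
        length = packet[pos]
        if length == 0:  # root label terminates the name
            pos += 1
            name_len += 1
            break
        if length > 63:
            errors.append("label longer than 63 octets (or a compression pointer)")
            return errors
        pos += 1 + length
        name_len += 1 + length
    if name_len > 255:
        errors.append("QNAME longer than 255 octets")
    if len(packet) < pos + 4:
        errors.append("missing QTYPE/QCLASS")
    return errors
```

   A wrapper around the resolver's send routine could drop or log any
   packet for which this returns a non-empty list.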

References

   [1] Mockapetris, P., "Domain Names Concepts and Facilities", STD 13,
       RFC 1034, USC/Information Sciences Institute, November 1987.

   [2] Mockapetris, P., "Domain Names Implementation and Specification",
       STD 13, RFC 1035, USC/Information Sciences Institute, November
       1987.

   [3] Partridge, C., "Mail Routing and the Domain System", STD 14, RFC
       974, CSNET CIC BBN, January 1986.

   [4] Gavron, E., "A Security Problem and Proposed Correction With
       Widely Deployed DNS Software", RFC 1535, ACES Research Inc.,
       October 1993.

   [5] Beertema, P., "Common DNS Data File Configuration Errors", RFC
       1537, CWI, October 1993.

Security Considerations

   Security issues are not discussed in this memo.

Authors' Addresses