database revision 50a266dd24a1f6ff8589790b9923ef79bd1896e4


Databases

A BIND 9 DNS database allows named rdatasets to be stored and retrieved.
DNS databases are used to store two different categories of data:
authoritative zone data and non-authoritative cache data. Unlike
previous versions of BIND, which used a monolithic database, BIND 9 has
one database per zone or cache. Certain database operations, for
example updates, have differing requirements and actions depending
upon whether the database contains zone data or cache data.


Database Updates

A master zone is updated by a Dynamic Update message. A slave zone is
updated by IXFR or AXFR. AXFR provides the entire contents of the new
zone version, and replaces the entire contents of the database. IXFR
and Dynamic Update, although completely different protocols, have the
same basic database requirements. They are differential update
protocols, i.e. they express changes relative to the current zone
contents, e.g. "add this record to the records at name 'foo'". The
updates are also atomic, i.e. they must either succeed completely or
fail completely. Changes must not become visible to clients until the
update has committed. In short, zone updates are transactional.
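
The transactional requirement can be illustrated with a small sketch.
This is Python written for illustration only, not BIND 9 code: the
ZoneDB class, its method names, and the rule that deleting a missing
record aborts the update are all assumptions made for the example.
Copying the whole zone per update is also a simplification; it is used
here only because it makes atomic commit easy to see.

```python
import copy


class ZoneDB:
    """Toy zone database mapping (name, rdtype) to a set of rdata strings."""

    def __init__(self):
        self._current = {}  # the committed version that readers see
        self.serial = 0     # counts committed updates

    def lookup(self, name, rdtype):
        """Return the committed records at (name, rdtype)."""
        return frozenset(self._current.get((name, rdtype), ()))

    def apply_update(self, changes):
        """Apply a list of (op, name, rdtype, rdata) changes atomically.

        op is "add" or "delete". Any error raises ValueError and leaves
        the committed version untouched.
        """
        working = copy.deepcopy(self._current)  # private working copy
        for op, name, rdtype, rdata in changes:
            rrset = working.setdefault((name, rdtype), set())
            if op == "add":
                rrset.add(rdata)
            elif op == "delete":
                if rdata not in rrset:
                    # Hypothetical failure rule: abort the whole update.
                    raise ValueError("no such record: %r" % (rdata,))
                rrset.remove(rdata)
                if not rrset:
                    del working[(name, rdtype)]
            else:
                raise ValueError("unknown operation: %r" % (op,))
        # Commit: one reference swap publishes every change at once.
        self._current = working
        self.serial += 1
```

Readers calling lookup() see either the state before the update or the
state after it, never a partially applied update, because nothing is
published until the final assignment.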

Cache updates are done by the server in the ordinary course of
handling client requests. Unlike zone updates, cache updates do not
refer to the current contents of the cache, so concurrent writing to
the cache is possible. The main requirement is that concurrent update
attempts to the same node and rdataset type must appear to have been
executed in some order. To make DB versioning simpler, the DB
interface actually imposes a more restrictive set of requirements, namely
that access to a node is serialized and that database changes become
visible in version order (more on this below).
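
The "serialized access to a node" requirement can be sketched as
follows. Again this is illustrative Python, not the BIND 9 DB
interface: CacheDB, CacheNode, add_rdataset and find_rdataset are
hypothetical names, and version ordering is not modelled at all.

```python
import threading


class CacheNode:
    """One name in the cache, with its own lock."""

    def __init__(self):
        self.lock = threading.Lock()
        self.rdatasets = {}  # rdtype -> (rdata tuple, ttl)


class CacheDB:
    """Toy cache in which access to each node is serialized."""

    def __init__(self):
        self._nodes = {}
        # Protects only the name -> node map, not the node contents.
        self._nodes_lock = threading.Lock()

    def _find_node(self, name, create=False):
        with self._nodes_lock:
            node = self._nodes.get(name)
            if node is None and create:
                node = self._nodes[name] = CacheNode()
            return node

    def add_rdataset(self, name, rdtype, rdata, ttl):
        """Replace the rdataset of the given type at the given name."""
        node = self._find_node(name, create=True)
        with node.lock:  # writers to the same node run one at a time
            node.rdatasets[rdtype] = (tuple(rdata), ttl)

    def find_rdataset(self, name, rdtype):
        """Return (rdata, ttl), or None if nothing is cached."""
        node = self._find_node(name)
        if node is None:
            return None
        with node.lock:
            return node.rdatasets.get(rdtype)
```

If several threads call add_rdataset() for the same name and type at
once, the final contents are exactly what one of them wrote, as if the
calls had run in some order. Writers to different names contend only
briefly on the map lock.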


Database Concurrency and Locking

A principal goal of the BIND 9 project is multiprocessor scalability.
The amount of concurrency in database accesses is an important factor
in achieving scalability. Consider a heavily used database, e.g. the
cache database serving some mail hubs, or ".com". If access to these
databases is not parallelized, then adding another CPU will not help
the server's performance for the portion of the runtime spent in
database lookup.
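
The effect of a non-parallelized portion can be estimated with
Amdahl's law. The sketch below is a worked illustration only; the
figure of 40% of run time spent in database lookup is an assumed
number chosen for the example, not a measurement of BIND.

```python
def speedup(serial_fraction, cpus):
    """Amdahl's law: overall speedup when only the non-serial part scales."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / cpus)


if __name__ == "__main__":
    # Assumed: 40% of run time is database lookup that cannot run in parallel.
    for cpus in (1, 2, 4, 8):
        print(cpus, round(speedup(0.40, cpus), 2))
```

Under that assumption, 2, 4 and 8 CPUs give speedups of about 1.43,
1.82 and 2.11, and no number of CPUs can exceed 1 / 0.40 = 2.5.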

Support for multiple concurrent readers certainly helps both cache
databases and zone databases. Zones are typically read much more than
they are written, though less so than in prior years because dynamic
DNS support is now widely available. Caches are frequently written as
well as read; a non-scientific survey of caching statistics on a few
busy caching nameservers showed the ratio of cache hits to misses was
about 2 to 1, i.e. roughly one lookup in three missed the cache and
so typically led to a cache update.

As mentioned above, zone updates must be serialized, but cache updates
often provide good opportunities for concurrency.

A simple approach to these concurrency goals would be to have a single
read-write lock on the database. This would allow for multiple
concurrent readers, and would provide the serialization of updates
that zone updates require. However, this approach has significant
limitations. Readers cannot run while an update is running. For a
short-lived transaction like a Dynamic Update, this may be acceptable,
but an IXFR can take a very long time (even hours) to complete.
Preventing read access for such a long time is unacceptable. Another
problem is that it forces updates to be serialized, even for cache
databases.

There are problems on the reader side of the lock too. If
the entire database is protected by one lock, then any data retrieved
from the database must either be used while the lock is held, or it
must be copied, because the data in the database can change when the
lock isn't held. Copying is expensive, and the server would like to
be able to hold a reference to database data for a long time. The
most significant long-running reader problem is outbound AXFR, which
could potentially block updates for a very long time (hours).
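
The blocking behaviour of a single read-write lock can be seen in a
minimal sketch. The RWLock class below is a deliberately simple
illustration in Python, not BIND 9's lock implementation; it has no
fairness or writer preference, and the timeout arguments exist only so
that a blocked caller can be observed without hanging.

```python
import threading


class RWLock:
    """Minimal read-write lock: many readers, or exactly one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0      # number of readers holding the lock
        self._writing = False  # True while a writer holds the lock

    def acquire_read(self, timeout=None):
        """Wait until no writer holds the lock; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(lambda: not self._writing, timeout)
            if ok:
                self._readers += 1
            return ok

    def release_read(self):
        with self._cond:
            self._readers -= 1
            self._cond.notify_all()

    def acquire_write(self, timeout=None):
        """Wait until the lock is completely free; return False on timeout."""
        with self._cond:
            ok = self._cond.wait_for(
                lambda: not self._writing and self._readers == 0, timeout)
            if ok:
                self._writing = True
            return ok

    def release_write(self):
        with self._cond:
            self._writing = False
            self._cond.notify_all()
```

With this lock, two acquire_read() calls succeed together, but
acquire_write(timeout=0.05) returns False until both readers have
released, which is the outbound AXFR problem in miniature. Once a
writer holds the lock, acquire_read(timeout=0.05) returns False until
release_write(), which is the long-running IXFR problem.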

A finer-grained locking scheme, e.g. one lock per node, helps
parallelize cache updates, but doesn't help with the long-lived reader
or long-lived writer problems.
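
One way to see the long-lived reader problem is sketched below, on the
assumption that such a reader (e.g. an outbound AXFR) needs a
consistent view of the whole zone. The Node class and the walk() and
update() helpers are hypothetical names for this illustration. The
reader locks one node at a time, so an update that lands in the middle
of the walk is only partly reflected in what the reader sees.

```python
import threading


class Node:
    """One database node with its own lock."""

    def __init__(self, value):
        self.lock = threading.Lock()
        self.value = value


def walk(nodes):
    """Long-lived reader: visit every node, locking one node at a time."""
    for name in sorted(nodes):
        node = nodes[name]
        with node.lock:
            value = node.value  # copy under the lock, then release it
        yield name, value


def update(nodes, changes):
    """Multi-node writer: each node is locked individually."""
    for name, value in changes.items():
        with nodes[name].lock:
            nodes[name].value = value


def demo():
    nodes = {"a": Node("v1"), "b": Node("v1"), "c": Node("v1")}
    reader = walk(nodes)
    seen = [next(reader)]                  # the reader has copied "a" at v1
    update(nodes, {"a": "v2", "c": "v2"})  # an update lands mid-walk
    seen.extend(reader)                    # the reader finishes the walk
    return seen
```

Here demo() returns [("a", "v1"), ("b", "v1"), ("c", "v2")]: the reader
saw "a" before the update and "c" after it, even though the update
changed both. Avoiding this with locks alone would mean holding every
node lock for the whole walk, which brings back the blocking described
above.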


Database Versioning

XXX TBS XXX