/*
 * CDDL HEADER START
 *
 * The contents of this file are subject to the terms of the
 * Common Development and Distribution License, Version 1.0 only
 * (the "License").  You may not use this file except in compliance
 * with the License.
 *
 * You can obtain a copy of the license at
 * trunk/opends/resource/legal-notices/OpenDS.LICENSE
 * or https://OpenDS.dev.java.net/OpenDS.LICENSE.
 * See the License for the specific language governing permissions
 * and limitations under the License.
 *
 * When distributing Covered Code, include this CDDL HEADER in each
 * file and include the License file at
 * trunk/opends/resource/legal-notices/OpenDS.LICENSE.  If applicable,
 * add the following below this CDDL HEADER, with the fields enclosed
 * by brackets "[]" replaced with your own identifying information:
 *      Portions Copyright [yyyy] [name of copyright owner]
 *
 * CDDL HEADER END
 *
 *
 *      Copyright 2006-2008 Sun Microsystems, Inc.
 */


/**
 * Contains the code for the Directory Server backend that uses the Berkeley DB
 * Java Edition as the repository for storing entry and index information.
 * <BR><BR>
 *
 * <H3>On-disk Representation</H3>
 * <P>
 * First it is important to understand JE (Java Edition) terminology.  A JE
 * database environment has similarities to a database in the relational
 * database world. Each environment can have multiple databases, which are
 * similar to tables in a relational database.  A JE environment is identified
 * by the file system directory in which it is stored.  A JE database is
 * identified by a unique name within its environment.  Multiple databases in
 * the same environment may be updated within the same transaction, but
 * transactions cannot span environments.
 * <P>
 * In this description, database means a JE database.
 * <P>
 * Each instance of this backend creates a single JE environment to store its
 * data.  Unlike previous versions of Directory Server, environments are not
 * shared by backend instances.  The backend does support multiple base DNs,
 * so it is still possible for data under multiple suffixes to share the same
 * database environment, by declaring those suffixes as base DNs of a single
 * JE backend instance.
 * <P>
 * The data for a base DN is kept in a set of databases, so that a database
 * contains data for only one base DN.  Each database name is prefixed by
 * the base DN it belongs to, where the DN is simplified by replacing every
 * character other than a letter or digit with an underscore.
 * <P>
 * For example, if you were to use the DbDump utility to list the databases
 * in the environment corresponding to a backend instance containing the base
 * DN dc=example,dc=com, you might see the following:
 * <pre>
 * dc_example_dc_com_cn.equality
 * dc_example_dc_com_cn.presence
 * dc_example_dc_com_cn.substring
 * dc_example_dc_com_dn2id
 * dc_example_dc_com_givenName.equality
 * dc_example_dc_com_givenName.presence
 * dc_example_dc_com_givenName.substring
 * dc_example_dc_com_id2children
 * dc_example_dc_com_id2entry
 * dc_example_dc_com_id2subtree
 * dc_example_dc_com_mail.equality
 * dc_example_dc_com_mail.presence
 * dc_example_dc_com_mail.substring
 * dc_example_dc_com_member.equality
 * dc_example_dc_com_sn.equality
 * dc_example_dc_com_sn.presence
 * dc_example_dc_com_sn.substring
 * dc_example_dc_com_telephoneNumber.equality
 * dc_example_dc_com_telephoneNumber.presence
 * dc_example_dc_com_telephoneNumber.substring
 * dc_example_dc_com_uid.equality
 * </pre>
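 * <P>
 * The naming scheme in the listing above can be sketched as follows.  This is
 * a minimal illustration, assuming that every character other than a letter
 * or digit becomes an underscore and that the prefix is joined to the rest of
 * the name by an underscore, as the listing suggests.  The class and method
 * names are hypothetical and are not part of the backend.
 * <pre>
 * final class DatabaseNameSketch {
 *   // Replace every character that is not a letter or digit with '_'.
 *   static String prefixFor(String baseDN) {
 *     StringBuilder sb = new StringBuilder(baseDN.length());
 *     for (char ch : baseDN.toCharArray()) {
 *       sb.append(Character.isLetterOrDigit(ch) ? ch : '_');
 *     }
 *     return sb.toString();
 *   }
 *
 *   // Assumed layout: simplified base DN, an underscore, then a suffix
 *   // such as "id2entry" or "cn.equality".
 *   static String databaseName(String baseDN, String suffix) {
 *     return prefixFor(baseDN) + "_" + suffix;
 *   }
 * }
 * </pre>
 * <P>
 * Under these assumptions, databaseName("dc=example,dc=com", "id2entry")
 * yields dc_example_dc_com_id2entry.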
 * <H4>Database Relocation</H4>
 * <P>
 * The data is stored in a format which is independent of system architecture,
 * and is also independent of file system location because it contains no
 * pathnames.  The backend, and its backups, can be copied, moved and restored
 * to a different location, within the same system or a different system.
 * <P>
 * <H4>The Entry ID</H4>
 * <P>
 * Each entry to be stored in the backend is assigned a 64-bit integer
 * identifier called the entry ID.  The first entry to be created is assigned
 * entry ID 1, the second entry ID 2, and so on.  Because an entry can only be
 * created after its parent, the ID of any given entry is always greater than
 * the IDs of its superiors.  The backend takes care to preserve
 * this invariant, in particular during Modify DN operations where an entry
 * can be given a new superior.  Clients have come to expect child entries to
 * be returned after their parent in search results, and the backend can ensure
 * this by returning entries in ID order.
 * <P>
 * On disk, an entry ID is stored in eight bytes in big-endian format (from
 * most significant byte to least significant byte).  This enables binary
 * copy of the backend from one system to another, regardless of the system
 * architecture.
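 * <P>
 * As a minimal sketch of this encoding (the class and method names are
 * hypothetical, not the backend's own), java.nio.ByteBuffer writes a long in
 * big-endian order by default:
 * <pre>
 * import java.nio.ByteBuffer;
 *
 * final class EntryIdCodecSketch {
 *   // Encode an entry ID as eight bytes, most significant byte first.
 *   static byte[] encode(long entryID) {
 *     return ByteBuffer.allocate(8).putLong(entryID).array();
 *   }
 *
 *   // Decode eight big-endian bytes back into an entry ID.
 *   static long decode(byte[] bytes) {
 *     return ByteBuffer.wrap(bytes).getLong();
 *   }
 * }
 * </pre>
 * <P>
 * Because the most significant byte comes first, an unsigned byte-wise
 * comparison of two encoded non-negative IDs agrees with their numeric order.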
 * <P>
 * Currently, IDs of deleted entries are not reused.  The use of a 64-bit
 * integer means it is implausible that the entry ID space will be exhausted.
 * <P>
 * <P>
 * <H4>The entry database (id2entry)</H4>
 * <P>
 * Entries are stored in the id2entry database.  The key to the database is
 * the entry ID, and the value is an ASN.1 encoding of the entry contents.
 * The default JE btree key comparator is used for the entry database,
 * such that cursoring through the database will return entries in order of
 * entry ID.  When the backend starts it is able to determine the last
 * assigned entry ID by reading the last key value in the entry database.
 * <P>
 * The format of the entry on disk is described by the following ASN.1.
 * <P>
 * <pre>
 * DatabaseEntry ::= [APPLICATION 0] IMPLICIT SEQUENCE {
 *  uncompressedSize        INTEGER,      -- A zero value means not compressed.
 *  dataBytes               OCTET STRING  -- Optionally compressed encoding of
 *                                           the data bytes.
 * }
 *
 * ID2EntryValue ::= DatabaseEntry
 *  -- Where dataBytes contains an encoding of DirectoryServerEntry.
 *
 * DirectoryServerEntry ::= [APPLICATION 1] IMPLICIT SEQUENCE {
 *  dn                      LDAPDN,
 *  objectClasses           SET OF LDAPString,
 *  userAttributes          AttributeList,
 *  operationalAttributes   AttributeList
 * }
 * </pre>
 * <P>
 * Entry compression is optional and can be switched on or off at any time.
 * Switching on entry compression only affects future writes, therefore the
 * database can contain a mixture of compressed and uncompressed records.
 * Either record type can be read regardless of the configuration setting.
 * The compression algorithm is the default ZLIB implementation provided by the
 * Java platform.
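 * <P>
 * The read path implied by the uncompressedSize field can be sketched as
 * follows.  This is an illustration only: it leaves out the ASN.1 wrapping,
 * the class and method names are hypothetical, and the buffer handling is an
 * assumption rather than the backend's actual code.
 * <pre>
 * import java.io.ByteArrayOutputStream;
 * import java.util.zip.DataFormatException;
 * import java.util.zip.Deflater;
 * import java.util.zip.Inflater;
 *
 * final class EntryCompressionSketch {
 *   // Compress with the platform's default ZLIB settings.
 *   static byte[] compress(byte[] data) {
 *     Deflater deflater = new Deflater();
 *     deflater.setInput(data);
 *     deflater.finish();
 *     ByteArrayOutputStream out = new ByteArrayOutputStream();
 *     byte[] buffer = new byte[1024];
 *     while (!deflater.finished()) {
 *       out.write(buffer, 0, deflater.deflate(buffer));
 *     }
 *     deflater.end();
 *     return out.toByteArray();
 *   }
 *
 *   // A zero uncompressedSize means dataBytes was stored as-is.
 *   static byte[] read(int uncompressedSize, byte[] dataBytes)
 *       throws DataFormatException {
 *     if (uncompressedSize == 0) {
 *       return dataBytes;
 *     }
 *     Inflater inflater = new Inflater();
 *     inflater.setInput(dataBytes);
 *     byte[] result = new byte[uncompressedSize];
 *     int off = 0;
 *     while (off != uncompressedSize) {
 *       int n = inflater.inflate(result, off, uncompressedSize - off);
 *       if (n == 0) {
 *         inflater.end();
 *         throw new DataFormatException("truncated record");
 *       }
 *       off += n;
 *     }
 *     inflater.end();
 *     return result;
 *   }
 * }
 * </pre>
 * <P>
 * Since the reader decides from the record itself, not from the configuration,
 * both record types remain readable after the setting changes.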
 * <P>
 * The ASN.1 types have application tags to allow for future extensions.
 * The types may be extended with additional fields where this makes sense,
 * or additional types may be defined.
 * <P>
 * <H5>The entry count record</H5>
 * <P>
 * Previous versions of Directory Server provide the current number of entries
 * stored in the backend.  JE does not maintain database record counts,
 * requiring a full key traversal to count the number of records in a database,
 * which is too time-consuming for large numbers of entries.
 * <P>
 * For this reason the backend maintains its own count of the number of
 * entries in the entry database, storing this count in the special record
 * whose key is entry ID zero.
 * <P>
 * <P>
 * <H4>The DN database (dn2id)</H4>
 * <P>
 * Although each entry's DN is stored in the entry database, we need to be
 * able to retrieve entries by DN.  The dn2id database key is the normalized
 * DN and the value is the entry ID corresponding to the DN.  A normalized DN
 * is one which may be compared for equality with another using a standard
 * string comparison function.  A given DN can have numerous string
 * representations, due to insignificant whitespace, or insignificant case of
 * attribute names, etc., but it has only one normalized form.  Use of the
 * normalized form enables efficient key comparison.
 * <P>
 * A custom btree key comparator is applied to the DN database, which orders
 * the keys such that a given entry DN comes after the DNs of its superiors,
 * and ensures that the DNs below a given base DN are contiguous.  This
 * ordering is used to return entries for a non-indexed subtree or
 * single-level search.  The comparator is just like the default lexicographic
 * comparator except that it compares in reverse byte order.
 * <P>
 * For example, a cursor iteration through a range of the DN database might
 * look like this:
 * <pre>
 * dc=example,dc=com
 * ou=people,dc=example,dc=com
 * uid=user.1000,ou=people,dc=example,dc=com
 * uid=user.2000,ou=people,dc=example,dc=com
 * uid=user.3000,ou=people,dc=example,dc=com
 * uid=user.4000,ou=people,dc=example,dc=com
 * uid=user.100,ou=people,dc=example,dc=com
 * uid=user.1100,ou=people,dc=example,dc=com
 * uid=user.2100,ou=people,dc=example,dc=com
 * </pre>
 * <P>
 * At first, it may seem strange that user.1100 comes after user.4000, but it
 * becomes clear when considering the values in reverse byte order, since
 * 0011.resu is indeed greater than 0004.resu.
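 * <P>
 * A comparison in reverse byte order can be sketched as follows.  This is a
 * simplified illustration that treats each key as the bytes of the DN string
 * shown above; the class name is hypothetical and the backend's comparator
 * may differ in detail.
 * <pre>
 * final class ReverseByteOrderSketch {
 *   // Compare two keys byte by byte, starting from the last byte of each.
 *   static int compare(byte[] a, byte[] b) {
 *     int common = Math.min(a.length, b.length);
 *     for (int k = 1; k != common + 1; k++) {
 *       int x = Byte.toUnsignedInt(a[a.length - k]);
 *       int y = Byte.toUnsignedInt(b[b.length - k]);
 *       if (x != y) {
 *         return x - y;
 *       }
 *     }
 *     // One key is a suffix of the other: the shorter key, which is the
 *     // superior DN, sorts first.
 *     return a.length - b.length;
 *   }
 * }
 * </pre>
 * <P>
 * Sorting the example keys with
 * Arrays.sort(keys, ReverseByteOrderSketch::compare) reproduces the order
 * listed above, with each superior ahead of its subordinates.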
 * <P>
 * <H4>Index Databases</H4>
 * <P>
 * Index databases are used to efficiently process search requests.  The system
 * indexes, id2children and id2subtree, are dedicated to processing one-level
 * and subtree search scope respectively.  Then there are configurable
 * attribute indexes to process components of a search filter.  Each index
 * record maps a key to an Entry ID List.
 * <P>
 * <P>
 * <H5>Entry ID List</H5>
 * <P>
 * An entry ID list is a set of entry IDs, arranged in order of ID.  On disk,
 * the list is a concatenation of the 8-byte entry ID values, where the first
 * ID is the lowest.  The number of IDs in the list can be obtained by dividing
 * the total number of bytes by eight.
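 * <P>
 * The on-disk list format can be sketched as follows, assuming the caller
 * supplies the IDs already sorted in ascending order.  The class and method
 * names are hypothetical.
 * <pre>
 * import java.nio.ByteBuffer;
 *
 * final class EntryIdListSketch {
 *   // Concatenate the IDs as 8-byte big-endian values, lowest ID first.
 *   static byte[] encode(long[] sortedIDs) {
 *     ByteBuffer buf = ByteBuffer.allocate(8 * sortedIDs.length);
 *     for (long id : sortedIDs) {
 *       buf.putLong(id);
 *     }
 *     return buf.array();
 *   }
 *
 *   // The number of IDs is the byte length divided by eight.
 *   static long[] decode(byte[] bytes) {
 *     long[] ids = new long[bytes.length / 8];
 *     ByteBuffer.wrap(bytes).asLongBuffer().get(ids);
 *     return ids;
 *   }
 * }
 * </pre>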
 * <P>
 * <P>
 * <H5>Index Entry Limit</H5>
 * <P>
 * In some cases, the number of entries indexed by a given key is so large
 * that the cost of maintaining the list during entry updates outweighs the
 * benefit of the list during search processing.  Each index therefore has
 * a configurable entry limit.  Whenever a list reaches the entry limit, it is
 * replaced with a zero length value to indicate that the list is no longer
 * maintained.
 * <P>
 * <P>
 * <H5>Children Index (id2children)</H5>
 * <P>
 * The children index is a system index which maps the ID of any non-leaf entry
 * to entry IDs of the immediate children of the entry. This index is used to
 * get the set of entries within the scope of a one-level search.
 * <P>
 * <P>
 * <H5>Subtree Index (id2subtree)</H5>
 * <P>
 * The subtree index is a system index which maps the ID of any non-leaf entry
 * to entry IDs of all descendants of the entry. This index is used to get the
 * set of entries within the scope of a subtree search.
 * <P>
 * <P>
 * <H5>Attribute Equality Index</H5>
 * <P>
 * An attribute equality index maps the value of an attribute to entry IDs of
 * all entries containing that attribute value. The database key is the
 * attribute value after it has been normalized by the equality matching rule
 * for that attribute.  This index is used to get the set of entries matching
 * an equality filter.
 * <P>
 * <P>
 * <H5>Attribute Presence Index</H5>
 * <P>
 * An attribute presence index contains a single record which has entry IDs
 * of all entries containing a value of the attribute. This index is used to get
 * the set of entries matching an attribute presence filter.
 * <P>
 * <P>
 * <H5>Attribute Substring Index</H5>
 * <P>
 * An attribute substring index maps a substring of an attribute value to entry
 * IDs of all entries containing that substring in one or more values of the
 * attribute.  This index is used to get a set of entries that are
 * candidates for matching a substring filter.
 * <P>
 * The length of substrings in the index is configurable.  For example, let's
 * say the configured substring length is three, and there is an entry
 * containing the attribute value ABCDE.  The ID for this entry would be
 * indexed by the keys ABC BCD CDE DE E.  To find entries containing a short
 * substring such as DE, iterate through all keys with prefix DE.  To find
 * entries containing a longer substring such as BCDE, read keys BCD and CDE.
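 * <P>
 * The key generation in this example can be sketched as follows.  The sketch
 * works on Java strings for readability and keeps duplicate keys; the class
 * and method names are hypothetical and the backend's own key handling may
 * differ.
 * <pre>
 * final class SubstringKeysSketch {
 *   // One key per starting position; keys near the end of the value are
 *   // shorter than the configured substring length.
 *   static String[] keysFor(String value, int substringLength) {
 *     String[] keys = new String[value.length()];
 *     for (int i = 0; i != value.length(); i++) {
 *       int end = Math.min(i + substringLength, value.length());
 *       keys[i] = value.substring(i, end);
 *     }
 *     return keys;
 *   }
 * }
 * </pre>
 * <P>
 * Here keysFor("ABCDE", 3) returns the keys ABC, BCD, CDE, DE and E.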
 * <P>
 * <P>
 * <H5>Attribute Ordering Index</H5>
 * <P>
 * An attribute ordering index is similar to an equality index in that it maps
 * the value of an attribute to entry IDs of all entries containing that
 * attribute value.  However, the values are normalized by the ordering matching
 * rule for the attribute rather than the equality matching rule, and the
 * btree key comparator is set to the ordering matching rule comparator.  This
 * index is used to get the set of entries matching inequality filters
 * (less-than-or-equal, greater-than-or-equal).
 *
 *
 */
@org.opends.server.types.PublicAPI(
     stability=org.opends.server.types.StabilityLevel.PRIVATE)
package org.opends.server.backends.jeb;