<?xml version="1.0" encoding="UTF-8"?>
<!--
! CCPL HEADER START
!
! This work is licensed under the Creative Commons
! Attribution-NonCommercial-NoDerivs 3.0 Unported License.
! To view a copy of this license, visit
! http://creativecommons.org/licenses/by-nc-nd/3.0/
! or send a letter to Creative Commons, 444 Castro Street,
! Suite 900, Mountain View, California, 94041, USA.
!
! You can also obtain a copy of the license at
! trunk/opendj3/legal-notices/CC-BY-NC-ND.txt.
! See the License for the specific language governing permissions
! and limitations under the License.
!
! If applicable, add the following below this CCPL HEADER, with the fields
! enclosed by brackets "[]" replaced with your own identifying information:
! Portions Copyright [yyyy] [name of copyright owner]
!
! CCPL HEADER END
!
! Copyright 2011-2013 ForgeRock AS
!
-->
<chapter xml:id='chap-troubleshooting'
xmlns='http://docbook.org/ns/docbook'
version='5.0' xml:lang='en'
xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
xsi:schemaLocation='http://docbook.org/ns/docbook http://docbook.org/xml/5.0/xsd/docbook.xsd'
xmlns:xlink='http://www.w3.org/1999/xlink'
>
<title>Troubleshooting Server Problems</title>
<indexterm><primary>Troubleshooting</primary></indexterm>
<para>This chapter describes how to troubleshoot common server problems,
and how to collect the information you need when seeking support.</para>
<section xml:id="troubleshoot-identify-problem">
<title>Identifying the Problem</title>
<para>To solve your problem methodically and save time, define the
problem clearly up front. In a replicated environment with multiple directory
servers and many client applications, it can be particularly important to
pin down not only the problem (the difference between observed behavior and
expected behavior), but also the circumstances and steps that lead to the
problem occurring.</para>
<itemizedlist>
<para>Answer the following questions.</para>
<listitem>
<para>How do you reproduce the problem?</para>
</listitem>
<listitem>
<para>What exactly is the problem? In other words, what is the behavior
you expected? What is the behavior you observed?</para>
</listitem>
<listitem>
<para>When did the problem start occurring? Under similar circumstances,
when does the problem not occur?</para>
</listitem>
<listitem>
<para>Is the problem permanent? Intermittent? Is it getting worse?
Getting better? Staying the same?</para>
</listitem>
</itemizedlist>
<para>Pinpointing the problem can sometimes indicate where you should
start looking for solutions.</para>
</section>
<section xml:id="troubleshoot-installation">
<title>Troubleshooting Installation &amp; Upgrade</title>
<para>Installation and upgrade procedures result in a log file tracing
the operation. The log location differs by operating system, but look for
lines in the command output of the following form.</para>
<literallayout class="monospaced">See /var/....log for a detailed log of this operation.</literallayout>
</section>
<section xml:id="troubleshoot-reset-admin-passwords">
<title>Resetting Administrator Passwords</title>
<para>This section describes what to do if you forgot the password for
Directory Manager or for the global (replication) administrator.</para>
<procedure xml:id="reset-directory-manager-password">
<title>To Reset the Directory Manager's Password</title>
<indexterm>
<primary>Resetting passwords</primary>
<secondary>cn=Directory Manager</secondary>
</indexterm>
<para>OpenDJ directory server stores the entry for Directory Manager in
the LDIF representation of its configuration. You must be able to edit
directory server files in order to reset Directory Manager's password.</para>
<step>
<para>Generate the encoded version of the new password using the OpenDJ
<command>encode-password</command> command.</para>
<screen>$ cd /path/to/opendj/bin/
$ ./encode-password --storageScheme SSHA512 --clearPassword password
Encoded Password: "{SSHA512}yWqHnYV4a5llPvE7WHLe5jzK27oZQWLIlVcs9gySu4TyZJMg
NQNRtnR/Xx2xces1wu1dVLI9jVVtl1W4BVsmOKjyjr0rWrHt"</screen>
</step>
<step>
<para>Stop OpenDJ directory server while you edit the configuration.</para>
<screen>$ ./stop-ds</screen>
</step>
<step>
<para>Find Directory Manager's entry, which has DN <literal>cn=Directory
Manager,cn=Root DNs,cn=config</literal>, in
<filename>/path/to/opendj/config/config.ldif</filename>, and carefully
replace the <literal>userpassword</literal> attribute value with the
encoded version of the new password, taking care not to leave any
whitespace at the end of the line.</para>
<programlisting language="ldif"
>dn: cn=Directory Manager,cn=Root DNs,cn=config
objectClass: person
objectClass: inetOrgPerson
objectClass: organizationalPerson
objectClass: ds-cfg-root-dn-user
objectClass: top
userpassword: {SSHA512}yWqHnYV4a5llPvE7WHLe5jzK27oZQWLIlVcs9gySu4TyZJMg
NQNRtnR/Xx2xces1wu1dVLI9jVVtl1W4BVsmOKjyjr0rWrHt
givenName: Directory
cn: Directory Manager
ds-cfg-alternate-bind-dn: cn=Directory Manager
sn: Manager
ds-pwp-password-policy-dn: cn=Root Password Policy,cn=Password Policies
,cn=config
ds-rlim-time-limit: 0
ds-rlim-lookthrough-limit: 0
ds-rlim-idle-time-limit: 0
ds-rlim-size-limit: 0</programlisting>
</step>
<step>
<para>Start OpenDJ directory server again.</para>
<screen>$ ./start-ds</screen>
</step>
<step>
<para>Verify that you can administer the server as Directory Manager using
the new password.</para>
<screen>$ ./dsconfig -p 4444 -h `hostname` -D "cn=Directory Manager" -w password
&gt;&gt;&gt;&gt; OpenDJ configuration console main menu
What do you want to configure?
...
Enter choice: q</screen>
</step>
</procedure>
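<para>The procedure above warns against trailing whitespace in
<filename>config.ldif</filename>, which is easy to miss by eye. You might
check for it before restarting the server. The following is a minimal sketch
rather than an OpenDJ tool: the <literal>trailing_ws</literal> function name
is hypothetical, and the check reports every line that ends in whitespace,
not only the <literal>userpassword</literal> line.</para>
<programlisting language="shell"
># Print the number and content of each line that ends in whitespace.
# Prints nothing, and returns non-zero, when no such line exists.
trailing_ws() {
    grep -n '[[:space:]]$' "$1"
}

# Usage (path as in the procedure above):
# trailing_ws /path/to/opendj/config/config.ldif</programlisting>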
<procedure xml:id="reset-repl-admin-password">
<title>To Reset the Global Administrator's Password</title>
<indexterm>
<primary>Resetting passwords</primary>
<secondary>Global (replication) administrator</secondary>
</indexterm>
<para>When you enable replication, part of the process involves creating a
global administrator and setting that user's password. This user is present
on all replicas. If you chose default values, this user has DN
<literal>cn=admin,cn=Administrators,cn=admin data</literal>. You reset the
password as you would for any other user, though you do so as Directory
Manager.</para>
<step>
<para>Use the <command>ldappasswordmodify</command> command to reset the
global administrator's password.</para>
<screen>$ cd /path/to/opendj/bin/
$ ./ldappasswordmodify \
 --useStartTLS \
 --port 1389 \
 --hostname opendj.example.com \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --authzID "cn=admin,cn=Administrators,cn=admin data" \
 --newPassword password
The LDAP password modify operation was successful</screen>
</step>
<step>
<para>Let replication copy the password change to other replicas.</para>
</step>
</procedure>
</section>
<section xml:id="troubleshoot-enable-debug-logging">
<title>Enabling Debug Logging</title>
<indexterm><primary>Debug log</primary></indexterm>
<indexterm>
<primary>Logs</primary>
<secondary>Debug</secondary>
</indexterm>
<para>OpenDJ can write debug information and stack traces to the server
debug log. What is logged depends both on debug targets that you create,
and also on the debug level that you choose.</para>
<procedure xml:id="configure-debug-logging">
<title>To Configure Debug Logging</title>
<step>
<para>Enable the debug log, <filename>opendj/logs/debug</filename>, which
is not enabled by default.</para>
<screen>$ dsconfig \
 set-log-publisher-prop \
 --hostname opendj.example.com \
 --port 4444 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --publisher-name "File-Based Debug Logger" \
 --set enabled:true \
 --set default-debug-level:all \
 --no-prompt \
--trustAll</screen>
<para>You can set <literal>default-debug-level</literal> to a less verbose
level if necessary.</para>
</step>
<step>
<para>Create a debug target or targets.</para>
<para>No debug targets are enabled by default.</para>
<screen>$ dsconfig \
 list-debug-targets \
 --hostname opendj.example.com \
 --port 4444 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --publisher-name "File-Based Debug Logger" \
 --no-prompt \
--trustAll
Debug Target : debug-level : debug-category
-------------:-------------:---------------
$ </screen>
<para>A debug target specifies a fully-qualified OpenDJ Java package,
class, or method for which to log debug messages at the level you
specify.</para>
<screen>$ dsconfig \
 create-debug-target \
 --hostname opendj.example.com \
 --port 4444 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --publisher-name "File-Based Debug Logger" \
 --type generic \
 --target-name org.opends.server.api \
 --set debug-level:all \
 --no-prompt \
--trustAll</screen>
</step>
<step>
<para>Restart OpenDJ to see debug messages in the log.</para>
<screen>$ /path/to/opendj/bin/stop-ds --restart
...
$ tail -f /path/to/opendj/logs/debug
...</screen>
<para>If you have set <literal>debug-level:all</literal>, OpenDJ generates
a great deal of output in the debug log file. Use debug logging very
sparingly on production systems.</para>
</step>
</procedure>
</section>
<section xml:id="troubleshoot-use-lockdown-mode">
<title>Preventing Access While You Fix Issues</title>
<indexterm><primary>Lockdown mode</primary></indexterm>
<para>Misconfiguration can potentially put OpenDJ in a state where you must
intervene, and where you need to prevent users and applications
from accessing the directory until you are done fixing the problem.</para>
<para>OpenDJ provides a <firstterm>lockdown mode</firstterm> that allows
connections only on the loopback address, and allows only operations
requested by root users, such as <literal>cn=Directory
Manager</literal>. You can use lockdown mode to prevent all but
administrative access to OpenDJ in order to repair the server.</para>
<para>To put OpenDJ into lockdown mode, the server must be running. You
cause the server to enter lockdown mode by using a task. Notice that
the modify operation is performed over the loopback address (accessing
OpenDJ on the local host).</para>
<screen>$ ldapmodify \
 --port 1389 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
--defaultAdd
dn: ds-task-id=Enter Lockdown Mode,cn=Scheduled Tasks,cn=tasks
objectClass: top
objectClass: ds-task
ds-task-id: Enter Lockdown Mode
ds-task-class-name: org.opends.server.tasks.EnterLockdownModeTask
Processing ADD request for
ds-task-id=Enter Lockdown Mode,cn=Scheduled Tasks,cn=tasks
ADD operation successful for DN
ds-task-id=Enter Lockdown Mode,cn=Scheduled Tasks,cn=tasks</screen>
<para>OpenDJ logs a notice message in <filename>logs/errors</filename>
when lockdown mode takes effect.</para>
<literallayout class="monospaced">
[30/Jan/2012:17:04:32 +0100] category=BACKEND severity=NOTICE msgID=9896350
msg=Lockdown task Enter Lockdown Mode finished execution</literallayout>
<para>Client applications that request operations get a message concerning
lockdown mode.</para>
<screen>$ ldapsearch --port 1389 --baseDN "" --searchScope base "(objectclass=*)" +
SEARCH operation failed
Result Code: 53 (Unwilling to Perform)
Additional Information: Rejecting the requested operation because the server
is in lockdown mode and will only accept requests from root users over
loopback connections</screen>
<para>You also leave lockdown mode by using a task.</para>
<screen>$ ldapmodify \
 --port 1389 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
--defaultAdd
dn: ds-task-id=Leave Lockdown Mode,cn=Scheduled Tasks,cn=tasks
objectClass: top
objectClass: ds-task
ds-task-id: Leave Lockdown Mode
ds-task-class-name: org.opends.server.tasks.LeaveLockdownModeTask
Processing ADD request for
ds-task-id=Leave Lockdown Mode,cn=Scheduled Tasks,cn=tasks
ADD operation successful for DN
ds-task-id=Leave Lockdown Mode,cn=Scheduled Tasks,cn=tasks</screen>
<para>OpenDJ also logs a notice message when leaving lockdown.</para>
<literallayout class="monospaced">
[30/Jan/2012:17:13:05 +0100] category=BACKEND severity=NOTICE msgID=9896350
msg=Leave Lockdown task Leave Lockdown Mode finished execution</literallayout>
</section>
<section xml:id="troubleshoot-import">
<title>Troubleshooting LDIF Import</title>
<para>By default OpenDJ requires that LDIF data you import respect standards.
In particular, OpenDJ is set to check that entries to import match the
schema defined for the server. You can temporarily bypass this check by using
the <option>--skipSchemaValidation</option> option with the
<command>import-ldif</command> command.</para>
<para>OpenDJ also ensures by default that entries have only one structural
object class. You can relax this behavior by using the advanced global
configuration property,
<literal>single-structural-objectclass-behavior</literal>. This can be useful
when importing data exported from Sun Directory Server. For example, to
warn when entries have more than one structural object class instead of
rejecting such entries, set
<literal>single-structural-objectclass-behavior:warn</literal> as
follows.</para>
<screen>$ dsconfig \
 set-global-configuration-prop \
 --port 4444 \
 --hostname `hostname` \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --set single-structural-objectclass-behavior:warn \
 --trustAll \
--no-prompt</screen>
<para>By default, OpenDJ also checks syntax for a number of attribute types.
You can relax this behavior as well by using the <command>dsconfig
set-attribute-syntax-prop</command> command. See the list of attribute
syntaxes and use the <option>--help</option> option for further
information.</para>
<para>When running <command>import-ldif</command>, you can use the <option>-R
<replaceable>rejectFile</replaceable></option> option to capture entries that
could not be imported, and the <option>--countRejects</option> option to
return the number of rejected entries as the <command>import-ldif</command>
exit code.</para>
<para>Once you work through the issues with your LDIF data, reinstate the
default behavior to ensure automated checking.</para>
</section>
<section xml:id="troubleshoot-secure-connections">
<title>Troubleshooting TLS/SSL Connections</title>
<para>In order to trust the server certificate, client applications usually
compare the signature on certificates with those of the Certificate
Authorities (CAs) whose certificates are distributed with the client
software. For example, the Java environment is distributed with a key store
holding many CA certificates.</para>
<screen>$ keytool -list -keystore $JAVA_HOME/lib/security/cacerts -storepass changeit \
| wc -l
334</screen>
<para>The self-signed server certificates that can be configured during
OpenDJ setup are not recognized as being signed by any CAs. Your software
therefore is configured not to trust the self-signed certificates by
default. You must either configure the client applications to accept the
self-signed certificates, or else use certificates signed by recognized
CAs.</para>
<para>You can further debug the network traffic by collecting debug traces.
To see the traffic going over TLS/SSL in debug mode, configure OpenDJ to dump
debug traces from <literal>javax.net.debug</literal> into the
<filename>logs/server.out</filename> file.</para>
<screen>OPENDJ_JAVA_ARGS="-Djavax.net.debug=all" start-ds</screen>
<section xml:id="troubleshoot-certificate-authentication">
<title>Troubleshooting Certificates &amp; SSL Authentication</title>
<para>Replication uses SSL to protect directory data on the network.
In some configurations, replicas can fail to connect to each other due
to SSL handshake errors. This leads to error log messages such as the
following.</para>
<screen>[21/Nov/2011:13:03:20 -0600] category=SYNC severity=NOTICE
msgID=15138921 msg=SSL connection attempt from myserver (123.456.789.012)
failed: Remote host closed connection during handshake</screen>
<itemizedlist>
<para>Notice these problem characteristics in the message above.</para>
<listitem>
<para>The host name, <literal>myserver</literal>, is not fully
qualified.</para>
<para>You should not see host names that are not fully qualified in the
error logs. Such host names are a sign that an OpenDJ server has not
been configured properly.</para>
<para>Always install and configure OpenDJ using fully-qualified host names.
The OpenDJ administration connector, which is used by the
<command>dsconfig</command> command, and also replication depend upon SSL
and, more specifically, self-signed certificates for establishing SSL
connections. If the host name used for connection establishment does not
correspond to the host name stored in the SSL certificate then the SSL
handshake can fail. For the purposes of establishing the SSL connection,
a host name like <literal>myserver</literal> does not match
<literal>myserver.example.com</literal>, and vice versa.</para>
</listitem>
<listitem>
<para>The connection succeeded, but the SSL handshake failed, suggesting
a problem with authentication or with the cipher or protocol negotiation.
As most deployments use the same Java Virtual Machine, and the same JVM
configuration for each replica, the problem is likely not related to SSL
cipher or protocol negotiation, but instead lies with authentication.</para>
</listitem>
</itemizedlist>
<orderedlist>
<para>Follow these steps on each OpenDJ server to check whether the problem
lies with the host name configuration.</para>
<listitem>
<para>Make sure each OpenDJ server uses only fully qualified host names in
the replication configuration. You can obtain a quick summary by running
the following command against each server's configuration.</para>
<screen>$ grep ds-cfg-replication-server: config/config.ldif | sort | uniq</screen>
</listitem>
<listitem>
<para>Make sure that the host names in OpenDJ certificates also contain
fully qualified host names, and correspond to the host names found in the
previous step.</para>
<screen># Examine the certificates used for the administration connector.
$ keytool -list -v -keystore config/admin-truststore \
 -storepass `cat config/admin-keystore.pin` | grep "^Owner:"
# Examine the certificates used for replication.
$ keytool -list -v -keystore config/ads-truststore \
 -storepass `cat config/ads-truststore.pin` | grep "^Owner:"
</screen>
</listitem>
</orderedlist>
<para>Sample output for a server on host <literal>opendj.example.com</literal>
follows.</para>
<screen>$ grep ds-cfg-replication-server: config/config.ldif | sort | uniq
ds-cfg-replication-server: opendj.example.com:8989
ds-cfg-replication-server: opendj.example.com:9989
$ keytool -list -v -keystore config/admin-truststore \
 -storepass `cat config/admin-keystore.pin` | grep "^Owner:"
Owner: CN=opendj.example.com, O=Administration Connector Self-Signed Certificate
$ keytool -list -v -keystore config/ads-truststore \
 -storepass `cat config/ads-truststore.pin` | grep "^Owner:"
Owner: CN=opendj.example.com, O=OpenDJ Certificate
Owner: CN=opendj.example.com, O=OpenDJ Certificate
Owner: CN=opendj.example.com, O=OpenDJ Certificate</screen>
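<para>The check of host names in the replication configuration can also be
scripted. The following is a minimal sketch, assuming that a host name
without a dot is not fully qualified, and that hosts are not given as IPv6
literals; the <literal>short_hosts</literal> function name is
hypothetical.</para>
<programlisting language="shell"
># Print each replication server host name that contains no dot,
# and so is probably not fully qualified.
short_hosts() {
    grep "^ds-cfg-replication-server:" "$1" |
        awk '{ split($2, hp, ":"); if (hp[1] !~ /\./) print hp[1] }' |
        sort -u
}

# Usage, from the server installation directory:
# short_hosts config/config.ldif</programlisting>
<para>No output means that every configured replication server host name
contains at least one dot.</para>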
<itemizedlist>
<para>Unfortunately, there is no easy solution to badly configured host
names. It is often easier and quicker simply to reinstall your OpenDJ
servers, remembering to use fully qualified host names everywhere.</para>
<listitem>
<para>When using the <command>setup</command> tool to install and
configure a server, ensure that the <option>-h</option> option is
included, and that it specifies the fully qualified host name. Make sure
you include this option even if you are not enabling SSL/StartTLS LDAP
connections (see <link
xlink:href="https://bugster.forgerock.org/jira/browse/OPENDJ-363"
>OPENDJ-363</link>).</para>
<para>If you are using the GUI installer, then make sure you specify the
fully qualified host name on the first page of the wizard.</para>
</listitem>
<listitem>
<para>When using the <command>dsreplication</command> tool to enable
replication, make sure that any <option>--host</option> options include the
fully qualified host name.</para>
</listitem>
</itemizedlist>
<orderedlist>
<para>If you cannot reinstall the server, follow these steps.</para>
<listitem>
<para>Disable replication in each replica.</para>
<screen>$ dsreplication \
 disable \
 --disableAll \
 --port <replaceable>adminPort</replaceable> \
 --hostname <replaceable>hostName</replaceable> \
 --bindDN "cn=Directory Manager" \
 --adminPassword <replaceable>password</replaceable> \
 --trustAll \
--no-prompt</screen>
</listitem>
<listitem>
<para>Stop and restart each server in order to clear the in-memory ADS
trust store backend.</para>
</listitem>
<listitem>
<para>Enable replication, making certain that fully qualified host names
are used throughout.</para>
<screen>$ dsreplication \
 enable \
 --adminUID admin \
 --adminPassword <replaceable>password</replaceable> \
 --baseDN dc=example,dc=com \
 --host1 <replaceable>hostName1</replaceable> \
 --port1 <replaceable>adminPort1</replaceable> \
 --bindDN1 "cn=Directory Manager" \
 --bindPassword1 <replaceable>password</replaceable> \
 --replicationPort1 <replaceable>replPort1</replaceable> \
 --host2 <replaceable>hostName2</replaceable> \
 --port2 <replaceable>adminPort2</replaceable> \
 --bindDN2 "cn=Directory Manager" \
 --bindPassword2 <replaceable>password</replaceable> \
 --replicationPort2 <replaceable>replPort2</replaceable> \
 --trustAll \
--no-prompt</screen>
</listitem>
<listitem>
<para>Repeat the previous step for each remaining replica. In other words,
host1 with host2, host1 with host3, host1 with host4, ..., host1 with
hostN.</para>
</listitem>
<listitem>
<para>Initialize all remaining replicas with the data from host1.</para>
<screen>$ dsreplication \
 initialize-all \
 --adminUID admin \
 --adminPassword password \
 --baseDN dc=example,dc=com \
 --hostname <replaceable>hostName1</replaceable> \
 --port 4444 \
 --trustAll \
--no-prompt</screen>
</listitem>
<listitem>
<para>Check that the host names are correct in the configuration and in
the key stores by following the steps you used to check for host name
problems. The only broken host name remaining should be in the key and
trust stores for the administration connector.</para>
<screen>$ keytool -list -v -keystore config/admin-truststore \
-storepass `cat config/admin-keystore.pin` |grep "^Owner:"</screen>
</listitem>
<listitem>
<para>Stop each server, and then fix the remaining admin connector
certificate as described in the procedure <link
xlink:href="admin-guide#replace-key-pair"
xlink:role="http://docbook.org/xlink/role/olink"><citetitle>To Replace a
Server Key Pair</citetitle></link>.</para>
</listitem>
</orderedlist>
</section>
<section xml:id="troubleshoot-compromised-key">
<title>Handling Compromised Keys</title>
<indexterm><primary>Certificates</primary></indexterm>
<indexterm><primary>SSL</primary></indexterm>
<para>As explained in <link xlink:href="admin-guide#chap-change-certs"
xlink:role="http://docbook.org/xlink/role/olink" xlink:show="new"><citetitle
>Changing Server Certificates</citetitle></link>, OpenDJ directory server
has different keys and key stores for different purposes. The public keys
used for replication are also used to encrypt shared secret symmetric keys,
for example, to encrypt and to sign backups. This section looks at what to
do if either a key pair or secret key is compromised.</para>
<itemizedlist>
<para>How you deal with the problem depends on which key was
compromised.</para>
<listitem>
<para>For a key pair used for a client connection handler and with a
certificate signed by a certificate authority (CA), contact the CA for
help. The CA might choose to publish a certificate revocation list (CRL)
that identifies the certificate of the compromised key pair.</para>
<para>Also make sure you replace the key pair. See <link
xlink:href="admin-guide#replace-key-pair" xlink:show="new"
xlink:role="http://docbook.org/xlink/role/olink"><citetitle>To Replace a
Server Key Pair</citetitle></link> for specific steps.</para>
</listitem>
<listitem>
<para>For a key pair used for a client connection handler and that has
a self-signed certificate, follow the steps in <link
xlink:href="admin-guide#replace-key-pair" xlink:show="new"
xlink:role="http://docbook.org/xlink/role/olink"><citetitle>To Replace a
Server Key Pair</citetitle></link>, and make sure the clients remove the
compromised certificate from their trust stores, updating those trust
stores with the new certificate.</para>
</listitem>
<listitem>
<para>For a key pair that is used for replication, mark the key as
compromised as described below, and replace the key pair. See <link
xlink:href="admin-guide#replace-ads-cert" xlink:show="new"
xlink:role="http://docbook.org/xlink/role/olink"><citetitle>To Replace a
Server Key Pair</citetitle></link> for specific steps.</para>
<orderedlist>
<para>To mark the key pair as compromised, follow these steps.</para>
<listitem>
<para>Identify the key entry by searching administrative data on the
server whose key was compromised.</para>
<para>The server in this example is installed on
<literal>opendj.example.com</literal> with administration port
<literal>4444</literal>.</para>
<screen>$ ldapsearch \
 --port 1389 \
 --hostname opendj.example.com \
 --baseDN "cn=admin data" \
 "(cn=opendj.example.com:4444)" ds-cfg-key-id
dn: cn=opendj.example.com:4444,cn=Servers,cn=admin data
ds-cfg-key-id: 4F2F97979A7C05162CF64C9F73AF66ED</screen>
<para>The key ID, <literal>4F2F97979A7C05162CF64C9F73AF66ED</literal>, is
the RDN of the key entry.</para>
</listitem>
<listitem>
<para>Mark the key as compromised by adding the attribute,
<literal>ds-cfg-key-compromised-time</literal>, to the key entry.</para>
<para>The attribute has generalized time syntax, and so takes as its
value the time at which the key was compromised expressed in generalized
time. In the following example, the key pair was compromised at 8:34 AM
UTC on March 21, 2013.</para>
<screen width="81">$ ldapmodify \
 --port 1389 \
 --hostname opendj.example.com \
 --bindDN "cn=Directory Manager" \
--bindPassword password
dn: ds-cfg-key-id=4F2F97979A7C05162CF64C9F73AF66ED,cn=instance keys,cn=admin data
changetype: modify
add: ds-cfg-key-compromised-time
ds-cfg-key-compromised-time: 201303210834Z
Processing MODIFY request for ds-cfg-key-id=4F2F97979A7C05162CF64C9F73AF66ED,
cn=instance keys,cn=admin data
MODIFY operation successful for DN ds-cfg-key-id=4F2F97979A7C05162CF64C9F73AF66ED
,cn=instance keys,cn=admin data</screen>
</listitem>
<listitem>
<para>If the server uses encrypted or signed data, then the shared secret
keys used for encryption or signing and associated with the compromised
key pair should also be considered compromised. Therefore, mark all
shared secret keys encrypted with the instance key as compromised.</para>
<para>To identify the shared secret keys, find the list of secret keys
in the administrative data whose <literal>ds-cfg-symmetric-key</literal>
starts with the key ID of the compromised key.</para>
<screen>$ ldapsearch \
 --port 1389 \
 --bindDN "cn=Directory Manager" \
 --bindPassword password \
 --baseDN "cn=secret keys,cn=admin data" \
"(ds-cfg-symmetric-key=4F2F97979A7C05162CF64C9F73AF66ED*)" dn
dn: ds-cfg-key-id=fba16e59-2ce1-4619-96e7-8caf33f916c8,cn=secret keys,cn=admin d
ata
dn: ds-cfg-key-id=57bd8b8b-9cc6-4a29-b42f-fb7a9e48d713,cn=secret keys,cn=admin d
ata
dn: ds-cfg-key-id=f05e2e6a-5c4b-44d0-b2e8-67a36d304f3a,cn=secret keys,cn=admin d
ata</screen>
<para>For each such key, mark the entry with
<literal>ds-cfg-key-compromised-time</literal> as shown above for the
instance key.</para>
</listitem>
</orderedlist>
<para>Changes to administration data are replicated to other OpenDJ
servers in the replication topology.</para>
</listitem>
<listitem>
<para>For a shared secret key used for data encryption that has been
compromised, mark the key entry with
<literal>ds-cfg-key-compromised-time</literal> as shown in the example
above that demonstrates marking the instance key as compromised.</para>
<para>Again, changes to administration data are replicated to other OpenDJ
servers in the replication topology.</para>
</listitem>
</itemizedlist>
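<para>You can generate the generalized time value for
<literal>ds-cfg-key-compromised-time</literal> rather than write it by hand.
The following is a minimal sketch, assuming GNU <command>date</command>,
whose <option>-d</option> option is a GNU extension; the
<literal>generalized_time</literal> function name is hypothetical.</para>
<programlisting language="shell"
># Print the given time in generalized time syntax, in UTC with
# minute precision (YYYYMMDDhhmmZ).
generalized_time() {
    date -u -d "$1" +%Y%m%d%H%MZ
}

# The time used in the example above:
generalized_time "2013-03-21 08:34 UTC"    # prints 201303210834Z</programlisting>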
</section>
</section>
<section xml:id="troubleshoot-connections">
<title>Troubleshooting Client Operations</title>
<para>By default OpenDJ logs information about all LDAP client operations in
<filename>logs/access</filename>, and all HTTP client operations in
<filename>logs/http-access</filename>. The following lines are wrapped for
readability, showing a search for the entry with
<literal>uid=bjensen</literal> as traced in the LDAP access log. In the access
log itself, each line starts with a time stamp.</para>
<screen>[27/Jun/2011:17:23:00 +0200] CONNECT conn=19 from=127.0.0.1:56641
to=127.0.0.1:1389 protocol=LDAP
[27/Jun/2011:17:23:00 +0200] SEARCH REQ conn=19 op=0 msgID=1
base="dc=example,dc=com" scope=wholeSubtree filter="(uid=bjensen)" attrs="ALL"
[27/Jun/2011:17:23:00 +0200] SEARCH RES conn=19 op=0 msgID=1
result=0 nentries=1 etime=3
[27/Jun/2011:17:23:00 +0200] UNBIND REQ conn=19 op=1 msgID=2
[27/Jun/2011:17:23:00 +0200] DISCONNECT conn=19 reason="Client Unbind"</screen>
<para>As you see, each client connection and set of LDAP operations are
traced, starting with a time stamp and information about the operation
performed, then including information about the connection, the operation
number for the sequence of operations performed by the client, a message
identification number, and additional information about the operation.</para>
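<para>Because each line carries a <literal>conn=</literal> value, and each
response carries a <literal>result=</literal> code, standard text tools can
narrow down the log. The following is a minimal sketch, assuming the log
format shown above with each record on a single line, as in the log file
itself; the function names are hypothetical.</para>
<programlisting language="shell"
># Print all log lines for one connection ID.
conn_ops() {
    grep -w "conn=$1" "$2"
}

# Print response lines whose result code is not 0 (success).
failed_ops() {
    grep " RES " "$1" | grep -v -w "result=0"
}

# Usage:
# conn_ops 19 /path/to/opendj/logs/access
# failed_ops /path/to/opendj/logs/access</programlisting>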
<para>To match HTTP client operations with related internal server operations,
first prevent OpenDJ from suppressing internal operations from the LDAP access
log by using the <command>dsconfig</command> command to set the LDAP access
log publisher <literal>suppress-internal-operations</literal> advanced
property to <literal>false</literal>. Then match the values of the
<literal>x-connection-id</literal> field in the HTTP access log with
<literal>conn=<replaceable>id</replaceable></literal> values in the LDAP
access log.</para>
<para>For example, consider an HTTP GET request for the <literal>_id</literal>
field of the user <literal>newuser</literal>, which is handled by connection 4
as shown in <filename>logs/http-access</filename>.</para>
<screen>- 192.168.0.12 bjensen 22/May/2013:16:27:52 +0200
GET /users/newuser?_fields=_id HTTP/1.1 200
curl/7.21.4 4 12</screen>
<para>With internal operations logged in <filename>logs/access</filename>,
log lines for the related operations have <literal>conn=4</literal>.</para>
<screen>[22/May/2013:16:27:52 +0200] CONNECT conn=4
from=192.168.0.12:63593 to=192.168.0.12:8080 protocol=HTTP/1.1
[22/May/2013:16:27:52 +0200] SEARCH REQ conn=4
op=0 msgID=0 base="ou=people,dc=example,dc=com" scope=wholeSubtree
filter="(&amp;(objectClass=inetOrgPerson)(uid=bjensen))" attrs="1.1"
[22/May/2013:16:27:52 +0200] SEARCH RES conn=4
op=0 msgID=0 result=0 nentries=1 etime=5
[22/May/2013:16:27:52 +0200] BIND REQ conn=4
op=1 msgID=1 version=3 type=SIMPLE
dn="uid=bjensen,ou=People,dc=example,dc=com"
[22/May/2013:16:27:52 +0200] BIND RES conn=4
op=1 msgID=1 result=0 authDN="uid=bjensen,ou=People,dc=example,dc=com"
etime=3
[22/May/2013:16:27:52 +0200] SEARCH REQ conn=4
op=2 msgID=2 base="uid=newuser,ou=people,dc=example,dc=com" scope=baseObject
filter="(objectClass=*)" attrs="uid,etag"
[22/May/2013:16:27:52 +0200] SEARCH RES conn=4
op=2 msgID=2 result=0 nentries=1 etime=4
[22/May/2013:16:27:52 +0200] UNBIND REQ conn=4
op=3 msgID=3
[22/May/2013:16:27:52 +0200] DISCONNECT conn=4
reason="Client Unbind"</screen>
<para>To help diagnose errors due to access permissions, OpenDJ supports the
get effective rights control. The control OID,
<literal>1.3.6.1.4.1.42.2.27.9.5.2</literal>, is not allowed by the default
global ACIs. You must therefore add access to use the get effective rights
control when not using it as Directory Manager.</para>
<section xml:id="troubleshoot-simple-paged-results">
<title>Clients Need Simple Paged Results Control</title>
<para>For Solaris and some versions of Linux you might see a message in
the OpenDJ access logs such as the following.</para>
<literallayout class="monospaced">
The request control with Object Identifier (OID) "1.2.840.113556.1.4.319"
cannot be used due to insufficient access rights</literallayout>
<para>This message means clients are trying to use the <link xlink:show="new"
xlink:href="http://tools.ietf.org/html/rfc2696">simple paged results
control</link> without authenticating. By default, OpenDJ includes a global
ACI that allows only authenticated users to use the control, as the following
listing of the access control handler properties shows.</para>
<screen>$ dsconfig \
--port 4444 \
--hostname opendj.example.com \
--bindDN "cn=Directory Manager" \
--bindPassword "password" \
get-access-control-handler-prop
Property : Value(s)
-----------:-------------------------------------------------------------------
enabled : true
global-aci : (extop="1.3.6.1.4.1.26027.1.6.1 || 1.3.6.1.4.1.26027.1.6.3 ||
...
: (targetcontrol="1.3.6.1.1.12 || 1.3.6.1.1.13.1 || 1.3.6.1.1.13.2
: || <emphasis role="strong">1.2.840.113556.1.4.319</emphasis> || 1.2.826.0.1.3344810.2.3 ||
: 2.16.840.1.113730.3.4.18 || 2.16.840.1.113730.3.4.9 ||
: 1.2.840.113556.1.4.473 || 1.3.6.1.4.1.42.2.27.9.5.9") (version
: 3.0; acl "Authenticated users control access"; allow(read)
: userdn="ldap:///all";), (targetcontrol="2.16.840.1.113730.3.4.2 ||
: 2.16.840.1.113730.3.4.17 || 2.16.840.1.113730.3.4.19 ||
: 1.3.6.1.4.1.4203.1.10.2 || 1.3.6.1.4.1.42.2.27.8.5.1 ||
: 2.16.840.1.113730.3.4.16") (version 3.0; acl "Anonymous control
: access"; allow(read) userdn="ldap:///anyone";)</screen>
<para>To grant anonymous (unauthenticated) users access to the control,
add the OID for the simple paged results control to the list of OIDs in
the <literal>Anonymous control access</literal> global ACI.</para>
<screen>$ dsconfig \
--port 4444 \
--hostname opendj.example.com \
--bindDN "cn=Directory Manager" \
--bindPassword "password" \
set-access-control-handler-prop \
--remove global-aci:"(targetcontrol=\"2.16.840.1.113730.3.4.2 || \
2.16.840.1.113730.3.4.17 || 2.16.840.1.113730.3.4.19 || \
1.3.6.1.4.1.4203.1.10.2 || 1.3.6.1.4.1.42.2.27.8.5.1 || \
2.16.840.1.113730.3.4.16\") (version 3.0; acl \"Anonymous control access\"; \
allow(read) userdn=\"ldap:///anyone\";)" \
--add global-aci:"(targetcontrol=\"2.16.840.1.113730.3.4.2 || \
2.16.840.1.113730.3.4.17 || 2.16.840.1.113730.3.4.19 || \
1.3.6.1.4.1.4203.1.10.2 || 1.3.6.1.4.1.42.2.27.8.5.1 || \
2.16.840.1.113730.3.4.16 || <emphasis role="strong">1.2.840.113556.1.4.319</emphasis>\") \
(version 3.0; acl \"Anonymous control access\"; allow(read) \
userdn=\"ldap:///anyone\";)" \
--no-prompt</screen>
<para>Alternatively, stop OpenDJ, edit the corresponding ACI carefully in
<filename>/path/to/opendj/config/config.ldif</filename>, and restart OpenDJ.
<footnote><para>Unlike the <command>dsconfig</command> command, the
<filename>config.ldif</filename> file is not a public interface, so this
alternative should not be used in production.</para></footnote></para>
</section>
</section>
<section xml:id="troubleshoot-repl">
<title>Troubleshooting Replication</title>
<indexterm>
<primary>Replication</primary>
<secondary>Troubleshooting</secondary>
</indexterm>
<para>Replication can generally recover from conflicts and transient issues.
Replication does, however, require that update operations be copied
from server to server. It is therefore possible to experience temporary
delays while replicas converge, especially when the write operation load is
heavy. OpenDJ's tolerance for temporary divergence between replicas is what
allows OpenDJ to remain available to serve client applications even when
networks linking the replicas go down.</para>
<para>In other words, the fact that directory services are loosely convergent
rather than transactional is a feature, not a bug.</para>
<para>That said, you may encounter errors. Replication uses its own error log
file, <filename>logs/replication</filename>. Messages in the log file
have <literal>category=SYNC</literal>, and take the following form, shown
here with the line folded for readability.</para>
<screen>[27/Jun/2011:14:37:48 +0200] category=SYNC severity=INFORMATION msgID=14680169
msg=Replication server accepted a connection from 10.10.0.10/10.10.0.10:52859
to local address 0.0.0.0/0.0.0.0:8989 but the SSL handshake failed. This is
probably benign, but may indicate a transient network outage or a
misconfigured client application connecting to this replication server.
The error was: Remote host closed connection during handshake</screen>
<para>OpenDJ maintains historical information about changes in order to
bring replicas up to date, and to resolve replication conflicts. To prevent
historical information from growing without limit, OpenDJ purges historical
information after a configurable delay
(<literal>replication-purge-delay</literal>, default: 3 days). A replica
can become irrecoverably out of sync if you restore it from a backup archive
older than the purge delay, or if you stop it for longer than the purge
delay. If this happens, disable the replica, and then reinitialize it
from a recent backup or from a server that is up to date.</para>
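<para>Before restoring from a backup or restarting a replica that has been
stopped for a while, it can help to check the purge delay in effect. The
following is a sketch only. It assumes the server you connect to hosts a
replication server, and that the synchronization provider has the name
<literal>Multimaster Synchronization</literal>; verify both for your
deployment. Connection settings mirror those used elsewhere in this
chapter.</para>
<screen>$ dsconfig \
get-replication-server-prop \
--port 4444 \
--hostname opendj.example.com \
--bindDN "cn=Directory Manager" \
--bindPassword "password" \
--provider-name "Multimaster Synchronization" \
--property replication-purge-delay \
--no-prompt</screen>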
</section>
<section xml:id="troubleshoot-get-help">
<title>Asking For Help</title>
<para>When you cannot resolve a problem yourself and want to ask for help,
clearly identify the problem, how you reproduce it, and the version of
OpenDJ you use to reproduce it. The version includes both a version number
and a build time stamp.</para>
<screen>$ dsconfig --version
OpenDJ <?eval ${docTargetVersion}?>
Build <replaceable>yyyymmddhhmmss</replaceable>Z</screen>
<itemizedlist>
<para>Be ready to provide additional information, too.</para>
<listitem>
<para>The output from the <command>java -version</command> command</para>
</listitem>
<listitem>
<para><filename>access</filename> and <filename>errors</filename> logs
showing what the server was doing when the problem started occurring</para>
</listitem>
<listitem>
<para>A copy of the server configuration file,
<filename>config/config.ldif</filename>, in use when the problem started
occurring</para>
</listitem>
<listitem>
<para>Other relevant logs or output, such as those from client applications
experiencing the problem</para>
</listitem>
<listitem>
<para>A description of the environment where OpenDJ is running, including
system characteristics, host names, IP addresses, Java versions, storage
characteristics, and network characteristics, which helps others interpret
the logs and the other information you provide</para>
</listitem>
</itemizedlist>
</section>
</chapter>