<refsect1 id='failover'>
    <title>FAILOVER</title>
    <para>
        The failover feature allows back ends to automatically switch to
        a different server if the current server fails.
    </para>
    <refsect2 id='failover_syntax'>
        <title>Failover Syntax</title>
        <para>
            The list of servers is given as a comma-separated list; any
            number of spaces is allowed around the comma. The servers are
            listed in order of preference. The list can contain any number
            of servers.
        </para>
        <para>
            For each failover-enabled config option, two variants exist:
            <emphasis>primary</emphasis> and <emphasis>backup</emphasis>.
            Servers in the primary list are preferred, and backup servers
            are only searched if no primary server can be reached. If a
            backup server is selected, a timeout of 31 seconds is set.
            After this timeout, SSSD periodically tries to reconnect to
            one of the primary servers. If it succeeds, the primary server
            replaces the currently active (backup) server.
        </para>
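        <para>
            As an illustration only, the following sketch shows the
            primary and backup variants for an LDAP-based domain, using
            the <quote>ldap_uri</quote> and
            <quote>ldap_backup_uri</quote> options. The domain name and
            all host names are placeholders and do not refer to any
            real deployment.
<programlisting>
[domain/example.com]
id_provider = ldap
# Primary servers, in order of preference
ldap_uri = ldap://ldap1.example.com, ldap://ldap2.example.com
# Backup servers, tried only if no primary server can be reached
ldap_backup_uri = ldap://backup1.example.com
</programlisting>
        </para>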
    </refsect2>
    <refsect2 id='failover_mechanism'>
        <title>The Failover Mechanism</title>
        <para>
            The failover mechanism distinguishes between a machine and a
            service. The back end first tries to resolve the hostname of a
            given machine; if this resolution attempt fails, the machine is
            considered offline. No further attempts are made to connect
            to this machine for any other service. If the resolution
            attempt succeeds, the back end tries to connect to a service
            on this machine. If the service connection attempt fails,
            then only this particular service is considered offline and
            the back end automatically switches over to the next service.
            The machine is still considered online and might still be tried
            for another service.
        </para>
        <para>
            Further connection attempts are made to machines or services
            marked as offline after a specified period of time; this is
            currently hard-coded to 30 seconds.
        </para>
        <para>
            If there are no more machines to try, the back end as a whole
            switches to offline mode and then attempts to reconnect
            every 30 seconds.
        </para>
    </refsect2>
    <refsect2 id='failover_tuning'>
        <title>Failover Timeouts and Tuning</title>
        <para>
            Resolving a server to connect to can be as simple as running
            a single DNS query or can involve several steps, such as finding
            the correct site or trying out multiple host names in case some
            of the configured servers are not reachable. The more complex
            scenarios can take some time, and SSSD needs to strike a balance
            between allowing enough time to finish the resolution process
            and not trying for too long before falling back to offline
            mode. If the SSSD debug logs show that the server resolution
            is timing out before a live server is contacted, consider
            changing the timeouts.
        </para>
        <para>
            This section lists the available tunables. Please refer to their
            descriptions in the
            <citerefentry>
                <refentrytitle>sssd.conf</refentrytitle><manvolnum>5</manvolnum>
            </citerefentry>
            manual page.
            <variablelist>
                <varlistentry>
                    <term>
                        dns_resolver_op_timeout
                    </term>
                    <listitem>
                        <para>
                            How long SSSD talks to a single DNS server.
                        </para>
                    </listitem>
                </varlistentry>
                <varlistentry>
                    <term>
                        dns_resolver_timeout
                    </term>
                    <listitem>
                        <para>
                            How long SSSD tries to resolve a failover
                            service. Internally, this service resolution
                            might include several steps, such as resolving
                            DNS SRV queries or locating the site.
                        </para>
                    </listitem>
                </varlistentry>
            </variablelist>
        </para>
        <para>
            For LDAP-based providers, the resolve operation is performed
            as part of an LDAP connection operation. Therefore, the
            <quote>ldap_opt_timeout</quote> timeout should also be set to
            a larger value than <quote>dns_resolver_timeout</quote>,
            which in turn should be set to a larger value than
            <quote>dns_resolver_op_timeout</quote>.
        </para>
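        <para>
            A minimal sketch of settings that satisfy this ordering for
            an LDAP-based domain follows. The domain name is a
            placeholder, and the values are chosen only to illustrate
            the ordering; they are not tuning recommendations.
<programlisting>
[domain/example.com]
# Shortest: time spent on a single DNS server
dns_resolver_op_timeout = 3
# Longer: the whole failover service resolution
dns_resolver_timeout = 6
# Longest: the LDAP connection operation, which includes the resolution
ldap_opt_timeout = 8
</programlisting>
        </para>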
    </refsect2>
</refsect1>