<?xml version="1.0" encoding="UTF-8"?>
<!--
  ! CCPL HEADER START
  !
  ! This work is licensed under the Creative Commons
  ! Attribution-NonCommercial-NoDerivs 3.0 Unported License.
  ! To view a copy of this license, visit
  ! http://creativecommons.org/licenses/by-nc-nd/3.0/
  ! or send a letter to Creative Commons, 444 Castro Street,
  ! Suite 900, Mountain View, California, 94041, USA.
  !
  ! You can also obtain a copy of the license at
  ! legal/CC-BY-NC-ND.txt.
  ! See the License for the specific language governing permissions
  ! and limitations under the License.
  !
  ! If applicable, add the following below this CCPL HEADER, with the fields
  ! enclosed by brackets "[]" replaced with your own identifying information:
  !      Portions Copyright [yyyy] [name of copyright owner]
  !
  ! CCPL HEADER END
  !
  !      Copyright 2011-2012 ForgeRock AS
  !
-->
<chapter xml:id='chap-synchronization'
 xmlns='http://docbook.org/ns/docbook'
 version='5.0' xml:lang='en'
 xmlns:xsi='http://www.w3.org/2001/XMLSchema-instance'
 xsi:schemaLocation='http://docbook.org/ns/docbook http://docbook.org/xml/5.0/xsd/docbook.xsd'
 xmlns:xlink='http://www.w3.org/1999/xlink'
 xmlns:xinclude='http://www.w3.org/2001/XInclude'>
 <title>Configuring Synchronization</title>

 <section xml:id="synchronization-section-id"><title>Introduction</title>
 <para>Synchronization of users and other objects between different resources and the repository is one of the key tasks of identity management in general and of OpenIDM in particular. Synchronization is a complex task, and so is its configuration.</para>
 </section>

 <section xml:id="sync-triggers-id"><title>Different Triggers to Start Synchronization</title>
  <para>A synchronization engine is expected to update changed objects automatically. Two mechanisms trigger the update of changed objects: a push mechanism for changes made to objects inside OpenIDM, and a pull mechanism with which OpenIDM polls external resources for changes.</para>
  <section><title>Push Updates to External Resources</title>
   <para>If a change is applied to an object in OpenIDM itself, for instance through the REST interface, then OpenIDM immediately pushes the change to all external resources that are configured to receive such updates. You do not need to start a reconciliation or LiveSync process to make this happen.</para>
  </section>
  <section><title>Pull Updates from External Resources</title>
   <para>OpenIDM has two mechanisms to pull changes from external resources:</para>
   <itemizedlist>
    <listitem><para>Reconciliation</para></listitem>
    <listitem><para>LiveSync</para></listitem>
   </itemizedlist>
   <section><title>Reconciliation</title>
   <para>In identity management <firstterm>reconciliation</firstterm> is the
   process of bidirectional synchronization of objects between different data
   stores. Reconciliation applies primarily to user objects, though other
   objects such as groups or roles can also be reconciled. To perform
   reconciliation, OpenIDM must analyze both the source and target systems
   to uncover differences that it must reconcile.</para>
   <para>In OpenIDM, reconciliation picks up changes from external resources
   based on the mapping properties you configure in the <literal>sync</literal>
   configuration, exposed in the <filename>openidm/conf/sync.json</filename>
   configuration file, or in the scheduler configuration. You can add the
   mapping to the scheduler for example when the mapping for reconciliation
   needs to be different from the mapping for LiveSync.</para>
  </section>
  <section><title>LiveSync</title>
   <para>OpenIDM <firstterm>LiveSync</firstterm> makes use of a change log on
   the resource to run as a pull process. LiveSync is configured using the
   same mapping properties as reconciliation. For LiveSync, OpenIDM applies
   the mapping defined in <filename>openidm/conf/sync.json</filename>.</para>
   </section>
  </section>
  <section><title>Triggered Updates</title>
   <para>To pick up changes to objects on external resources, OpenIDM must be told to do so by a trigger mechanism. Two processes, reconciliation and LiveSync, are available to pick up external changes. Both can be triggered through a scheduler configuration as described in the chapter "Scheduled Tasks and Events".</para>
   <para>Reconciliation can also be triggered over the REST interface with an HTTP POST to the URL <literal>http://&lt;hostname&gt;:&lt;port&gt;/openidm/sync?_action=recon&amp;mapping=&lt;name of the mapping&gt;</literal>. Here is an example using the curl command:</para>
   <example>
    <programlisting>
$ curl -X POST "http://localhost:8080/openidm/sync?_action=recon&amp;mapping=systemLdapAccounts_managedUser"
    </programlisting>
   </example>
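   <para>The scheduler-based trigger mentioned above might be configured along the lines of the following sketch, which starts a reconciliation for one mapping every night at 02:00. The property names <literal>invokeService</literal> and <literal>invokeContext</literal>, the <literal>reconcile</literal> action, and the Quartz-style cron expression are assumptions made for illustration. Check them against the chapter "Scheduled Tasks and Events" before use. The mapping name is the one used in the curl example.</para>
   <example>
    <title>Sketch of a Schedule Configuration for Reconciliation (assumed property names)</title>
    <programlisting>
{
    "enabled" : true,
    "type" : "cron",
    "schedule" : "0 0 2 * * ?",
    "invokeService" : "sync",
    "invokeContext" : {
        "action" : "reconcile",
        "mapping" : "systemLdapAccounts_managedUser"
    }
}
    </programlisting>
   </example>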
  </section>

  <section xml:id="livesync-reconciliation">
   <title>LiveSync &amp; Reconciliation</title>

   <para>LiveSync is intended to react quickly to changes as they happen. Yet,
   LiveSync is a best effort mechanism that in some cases can miss
   changes.</para>

   <para>Furthermore, not all resources support LiveSync's event-style
   processing. The resource must provide OpenIDM with a list of changed
   objects such as OpenDJ does with its external change log, where OpenIDM
   needs to store only the last change it encountered between requests to
   search the external change log. Active Directory also provides a change
   log, for example.</para>

   <para>Reconciliation is more thorough, and can recognize more system
   conditions in addition to catching changes that LiveSync could miss.
   Reconciliation is the basis for compliance and reporting
   functionality.</para>

   <para>Reconciliation can be a heavyweight process. When working with large
   data sets, finding all the changes can be more work than processing the
   changes.</para>

  </section>
 </section>

 <section xml:id="sync-data-model-id"><title>The Data Model of OpenIDM</title>
  <section><title>Repository Attributes</title>
   <para>Identity management software follows one of two approaches to the data that is persisted in the identity manager's repository. Some deployments prefer a metadirectory-like configuration, where almost every attribute of the connected external resources is mirrored in the local repository. Others prefer a setup where only a minimum set of attributes is stored locally, and all other attributes that a user might have are loaded on demand in a transient way, that is, they are not persisted in the local store. Both approaches have advantages and disadvantages. The metadirectory approach gives fast access to the data, which is effectively cached locally, at the risk of the data being out of date. The minimum data approach guarantees that the latest version of the data is shown, at the cost of performance.</para>
   <para>OpenIDM does not define any schema or fixed set of attributes to be stored in the repository. Any attribute that is configured as a target in a mapping pointing to the repository is stored there. The schema of the repository database is updated dynamically when a new attribute is mapped. OpenIDM can therefore be used for both scenarios, the metadirectory scenario and the minimum data scenario, or for any solution in between.</para>
  </section>
  <section><title>Using SCIM</title>
   <para>OpenIDM lets you define the schema for storing data in the repository.
   One possibility is to follow the attribute names defined in the Simple Cloud
   Identity Management (<link
   xlink:href="http://www.simplecloud.info/specs/draft-scim-core-schema-00.html"
   >SCIM</link>) specification.</para>
   <!-- TODO: Provide the configuration for SCIM, preferably not just as
              documentation. -->
  </section>
  <section><title>Components for Data Synchronization</title>
   <para>OpenIDM involves two types of configuration file and one database table in the synchronization process: sync.json, the provisioner.&lt;resourceName&gt;.json files, and the link table. Administrators use the configuration files to configure synchronization, whereas OpenIDM itself maintains the link table to link objects on different resources (or in the repository) that belong to the same user.</para>
  </section>
 </section>

 <section xml:id="basic-flow-sec-id"><title>Basic Data Flow Configuration</title>
  <section><title>Config Files Involved</title>
   <para>Before the value of an attribute from an external resource is stored in the OpenIDM repository, or in another external resource, it passes through several configuration files in which the attribute name can be mapped and the attribute value can be changed or combined with other attributes.</para>
   <itemizedlist>
    <listitem><para>provisioner.&lt;resource name&gt;.json - one file for each resource</para></listitem>
    <listitem><para>sync.json - one file per OpenIDM installation</para></listitem>
   </itemizedlist>
   <para>Furthermore, not all attributes of a user on the external resource need to be synchronized, and not even all accounts need to be synchronized. Much of the configuration, especially attribute name mapping, is done in the individual provisioner.&lt;name&gt;.json files as described in chapter <olink targetdoc="chap-resource-conf.xml" targetptr="connect-prop-id">Resource Connector Configuration</olink>. The other important configuration file is sync.json, which contains the core configuration of how objects and attributes that flow through OpenIDM are handled.</para>
   <section><title>Attribute Name Mapping in provisioner.&lt;resource name&gt;.json</title>
    <para>The attribute name mapping in the provisioner files is configured in the "objectTypes" part. As described in the chapter <olink targetdoc="chap-resource-conf.xml" targetptr="connect-prop-id">Resource Connector Configuration</olink>, each attribute is mapped from its nativeName, that is, the attribute name as known to the external resource, to an attribute name that is known to OpenIDM's sync engine. In addition to the attribute name mapping, the attribute's type and whether it is single-valued or multivalued are configured here. In the following example, "lastName" is a single-valued attribute of type "string", and "homePhone" is a multivalued attribute whose values are also of type "string".</para>
    <example xml:id="ldap-provisioner">
     <title>Attribute Mapping in Provisioner Configuration</title>
     <programlisting>
{
    "name" : "myLdap",
    ...
    "objectTypes" : {
        "account" : {
            ...
            "lastName" : {
                "type" : "string",
                "required" : true,
                "nativeName" : "sn",
                "nativeType" : "string"
            },
            "homePhone" : {
                "type" : "array",
                "items" : {
                    "type" : "string",
                    "nativeType" : "string"
                },
                "nativeName" : "homePhone",
                "nativeType" : "string"
            }
            ...
        }
    }
}
     </programlisting>
     <caption><para>Two attributes as defined in a provisioner configuration</para></caption>
    </example>
    <para>No attribute construction or value manipulation is configured in the provisioner configurations.</para>
   </section>
   <section><title>The sync.json</title>
    <para>The sync.json file is the core configuration for the synchronization engine. It defines which objects are synchronized and how attribute values are manipulated, for example by combining or transforming them. The sync.json file contains a number of mappings. Each mapping connects two resources (the repository counts as a resource here) and is identified by a unique "name" property.</para>
    <section><title>Mappings and the link Table</title>
     <para>Each mapping in the sync.json file contains the configuration between a source and a target resource. A resource can serve as the source in one mapping and as the target in another. It is very common for a pair of resources that are synchronized with each other to appear in two mappings.</para>
     <para>For instance, if an LDAP resource called myLDAP is synchronized with the local repository in both directions, then there is one mapping with myLDAP as the source and the repository as the target, and a second mapping with the repository as the source and myLDAP as the target.</para>
     <para>OpenIDM records in a link table which object in the repository belongs to which object in myLDAP, that is, which two objects represent the same user on the different systems. On the first synchronization, OpenIDM creates a link type for the new mapping. In most cases it is desirable for the two mappings to use the same link information, since the connection between the two objects is the same in both directions. The rules are:</para>
     <itemizedlist>
      <listitem><para>If a mapping does not have a "links" property, then a new link type is created for the mapping during the first synchronization.</para></listitem>
      <listitem><para>If the second mapping should use the same link type, then it needs a "links" property with the name of the first mapping as its value.</para></listitem>
     </itemizedlist>
     <example xml:id="links-example-id">
      <title>Handling Links in Mappings</title>
      <programlisting>
{
    "mappings" : [
        {
            "name" : "systemmyLDAPAccounts_managedUser",
            "source" : "system/myLDAP/account",
            "target" : "managed/user",
            ...
         },{
            "name" : "managedUser_systemMyLDAPAccounts",
            "source" : "managed/user",
            "target" : "system/myLDAP/account",
            "links" : "systemmyLDAPAccounts_managedUser",
            ...
         }]
}
      </programlisting>
     </example>
     <para>In <xref linkend="links-example-id"/>, the first mapping does not have a "links" property. The associated link type is therefore named after the mapping, "systemmyLDAPAccounts_managedUser". The second mapping has a "links" property that names the first mapping, so both mappings maintain the same links.</para>
    </section>
    <section><title>Selecting the participating Resources</title>
     <para>Each resource usually gets one or two mappings in the sync.json file, one for each direction. A mapping identifies its resources in the "source" and "target" properties, each of which holds a "/" separated path: paths starting with "managed" point to objects in the repository, whereas paths starting with "system" point to external resources. The rightmost portion of the path specifies the type of object that is handled: in <xref linkend="basic-ldap-mapping"/> the "target" with value "managed/user" points to objects of type user in the repository. In the same example, the "source" property points to the resource named "myLdap" and the object type "account". This is the object type that is defined in the "objectTypes" part of the provisioner configuration as shown in <xref linkend="ldap-provisioner"/>.</para>
     <figure xml:id="object-paths-figure">
     <title>Namespaces and Object Paths in the OpenIDM Object Model</title>
      <mediaobject>
       <imageobject>
        <imagedata fileref="images/ServiceTree.png" format="PNG"/>
       </imageobject>
      </mediaobject>
     </figure>
    </section>
    <section><title>Selecting the Objects</title>
     <para>By default, the OpenIDM sync engine handles all objects of the relevant type that the resource connector delivers. The possibilities for filtering accounts on the connector side depend on the connector itself. With the standard LDAP connector, for instance, it is possible to limit the scope by the base DN, by specifying an object class, or by a standard LDAP filter.</para>
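     <para>For an LDAP resource, such connector-side filtering might be sketched as follows in the provisioner configuration. The property names <literal>baseContexts</literal>, <literal>accountObjectClasses</literal>, and <literal>accountSearchFilter</literal> are assumed here from the OpenICF LDAP connector, and the values, including the <literal>employeeType</literal> filter, are purely illustrative.</para>
     <example>
      <title>Sketch of Connector-Side Filtering for an LDAP Resource (assumed property names)</title>
      <programlisting>
{
    "name" : "myLdap",
    ...
    "configurationProperties" : {
        ...
        "baseContexts" : [ "ou=people,dc=example,dc=com" ],
        "accountObjectClasses" : [ "top", "person", "organizationalPerson", "inetOrgPerson" ],
        "accountSearchFilter" : "(employeeType=internal)"
    }
    ...
}
      </programlisting>
     </example>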
     <para>In the sync engine itself, further filtering can be done with the mapping properties validSource and validTarget. Both properties take a script as their value and are described below. See <xref linkend="jscript-id"/> for how to use JavaScript here.</para>
     <variablelist>
      <varlistentry>
       <term>validSource</term>
       <listitem>
        <para>A script that determines if a source object is valid to be
         mapped. The script yields a boolean value: <literal>true</literal>
         indicates the source object is valid; <literal>false</literal> can be
         used to defer mapping until some condition is met. In the root scope,
         the source object is provided in the <literal>"source"</literal>
         property. If the script is not specified, then all source objects are
         considered valid.</para>
       </listitem>
      </varlistentry>
      <varlistentry>
       <term>validTarget</term>
       <listitem>
        <para>A script used during reconciliation that determines if a target
         object is valid to be mapped. The script yields a boolean value:
         <literal>true</literal> indicates the target object is valid;
         <literal>false</literal> indicates that the target object should not be
         included in reconciliation. In the root scope, the target object is
         provided in the <literal>"target"</literal> property. If the script is
         not specified, then all target objects are considered valid for
         mapping.</para>
       </listitem>
      </varlistentry>
     </variablelist>
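     <para>As a minimal sketch, the two filters might be added to a mapping as follows. The attribute <literal>employeeType</literal> and the value <literal>'internal'</literal> are hypothetical and serve only to illustrate a script that yields a boolean value. The mapping name and paths are taken from the LDAP examples in this chapter.</para>
     <example>
      <title>Sketch of validSource and validTarget in a Mapping (hypothetical attribute)</title>
      <programlisting>
{
    "name" : "systemLdapAccounts_managedUser",
    "source" : "system/myLdap/account",
    "target" : "managed/user",
    "validSource" : {
        "type" : "text/javascript",
        "source" : "source.employeeType == 'internal'"
    },
    "validTarget" : {
        "type" : "text/javascript",
        "source" : "target.employeeType == 'internal'"
    },
    ...
}
      </programlisting>
     </example>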
    </section>
    <section><title>Selecting the Attributes</title>
     <para>As with objects, the list of attributes that are synchronized is configured in both the provisioner configuration and sync.json. The connector handles only those attributes that are set up in the "objectTypes" part of its provisioner configuration. In <xref linkend="ldap-provisioner"/>, the connector hands over the attributes named lastName and homePhone to the sync engine. In the mapping shown in <xref linkend="basic-ldap-mapping"/>, the two attributes are configured as sources, and their values are stored as familyName and homePhone in the target system, which is the repository in this example. This is set in the "properties" part of the mapping.</para>
    </section>
  <section><title>Example Mapping LDAP to Repo</title>
   <example xml:id="basic-ldap-mapping">
    <title>An example mapping in sync.json which maps users from an LDAP resource into the OpenIDM repository</title>
    <programlisting>
{
    "mappings" : [
        {
            "name" : "systemLdapAccounts_managedUser",
            "source" : "system/myLdap/account",
            "target" : "managed/user",
            "properties" : [
                { "target" : "familyName", "source" : "lastName" },
                { "target" : "homePhone", "source" : "homePhone" },
                { "target" : "phoneExtension",  "default" : "0047" },
                { "target" : "mail",
                    "comment" : "mail is only set if there is a value coming from the resource.",
                    "source" : "email",
                    "condition" : {
                      "type" : "text/javascript",
                      "source" : "(source.email != null)"
                      }
                },
                { "target" : "displayName",
                    "transform" : {
                       "type" : "text/javascript",
                       "source" : "source.lastName + ', ' + source.firstName"
                    }
                }
            ]
        }
    ]
}
    </programlisting>
   </example>
  </section>
    <section><title>Attribute Handling in Mappings</title>
     <para>As pointed out in the previous section, each attribute that is handled by the sync engine needs an entry in the "properties" part of the mapping between the source and target resources. Each entry in "properties" must specify at least the "target", which is the name of the attribute in which the value is stored, for instance "displayName" in <xref linkend="basic-ldap-mapping"/>.</para>
     <para>In the case of "familyName" and "homePhone", the value is simply taken from the source object. The "source" property holds the name of the attribute as exposed by the connector.</para>
     <para>For the next target in the example, "phoneExtension", no value is available in the source. It is still possible to set a default value, "0047" here, which is set to the same value for every user. If this is not appropriate, it might be useful to put a condition on the property, as shown in the example for the mail address.</para>
     <para>The example assumes that the resource provides a mail address for some users, but not for all. The mail address is set only if there is a value coming from the resource. If the script in the "condition" property returns false, then the target "mail" is not synchronized.</para>
     <para>The target "mail" also shows how comments can be set in the "properties".</para>
     <para>Another frequent requirement is to construct attribute values during synchronization. This is done with the "transform" property, which again takes a script. In the example, "transform" is used to construct the "displayName" from three components: the lastName attribute from the resource, the literal ", ", and the firstName attribute from the resource.</para>
     <para>See <xref linkend="jscript-id"/> for more details on how to use scripts in the mapping.</para>
     <section xml:id="jscript-id"><title>Using JavaScript</title>
      <para>Much of the flexibility of the OpenIDM sync engine comes from the ability to use scripts in the mappings. Currently only JavaScript is available, but other scripting languages such as Groovy are expected to follow. As <xref linkend="jscript-examples-id"/> shows, scripts can be included in two ways: by adding the script text to a "source" property, or by referencing a file that contains the script in a "file" property. In the example, the "correlationQuery" property refers to a file that contains the script. The path is either absolute or relative to the openidm folder. For ease of maintenance, it is recommended to use relative paths and to store all scripts in the folder "openidm/script", which exists by default.</para>
      <section><title>Available Values in Scripts in sync.json</title>
       <para>During synchronization between a source and a target resource, the two objects "source" and "target" are always available. The source object contains all the attributes coming from the source, and the target object contains the attributes that will be sent to the target. All attributes of these objects are accessible in scripts with the syntax source.&lt;attribute name&gt; and target.&lt;attribute name&gt;. This syntax is used, for instance, in <xref linkend="basic-ldap-mapping"/>. Scripts can also write to attributes of the target object, as the "onUpdate" script in <xref linkend="jscript-examples-id"/> does.</para>
       <!-- TODO: Document the values available to scripts: source, target,
                  object, DB queries (read). -->
      <example xml:id="jscript-examples-id">
       <title>Two examples on how to include JavaScript in a mapping in sync.json</title>
       <programlisting language="javascript">
"correlationQuery" : {
    "type" : "text/javascript",
    "file" : "script/ldapBackCorrelationQuery.js"
},
"onUpdate" : {
    "type" : "text/javascript",
    "source" : "if ((source.email != null) &amp;&amp; (source.email.length &gt; 0)) {target.mail = source.email;}"
}</programlisting>
      </example>
      <para>If the JavaScript is written directly in the mapping as a "source" property, then the value is configured as a JSON string and enclosed in double quotes ("..."). Double quotes therefore cannot be used unescaped inside the JavaScript, for instance for strings. Use single quotes instead, as shown in the "displayName" property in <xref linkend="basic-ldap-mapping"/>.</para>
      </section>
     </section>
     <section><title>Using Encrypted Values</title>
      <para>Sometimes it is desirable to encrypt not only the user's password but also other attributes when they are stored in the repository. Examples include answers to authentication questions, credit card numbers, and social security numbers.</para>
      <para>The attributes to encrypt before they are stored in the repository are configured in the file openidm/conf/managed.json as shown in <xref linkend="managed-example-id"/>.</para>
      <para>The original purpose of the file is to define the managed objects that are persisted in the repository. For instance, to store groups in the repository, the file needs an additional entry with "name" : "group". Groups are then available for synchronization as "target" : "managed/group".</para>
      <example xml:id="managed-example-id">
       <title>Encrypting Attributes</title>
        <programlisting>
{
    "objects" : [
        {
            "name" : "user",
            "properties" : [
                {
                    "name" : "securityanswer",
                    "encryption" : { "key" : "openidm-sym-default" }
                },
                {
                    "name" : "ssn",
                    "encryption" : { "key" : "openidm-sym-default" }
                },
                {
                    "name" : "password",
                    "encryption" : { "key" : "openidm-sym-default" }
                }
            ],
            "onStore" : {
                "type" : "text/javascript",
                "file" : "script/encryptExtraPassword.js"
            },
            "onRetrieve" : {
                "type" : "text/javascript",
                "file" : "script/decrypteExtraPassword.js"
            }
        }
    ]
}
        </programlisting>
        <para>Do not use the default symmetric key in production. See the
        chapter on <link xlink:href="integrators-guide#chap-security"
        xlink:role="http://docbook.org/xlink/role/olink"><citetitle>Securing
        &amp; Hardening OpenIDM</citetitle></link> for more.</para>
      </example>
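      <para>The additional managed object for groups described above might be declared as in the following sketch. Only the "name" entry is taken from the description in this section. Whether a group object needs further properties depends on the deployment and is not shown here.</para>
      <example>
       <title>Sketch of a Managed Group Object in managed.json</title>
       <programlisting>
{
    "objects" : [
        {
            "name" : "user",
            ...
        },
        {
            "name" : "group"
        }
    ]
}
       </programlisting>
      </example>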
     </section>
     <section><title>All time Construction</title>
      <para>All actions and mappings that are defined in the "properties" and "policies" parts are evaluated on each run of the sync engine. The mappings in "properties" are executed whenever the action taken for the situation found is "UPDATE" or "CREATE". If an action must be executed in only one of these cases, use the "onCreate" and "onUpdate" properties as described below.</para>
     </section>
     <section><title>Event Based Construction</title>
      <para>Sometimes an attribute must be constructed or manipulated only when an object is created, and sometimes only when an existing object is updated.</para>
      <section><title>OnCreate</title>
       <para>A frequently used "onCreate" action is to construct the DN of a new user on an LDAP server. After the user has been created successfully, the DN is stored in the target part of the link table, and there is no need to construct it again during updates. An example is shown below.</para>
       <example>
        <title>Construction of the dn in an onCreate property</title>
        <programlisting>
         "onCreate" : {
             "type" : "text/javascript",
             "source" : "target.dn = 'uid=' + source.uid + ',ou=people,dc=example,dc=com'"
         }
        </programlisting>
       </example>
      </section>
      <section><title>OnUpdate</title>
      <para>As with the "onCreate" action, it might be necessary to manipulate an attribute value only during the update of an object. In the example below, the mail address is updated only if the source email address has a value. The mail address that is present during creation might not be valid, and therefore it should not be synchronized at that time.</para>
      <example xml:id="onUpdet-example-id">
       <title>An onUpdate Script</title>
       <programlisting>
        "onUpdate" : {
            "type" : "text/javascript",
            "source" : "if ((source.email != null) &amp;&amp; (source.email.length > 0)) {target.mail = source.email;}"
        }
       </programlisting>
      </example>
      <para>More actions on specific situations are explained in chapter <xref linkend="advance-dataflow-id"/> below.</para>
      </section>
     </section>
    </section>
   </section>
  </section>
 </section>

 <section><title>Handling of different Situations of an Object</title>
  <section><title>Finding the right Situation</title>
   <para>During synchronization, objects are categorized into different situations, depending on whether an object exists in the source, in the target, or in both, and on whether a link between the two objects on the different resources is already registered. If a situation and its action are not defined in a synchronization mapping, then a default action is taken. The default actions are listed below.</para>
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href='/shared/sec-syncSituations.xml' />
  </section>
  <section><title>Reaction on a given Situation</title>
   <para>In each mapping, you can define the action to take for each situation.</para>
   <example>
    <title>A typical "policies" section of a Mapping in the configuration sync.json</title>
    <programlisting language="javascript">
        "policies" : [ {
            "situation" : "CONFIRMED",
            "action" : "UPDATE"
        }, {
            "situation" : "FOUND",
            "action" : "IGNORE"
        }, {
            "situation" : "ABSENT",
            "action" : "CREATE"
        }, {
            "situation" : "AMBIGUOUS",
            "action" : "IGNORE"
        }, {
            "situation" : "MISSING",
            "action" : "IGNORE"
        }, {
            "situation" : "UNQUALIFIED",
            "action" : "IGNORE"
        }, {
            "situation" : "UNASSIGNED",
            "action" : "IGNORE"
        } ]
    </programlisting>
   </example>
   <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="/shared/sec-syncActions.xml"/>
  </section>
 </section>

 <section><title>Detailed Steps of the OpenIDM sync Engine</title>
 <para>OpenIDM's sync engine performs synchronization in two steps, each of which works from a different list:</para>
  <variablelist>
   <varlistentry>
    <term>Source Reconciliation</term>
    <listitem><para>The lists taken into account here are the objects found on the source resource and the links that are associated with the mapping.</para></listitem>
   </varlistentry>
   <varlistentry>
    <term>Target Reconciliation</term>
    <listitem><para>Here the sync engine iterates over the objects from the target that have not yet been processed.</para></listitem>
   </varlistentry>
  </variablelist>
  <section><title>Source Reconciliation</title>
   <para>When OpenIDM starts a reconciliation or LiveSync, it first gets a list of objects from the resource. For LiveSync, the list contains only the objects that have changed. For reconciliation, it contains all objects that are available through the connector. Keep in mind that the connector might already filter out objects, for example through the base context or a search filter in the case of an LDAP resource. In addition, objects can be filtered through the validSource property as mentioned in <xref linkend="advance-syncing-example-id"/>.</para>
   <para>OpenIDM then iterates over that list, checks each entry against the "validSource" filter, and classifies each entry into one of the situations listed above (<xref linkend="sync-situations"/>). To do so, it also needs the list of existing links for this mapping. Finally, it executes the action configured for the situation.</para>
  </section>
  <section><title>Target Reconciliation</title>
   <para>During the source reconciliation phase, OpenIDM cannot detect situations where there is no source object, such as UNASSIGNED. These situations are detected in the second phase, when OpenIDM iterates over all target objects that have not yet been handled during the first phase, that is, target objects that have no representation on the source.</para>
   <para>OpenIDM iterates over that list, checks each object against the "validTarget" filter, determines the appropriate situation, and executes the configured action.</para>
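   <para>The bookkeeping behind the two phases can be sketched roughly as follows. This is an illustrative model only, not OpenIDM's implementation. The function name <literal>twoPhaseSketch</literal> and the shape of its arguments are assumptions made for this sketch, and correlation queries, the validSource and validTarget filters, and the situation policies are deliberately left out.</para>
   <example><title>Sketch of the Two Reconciliation Phases</title>
    <programlisting language="javascript">
// Illustrative model of the two phases; not OpenIDM's actual implementation.
// sourceIds, targetIds: arrays of object identifiers (hypothetical input shape).
// links: map from a source identifier to the identifier of its linked target.
function twoPhaseSketch(sourceIds, targetIds, links) {
    var handledTargets = {};
    var sourcePhase = [];

    // Phase 1 (source reconciliation): walk the source list and follow links.
    sourceIds.forEach(function (sourceId) {
        var linked = links.hasOwnProperty(sourceId) ? links[sourceId] : null;
        if (linked !== null) {
            handledTargets[linked] = true;
        }
        sourcePhase.push({ source: sourceId, linkedTarget: linked });
    });

    // Phase 2 (target reconciliation): only targets not handled in phase 1.
    var targetPhase = targetIds.filter(function (targetId) {
        return !handledTargets.hasOwnProperty(targetId);
    });

    return { sourcePhase: sourcePhase, targetPhase: targetPhase };
}

var result = twoPhaseSketch(["alice", "bob"], ["t1", "t2"], { "alice": "t1" });
// result.targetPhase contains only "t2", the target with no source representation.
    </programlisting>
   </example>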
  </section>
 </section>

 <section><title>Correlation Queries</title>
  <para>If a user or other object is created through reconciliation or LiveSync, OpenIDM's sync engine also creates a link between the source and target objects. The link is then used in subsequent reconciliation or LiveSync runs to find the appropriate target object.</para>
  <para>Especially during the early phases of an identity management project, many users might already have been created from resource A when the next reconciliation with resource B is started. In this case, the sync engine needs to find the existing user objects and link them, instead of creating new objects. Objects are correlated through a correlation query, which is the means of finding the corresponding object on the target system.</para>

  <para>In OpenIDM, the query is executed against the target. The syntax of the
  query therefore depends on which system is the target. The syntax is specific
  to the underlying data store or to OpenICF. All native query facilities are
  available.</para>

  <section><title>Correlation Query With the Repository as the Target</title>
   <para>As a prerequisite to finding an object in the repository, a query must be defined in the file <link>openidm/conf/repo.jdbc.json</link> or <link>openidm/conf/repo.orientdb.json</link>, depending on which type of repository is used. An example of a generic query for an OrientDB repository is shown in <xref linkend="orient-query-example-id"/>, and one for a JDBC-based repository in <xref linkend="jdbc-query-example-id"/>.</para>
   <para>An example correlation rule that calls the query is shown in <xref linkend="correlation-toRepo-id"/>. The value of the _query-id property is the name of the query in the repository configuration, here "for-userName". The userName property takes the value of the variable "source.name", which is substituted into the query at '${userName}'.</para>
   <example xml:id="correlation-toRepo-id"><title>Correlation Query for Repository</title>
    <programlisting language="javascript">
        "correlationQuery" : {
            "type" : "text/javascript",
            "source" : "var query = {'_query-id' : 'for-userName', 'userName' : source.name}; query;"
        },
    </programlisting>
   </example>
   <para>In the following example, ${_resource} is replaced by the name of the table that contains the users, if the correlation is for objects of type user. The query is a simple query against a database table, and can return zero, one, or more than one object. The situation of the user, such as FOUND or AMBIGUOUS, is set according to the number of objects returned.</para>
   <example xml:id="orient-query-example-id"><title>OrientDB Query for Correlation</title>
    <programlisting language="javascript">
"for-userName" : "SELECT * FROM ${_resource} WHERE userName = '${userName}'"
    </programlisting>
   </example>
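   <para>The relation between the number of objects returned by the correlation query and the resulting situation can be sketched as follows, assuming that no link exists yet for the source object. The function name is hypothetical, and treating zero results as ABSENT is an assumption based on the policy examples in this chapter, not something the query itself defines.</para>
   <example><title>Sketch: Situation by Number of Correlation Results</title>
    <programlisting language="javascript">
// Illustrative only: maps the number of correlation query results to a situation.
// Assumes the source object has no existing link.
// ABSENT for zero results is an assumption drawn from the policy examples.
function situationForResultCount(count) {
    if (count === 0) {
        return "ABSENT";
    }
    if (count === 1) {
        return "FOUND";
    }
    return "AMBIGUOUS";
}
    </programlisting>
   </example>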
   <para>For a JDBC repository, the query is slightly more complex because of the indexing structure of the tables. Three tables need to be joined to formulate the optimum query. The system sets the following parameters:</para>
   <itemizedlist>
    <listitem><para>${_dbSchema}: the name of the schema, usually openidm</para></listitem>
    <listitem><para>${_mainTable}: the name of the main table for these objects; for users, this is managedobjects</para></listitem>
    <listitem><para>${_propTable}: the name of the associated property table, usually managedobjectproperties</para></listitem>
    <listitem><para>The third table to be joined is the objecttypes table.</para></listitem>
   </itemizedlist>
   <example xml:id="jdbc-query-example-id"><title>JDBC Query for Correlation</title>
    <programlisting>
"for-userName" : "SELECT fullobject FROM ${_dbSchema}.${_mainTable} obj INNER JOIN ${_dbSchema}.${_propTable} prop ON obj.id = prop.${_mainTable}_id INNER JOIN ${_dbSchema}.objecttypes objtype ON objtype.id = obj.objecttypes_id WHERE prop.propkey='/userName' AND prop.propvalue = ${userName} AND objtype.objecttype = ${_resource}"
    </programlisting>
   </example>
  </section>

  <section xml:id="correlation-query-system-object">
   <title>Correlation Query With a System Object as the Target</title>
   <para>When a system object is the target, the connector executes the query to find target objects.</para>
   <para>The JavaScript must return a map containing a generic query with the following elements:</para>
   <itemizedlist>
    <listitem><para>A condition, such as "Equals"</para></listitem>
    <listitem><para>The name of the attribute to compare on the system object, set in the property "field". In the example below it is "uid".</para></listitem>
    <listitem><para>The value from the source to use in the search filter, set in the property "values", which takes an array. In the example it is [source.userName].</para></listitem>
   </itemizedlist>
   <example xml:base="back-correlation-example-id"><title>Correlation Query for System Objects</title>
    <programlisting language="javascript">
var map = {"query": { "Equals": {"field" : "uid", "values" : [source.userName ]}}};
map;
    </programlisting>
   </example>
  </section>
 </section>

 <section xml:id="advance-dataflow-id"><title>Advanced Data Flow Configuration</title>
  <para>correlation, onUnlink, onDelete, onStore, onRetrieve, onValidate</para>
  <para>As mentioned in <xref linkend="basic-flow-sec-id"/>, extra scripts can be plugged in for two standard actions: onCreate and onUpdate. In some cases, special scripts might need to be executed on other actions as well. For instance, if the user object on an external resource was deleted, the appropriate action in OpenIDM is not necessarily a delete as well. Instead, the user might need to be deactivated. In this case, the action would be set to "UNLINK", and an "onUnlink" script might be needed to deactivate the user in OpenIDM or even on another external resource. See for instance <xref linkend="advance-syncing-example-id"/>.</para>
  <para>Similarly, it might be desirable to delete the object in OpenIDM and still execute an extra script. This is done with the "onDelete" property.</para>
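  <para>A minimal sketch of such a property, modeled on the "onUnlink" entry in <xref linkend="advance-syncing-example-id"/>, might look like the following fragment of a mapping. The script file name is hypothetical and stands for whatever script should run when the object is deleted.</para>
  <example><title>Sketch of an onDelete Property</title>
   <programlisting language="javascript">
        "onDelete" : {
            "type" : "text/javascript",
            "file" : "jscript/hypotheticalCleanup.js"
        },
   </programlisting>
  </example>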

  <example xml:id="advance-syncing-example-id">
   <title>An advanced sync.json configuration</title>
   <programlisting linenumbering="numbered" language="javascript">
{
    "mappings" : [ {
        "name" : "systemLdapAccount_managedUser",
        "source" : "system/ldap/account",
        "target" : "managed/user",
        "validSource" : {
            "type" : "text/javascript",
            "file" : "jscript/isValid.js"
        },
        "correlationQuery" : {
            "type" : "text/javascript",
            "file" : "jscript/ldapCorrelationQuery.js"
        },
        "properties" : [ {
            "source" : "uid",
            "transform" : {
                "type" : "text/javascript",
                "source" : "source.toLowerCase()"
            },
            "target" : "userName"
        }, {
            "transform" : {
                "type" : "text/javascript",
                "source" : "if (source.myGivenName) {source.myGivenName;}
                    else {source.givenName;}"
            },
            "target" : "givenName"
        }, {
            "source" : "",
            "transform" : {
                "type" : "text/javascript",
                "source" : "if (source.mySn) {source.mySn;} else {source.sn;}"
            },
            "target" : "familyName"
        }, {
            "source" : "cn",
            "target" : "fullname"
        }, {
            "comment" : "Multi-valued in LDAP, single-valued in AD.
                Retrieve first non-empty value.",
            "source" : "title",
            "transform" : {
                "type" : "text/javascript",
                "file" : "jscript/getFirstNonEmpty.js"
            },
            "target" : "title"
        }, {
            "condition" : {
                "type" : "text/javascript",
                "source" : "var clearObj = openidm.decrypt(object);
                    ((clearObj.password != null) &amp;&amp;
                        (clearObj.ldapPassword != clearObj.password))"
            },
            "transform" : {
                "type" : "text/javascript",
                "source" : "source.password"
            },
            "target" : "__PASSWORD__"
        }],
        "onCreate" : {
            "type" : "text/javascript",
            "source" : "target.ldapPassword = null; target.adPassword = null;
                target.password = null; target.ldapStatus = 'New Account'"
        },
        "onUpdate" : {
            "type" : "text/javascript",
            "source" : "target.ldapStatus = 'OLD'"
        },
        "onUnlink" : {
            "type" : "text/javascript",
            "file" : "jscript/triggerAdDisable.js"
        },
        "policies" : [ {
            "situation" : "CONFIRMED",
            "action" : "UPDATE"
        }, {
            "situation" : "FOUND",
            "action" : "UPDATE"
        }, {
            "situation" : "ABSENT",
            "action" : "CREATE"
        }, {
            "situation" : "AMBIGUOUS",
            "action" : "EXCEPTION"
        }, {
            "situation" : "MISSING",
            "action" : "EXCEPTION"
        }, {
            "situation" : "UNQUALIFIED",
            "action" : "UNLINK"
        }, {
            "situation" : "UNASSIGNED",
            "action" : "EXCEPTION"
        } ]
    } ]
}
   </programlisting>
  </example>
  <section><title>Include data from a third resource</title>
  <para>At any time during reconciliation, it is possible to get information from any connected system. The function to call is:</para>
  <para><literal>openidm.read(id)</literal></para>
   <para>TODO</para>
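   <para>As a rough sketch of how such a call might be used, the following helper looks up an additional attribute on a third resource. The helper name, the resource name <literal>system/hr/account</literal>, the <literal>department</literal> attribute, and the assumption that a failed lookup returns null are all hypothetical. The <literal>fakeIdm</literal> object only stands in for the <literal>openidm</literal> object so that the sketch can run outside OpenIDM.</para>
   <example><title>Sketch: Reading From a Third Resource</title>
    <programlisting language="javascript">
// Hypothetical helper: reads an extra attribute from a third resource.
// idm is expected to offer a read(id) function, as the openidm object does.
function lookupDepartment(idm, source) {
    // "system/hr/account" is an assumed resource name, for illustration only.
    var hrRecord = idm.read("system/hr/account/" + source.uid);
    // Assumes read() returns null when nothing is found.
    return hrRecord ? hrRecord.department : null;
}

// Stand-in for the openidm object, used only to run this sketch outside OpenIDM.
var fakeIdm = {
    read: function (id) {
        var store = { "system/hr/account/jdoe": { department: "Sales" } };
        return store.hasOwnProperty(id) ? store[id] : null;
    }
};

var department = lookupDepartment(fakeIdm, { uid: "jdoe" });
// department is "Sales"
    </programlisting>
   </example>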
  </section>
 </section>

</chapter>