---
canonical: https://safekit.evidian.com/wp-content/uploads/downloads_safekit/version-82/safekituserguidehtml/documentation/safekituserguideen.htm
---

## 13.7 File replication - ,

**For mirror modules only**

File replication (RFS) ensures high availability, real-time synchronization, and fault tolerance for critical data. Configuring RFS involves the following constraints:

- In Linux, you must set the same uid/gid values on the two nodes so that file permissions are replicated. When replicating a filesystem mount point, you must apply the special procedure described in section 13.7.4.2.
- In Windows, it is strongly recommended to enable the USN journal on the drive that contains the replicated directory, as described in section 13.7.4.3.
- In Windows, moving a file to the Recycle Bin using the Delete key in File Explorer is not supported. Only permanent deletion using Shift + Delete is supported.
- Replicated directories are writable only on the primary node.
- The replicated directory tree can contain paths with spaces only on Windows.
- Hard links and file system transactions in Windows are not supported.

| | |
| --- | --- |
| Important | If you install and run several application modules on the same server, the replicated directories must be different for each application module. |

### 13.7.1 example

- Example in Windows:
- Example in Linux:

| | |
| --- | --- |
| See also | Section 15.1 gives a full example. For the configuration of a dedicated replication network, refer to section 15.1.2.2, which presents the configuration via the web console along with the corresponding userconfig.xml. |

### 13.7.2 syntax

…

| | |
| --- | --- |
| Note | Only the async, nbrei, reitimeout and reidetail attributes of the tag can be changed with a dynamic configuration. The tag describing the replication flow can also be changed dynamically. |
### 13.7.3 , attributes

| Attribute | Description |
| --- | --- |
| | Timeout in seconds for sending TCP packets to the remote node. If a packet cannot be sent within this timeout, the PRIM server becomes ALONE. Increase this value on slow networks. Default value: 30s (30 seconds). Note: time unit supported since SafeKit 8.2.5 (see section 13.1). Note: in SafeKit 7.4.0.5, the default value was 120 seconds. |
| [nbrei="6"] | Number of reintegration threads running in parallel for resynchronizing files. Default value: 6. Note: this attribute’s value can be changed with a dynamic configuration. Note: the default value was 3 before SafeKit 8.2.5.3. |
| [namespacepolicy="0"\|"1"\|"3"\|"4"] | · namespacepolicy="0": deactivates zone reintegration on Windows or Linux. · namespacepolicy="1": in Windows, zone reintegration after reboot, when the module has been properly stopped, is not active. · namespacepolicy="3": in Windows, allows zone reintegration after reboot when possible. It activates the USN change journal on the volume containing the replicated directories (see the fsutil usn command for creating a USN change journal on a volume). Even with this configuration, full reintegration is used instead of zone reintegration when the USN change journal associated with the volume has been deleted/recreated for administration reasons, or when a discontinuity in the USN journal is detected. · namespacepolicy="4": when zone synchronization is not possible (on the first reintegration or when zones are not available), the files that need to be synchronized are fully copied. If this reintegration does not complete, the next one copies these files again. To avoid this, set namespacepolicy="4". This option also enables USN journal checking in Windows. Default value: 4 since SafeKit > 7.4.0.5 (not supported in previous releases). |
| [reitimeout= "150s"] | Timeout in seconds for reintegration requests. The timeout can be increased to avoid reintegration failures when the primary server is heavily loaded. Default value: 150s (150 seconds). Note: time unit supported since SafeKit 8.2.5 (see section 13.1). Note: this attribute’s value can be changed with a dynamic configuration. |
| [reicommit="0"] | **Linux only** Set reicommit="nb blocks" to commit every (nb blocks) \* reipacketsize when reintegrating one file (in addition to the commit at the end of the copy). This can help the reintegration of big files to succeed, but it slows reintegration down. Default value: 0, which means no intermediate commit. |
| [reidetail= "on"\|"off"] | Detailed logging for reintegration. Default value: off. Note: this attribute’s value can be changed with a dynamic configuration. |
| [allocthreshold= "0"] | **Windows only** Size in GB from which the allocation policy is applied before reintegration. When allocthreshold > 0, fast allocation of disk space is enabled for files to be synchronized on the secondary node. This feature avoids a timeout when the primary writes at the end of a file that is large (> 200 GB) and not yet completely copied. Since SafeKit 7.4.0.64, the allocation policy has changed and is applied for: · newly created files (files that did not exist on the secondary when the reintegration starts); · files whose size on the primary is >= allocthreshold (size in GB); · full synchronization on the first reintegration, on start with full synchronization (safekit second\|prim fullsync), or when synchronization by zones is disabled (namespacepolicy="0"). Default value: 0 (which disables the feature). |
| [nbremconn="1"] | Number of TCP connections between the primary and the secondary nodes. This value may be increased to improve the replication and synchronization throughput when the network has high latency (in the cloud for instance). Default value: 1. |
| [checktime= "220000ms"] | **Linux only** Timeout in milliseconds for the null request that checks the local replicated file system. Run the safekit stopstart command when the timeout is reached. Default value: 220000ms (220000 milliseconds). Note: time unit supported since SafeKit 8.2.5 (see section 13.1). |
| [checkintv= "120s"] | **Linux only** Interval in seconds between two null requests. Default value: 120s (120 seconds). Note: time unit supported since SafeKit 8.2.5 (see section 13.1). |
| [nfsbox\_options= "cross"\|"nocross"] | **Windows only** Specifies the policy to apply when a reparse point of type MOUNT\_POINT is present in the replicated directory tree. This policy applies to all replicated directories. MOUNT\_POINT reparse points in NTFS can represent two types of objects: an NTFS mount point (for example the D:\ directory) or an NTFS "directory junction" (a form of "symbolic link" to another part of the file system namespace). · nfsbox\_options="cross": the MOUNT\_POINT reparse point object itself is not replicated/reintegrated. It is evaluated, and reintegration/replication processes the target content as it would the content of a standard directory. This is useful for instance when a replicated directory is a mount point (e.g., replicating a "drive letter" root). This is the default configuration value. · nfsbox\_options="nocross": the MOUNT\_POINT reparse point object itself is replicated/reintegrated but not evaluated. Reintegration does not descend into the target of the reparse point. This is useful for instance when a replicated directory tree contains NTFS "junctions" that point to another part of the replicated tree (e.g., when replicating a PostgreSQL database, as PostgreSQL is known to need such objects). Default value: cross. |
| [scripts= "on" \| "off"] | scripts="on" activates the \_rfs\_\* script callbacks used to implement specific data replication management. Default value: off. |
| [reiallowedbw="20000"] | When defined, this attribute specifies the maximum bandwidth that the reintegration phase may use, in kilobytes per second (for instance 20000 KB/s). Due to implementation trade-offs, a +/-10% fluctuation of the effectively used bandwidth is to be expected. Note: the replication bandwidth is not affected by this parameter. By default, the attribute is not defined and the bandwidth used by the reintegration is not limited. |
| [syncdelta="0m"] | · syncdelta <= 1: the attribute is ignored and the default failover and start policy is applied: only an up-to-date server can start as primary or run a failover. · syncdelta > 1: changes the default failover and start policy. The not up-to-date server can become primary, but only if the elapsed time, in minutes, since the last synchronization is lower than the syncdelta value (see section 13.7.4.4). Default value: 0m (0 minute). Note: time unit supported since SafeKit 8.2.5 (see section 13.1). |
| [syncat="*synchronization scheduling*"] | Default: real-time replication and automatic synchronization (no scheduling). Use syncat to schedule the synchronization of replicated directories on the secondary node (see section 13.7.4.10). The module must be started for this feature to be enabled. Once synchronized, the module blocks in the WAIT (NotReady) state until the next synchronization. The scheduling is based on the native job scheduler: · on Unix, the job is defined in the safekit user’s crontab; · on Windows, the job is defined as a system task. You must configure syncat with the syntax of the native job scheduler. For instance, for synchronizing daily, after midnight: · in Windows, syncat="/SC DAILY /ST 00:01:00"; · in Unix, syncat="01 0 \* \* \*". Note: see the crontab documentation in Unix and the schtasks.exe documentation in Windows for the full syntax of the scheduled date and time. Important: since the SafeKit configuration is just a front end to the job scheduler, when scheduling is not working, first check for syntax errors. |
| [ [ ] ] | **Legacy** configuration preserved for backwards compatibility. When this section is not defined, the replication flow uses the same network as the heartbeat with ident="flow" if there is one; if not, it uses the first heartbeat (see section 13.4). If you define this section, keep it consistent with the heartbeat with ident="flow", if there is one, because the default failover rules apply to this heartbeat. Note: this tag subtree can be changed with a dynamic configuration, for instance to set a new replication flow. The name attribute defines the network used for the replication flow. It must be present in the global cluster configuration (see section 12). The tag is a legacy syntax used in previous SafeKit versions (before 7.2). It is supported for compatibility reasons but must not be used for new modules. Important: in the same userconfig.xml, you must not use both the syntax for SafeKit 7.1 and the one for SafeKit 7.2. |
| | Relative path of a file or sub-directory in a replicated directory. The file (or sub-directory) is not replicated. Set as many lines as there are non-replicated files or sub-directories. Important: spaces in file paths are supported only on Windows. |
| | Regular expression on the name of entries under the replicated directory: · **Replicate all except** entries matching the regular expression. For example, to avoid replicating entries with the extension .tmp or .bak in the /safedir directory or its sub-directories: Note that /safedir/conf/config.tmp.swap is replicated. · **Replicate only** those entries in the directory that match the regular expression after the **!** For example, to replicate only entries with the extension .mdf or .ldf in the /safedir directory or its sub-directories: Important: renaming between non-replicated and replicated files is not supported. The regex engine is POSIX Extended regex (see the POSIX documentation): · in Windows, case-insensitive mode; · in Linux, case-sensitive mode. Important: as regular expressions are defined inside the XML file userconfig.xml, special characters interpreted by XML, like '<' or '>', cannot be used in regular expressions. |
| | Relative path of a file or sub-directory in a replicated directory. Checks the presence of the file or sub-directory before starting the replication mechanism. This avoids errors such as starting replication on an empty file system. Set as many lines as there are files or sub-directories to check. |

### 13.7.4 description

#### 13.7.4.1 prerequisites

See the file replication prerequisites described in section 2.2.4.

#### 13.7.4.2 Linux

On Linux, data interception is based on a local NFS mount, and the replication flow between servers uses the NFS v3 / TCP protocol.

The NFS mount of replicated directories from remote Unix clients is not supported. The NFS mount of other directories can be made with standard commands.
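Replicated directories that are themselves mount points need the special procedure of this section, so it can help to detect them before configuring a module. The following is a minimal sketch, not a SafeKit tool: the function name `find_mount_points` and its `is_mount` parameter are hypothetical, and the directory names are simply those of the PostgreSQL example used in this section. It relies only on Python's standard `os.path.ismount`.

```python
import os
from typing import Callable, Iterable, List


def find_mount_points(
    directories: Iterable[str],
    is_mount: Callable[[str], bool] = os.path.ismount,
) -> List[str]:
    """Return the directories that are themselves file system mount points.

    `is_mount` is injectable so the logic can be exercised without real mounts.
    """
    return [d for d in directories if is_mount(d)]


if __name__ == "__main__":
    # Directory names taken from the PostgreSQL example of this section.
    replicated = ["/var/lib/pgsql/var", "/var/lib/pgsql/data"]
    for path in find_mount_points(replicated):
        print(f"{path} is a mount point: apply the mount point procedure")
```

Run on each node, this lists the replicated directories for which the mount point procedure would apply; directories that are plain sub-directories are not reported.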
**Procedure for replicating a mount point**

When replicating a mount point in Linux, the module configuration fails with the error:

    Error: Device or resource busy

In the following, we take the example of a PostgreSQL module whose replicated directories are /var/lib/pgsql/var and /var/lib/pgsql/data. The userconfig.xml of the module contains:

These directories are mount points, as shown by the output of the command df -H. It returns for instance:

    /dev/mapper/vg01-lv_pgs_var … /var/lib/pgsql/var
    /dev/mapper/vg02-lv_pgs_data … /var/lib/pgsql/data

You must apply the following procedure to configure the module to replicate these directories.

1. Unmount the file systems by running the commands:

       umount /var/lib/pgsql/var
       umount /var/lib/pgsql/data

2. Configure the module by running the command:

       /opt/safekit/safekit config -m postgresql

   The configuration should succeed (no errors).

3. Check the symbolic links that were created by running the command ls -l /var/lib/pgsql. It returns:

       lrwxrwxrwx 1 root var -> var_For_SafeKit_Replication
       lrwxrwxrwx 1 root data -> data_For_SafeKit_Replication

4. Edit /etc/fstab and change the two lines:

       /dev/mapper/vg01-lv_pgs_var /var/lib/pgsql/var ext4…
       /dev/mapper/vg02-lv_pgs_data /var/lib/pgsql/data ext4…

   to:

       /dev/mapper/vg01-lv_pgs_var /var/lib/pgsql/var_For_SafeKit_Replication ext4…
       /dev/mapper/vg02-lv_pgs_data /var/lib/pgsql/data_For_SafeKit_Replication ext4…

5. Mount the file systems by running the commands:

       mount /var/lib/pgsql/var_For_SafeKit_Replication
       mount /var/lib/pgsql/data_For_SafeKit_Replication

| | |
| --- | --- |
| Important | · Apply this procedure on both nodes if the replicated directories are mount points on both nodes. Once applied, you can use the module as usual (safekit start, stop, etc.). · The procedure is the same for all mount points that must be replicated. |

| | |
| --- | --- |
| Note | To protect against starting the module on a non-mounted and empty directory, you can insert in userconfig.xml the check of a file inside the replicated directory. Example for /var/lib/pgsql/var (do the same for /var/lib/pgsql/data with a file inside this directory that is always present): |

If you want to unconfigure the module (or uninstall the whole SafeKit package), you must reverse this procedure:

1. Unmount the file systems with:

       umount /var/lib/pgsql/var_For_SafeKit_Replication
       umount /var/lib/pgsql/data_For_SafeKit_Replication

2. Deconfigure the module with:

       /opt/safekit/safekit deconfig -m postgresql

3. Edit /etc/fstab to undo the previous changes.

4. Mount the file systems with:

       mount /var/lib/pgsql/var
       mount /var/lib/pgsql/data

#### 13.7.4.3 Windows

On Windows, data interception is based on a file system filter, and the replication flow between servers uses the NFS v3 / TCP protocol. The rfs filter may not work correctly with some antivirus software.

On Windows, you can remotely mount a replicated directory from a workstation. If you want to mount with the virtual name instead of the numeric virtual IP address, you must set the two following registry keys on the server side:

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Lsa]
    "DisableLoopbackCheck"=dword:00000001

    [HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\lanmanserver\parameters]
    "DisableStrictNameChecking"=dword:00000001

In Windows, to enable zone reintegration after a server reboot when the module has been successfully stopped, the component uses the NTFS USN log to verify that the information recorded on the zones is still valid after the reboot. When the check succeeds, zone reintegration can be applied to the file; otherwise, the file must be fully copied.
By default, only the system drive has an active USN log. If the replicated directories are located on a drive other than the system drive, you must create the log (with the fsutil usn command).

| | |
| --- | --- |
| See also | SK-0066 provides an example. |

#### 13.7.4.4 replication and failover

With its file-replication function, the mirror architecture is particularly suitable for providing high availability for back-end applications whose critical data must be protected against failure. The reason is that the secondary server data is strongly synchronized with the primary server data.

A synchronized server is considered up-to-date, and only an up-to-date server can start as primary or run a failover.

If application availability is more critical than application data, this default policy can be relaxed by allowing a server to become primary if the time elapsed since the last synchronization is below a configurable delay. This is configured by setting the syncdelta attribute of the tag:

- syncdelta <= 1: the attribute is ignored and the default failover and start policy is applied. The default value is 0.
- syncdelta > 1: when the last up-to-date server is not responding, the not up-to-date server can become primary, but only if the elapsed time since the last synchronization is lower than the syncdelta value (in minutes).

This feature is implemented with:

- rfs.synced resource: when syncdelta is > 1, the rfs.synced resource is managed. This resource is UP if the replicated data is consistent and if the elapsed time, in minutes, since the last synchronization is lower than the syncdelta value.
- syncedcheck checker: when syncdelta is > 1, this checker is running. It sets the value of the rfs.synced resource.
- rfs\_forceuptodate failover rule: when syncdelta is > 1, the following failover rule is valid:

      rfs_forceuptodate: if (heartbeat.* == down && cluster() == down && rfs.synced == up && rfs.uptodate == down) then rfs.uptodate=up;

  This rule makes the server start as primary when the up-to-date server is not responding, the server is isolated, and it can be considered synchronized according to the syncdelta value.

#### 13.7.4.5 replication verification

For a module named *AM*, you can check that files are identical on the primary and the secondary by running the following command on the SECOND server: safekit rfsverify -m *AM*. Run safekit rfsverify -m *AM* > log to redirect the command output into the file named log.

The output of the command is a log similar to the reintegration log, in which the files that would be copied (and are therefore different) are listed.

When there is activity on the replicated directories on the primary, an anomaly may be reported although the files do not differ, in the following cases:

- on Windows, because modifications are made on disk before being replicated
- with async="second" (default), because reads can bypass the asynchronous writes.

To check whether there really is an inconsistency, re-run the command on the secondary server after making sure that there is no more activity on the primary.

On Windows, some files are systematically reported as erroneous by the verifier although there is no difference. This occurs when files are modified with SetFileValidData: files are extended without resetting the new extension, and reads return random data from the disk.

| | |
| --- | --- |
| Note | It is strongly recommended to run this command only when there are no accesses to the replicated directories on the primary. |

#### 13.7.4.6 file changes since the last synchronization

Before starting a secondary server, it may be useful to evaluate the number of files and the amount of data that have changed on the primary server since the secondary server stopped.

To do so, run the following command on the ALONE server: safekit rfsdiff -m *AM*. Run safekit rfsdiff -m *AM* > log to redirect the command output into the file named log.

This command runs on-line checks of the content of the regular files of the module *AM*. It scans the entire replicated tree and displays the number of files that have been modified as well as the size that needs to be copied. It also displays an estimate of the synchronization duration. This is only an estimate, since only regular files are scanned and other modifications may occur before the synchronization is run by the secondary server.

This command must be used with caution on a production server, since it adds overhead on the server (for reading trees and files with locking). On Windows, renaming files can fail during the evaluation.

| | |
| --- | --- |
| Note | It is strongly recommended to run this command only when there are no accesses to the replicated directories. |

#### 13.7.4.7 replication and reintegration bandwidth

The replication component monitors, on the PRIM server, the bandwidth used by replication and reintegration write requests. Two resources (rfs.rep\_bandwidth and rfs.rei\_bandwidth) reflect the average bandwidth used by replication and reintegration respectively during the last 3 seconds, expressed in kilobytes per second (KB/s).

If the replication load is IO intensive, the reintegration phase may saturate the network link and significantly slow down the application. In such a case, the reiallowedbw attribute may be used to limit the bandwidth taken by the reintegration phase (see section 13.7.3). Please note that limiting the reintegration bandwidth will make the reintegration phase longer.
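To give a feel for how much longer a limited reintegration can take, here is a rough arithmetic sketch. It assumes that the reiallowedbw limit is the only bottleneck and applies the +/-10% fluctuation mentioned for this attribute in section 13.7.3; the function name and the sample figures are hypothetical, and concurrent replication traffic or disk speed are ignored.

```python
from typing import Tuple


def reintegration_duration_bounds(
    size_kb: int, reiallowedbw_kbps: int
) -> Tuple[float, float, float]:
    """Return (fastest, nominal, slowest) durations in seconds.

    size_kb: amount of data to reintegrate, in KB.
    reiallowedbw_kbps: value of the reiallowedbw attribute, in KB/s.
    The bounds apply a +10% / -10% fluctuation to the configured limit.
    """
    if size_kb < 0 or reiallowedbw_kbps <= 0:
        raise ValueError("size must be >= 0 and bandwidth must be > 0")
    nominal = size_kb / reiallowedbw_kbps
    fastest = size_kb / (reiallowedbw_kbps * 11 / 10)  # limit overshoots by 10%
    slowest = size_kb / (reiallowedbw_kbps * 9 / 10)   # limit undershoots by 10%
    return fastest, nominal, slowest


if __name__ == "__main__":
    # Hypothetical figures: 36,000,000 KB to copy with reiallowedbw="20000".
    fastest, nominal, slowest = reintegration_duration_bounds(36_000_000, 20000)
    print(f"between {fastest:.0f}s and {slowest:.0f}s (nominal {nominal:.0f}s)")
```

With these sample figures the nominal duration is 1800 seconds (30 minutes), which can be weighed against the application slowdown that an unlimited reintegration would cause.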
There are also two resources that reflect the network bandwidth (in KB/s) used between the nfsbox processes that run on each node to implement replication and reintegration:

- rfs.netout\_bandwidth is the network output bandwidth
- rfs.netin\_bandwidth is the network input bandwidth

You can observe the value of rfs.netout\_bandwidth on the primary, or rfs.netin\_bandwidth on the secondary, to know the modification rate at the time of observation (write, create, delete, …). The history of the resource values gives an overview of its evolution over time.

The value of the bandwidth depends on the application, system, and network activity. Its measurement is available for information purposes only.

#### 13.7.4.8 synchronization by date

Since SafeKit 7.2, the command safekit secondforce -d date -m *AM* forces the module *AM* to start as secondary after copying only the files modified after the specified date.

| | |
| --- | --- |
| Important | This command must be used with caution, since the synchronization will not copy files modified before the specified date. It is the administrator’s responsibility to ensure that these files are consistent and up to date. |

The date is in the format YYYY-MM-DD[Z], "YYYY-MM-DD hh:mm:ss[Z]" or YYYY-MM-DDThh:mm:ss[Z], where:

- YYYY-MM-DD indicates the year, month, and day
- hh:mm:ss indicates the hours, minutes, and seconds
- Z indicates that the time is in the UTC time zone; when it is not set, the time is in the local time zone

For instance:

- safekit secondforce -d 2016-03-01 -m *AM* copies only files modified after the 1st of March 2016
- safekit secondforce -d "2016-03-01 12:00:00" -m *AM* copies only files modified after the 1st of March 2016 at 12:00, local time zone
- safekit secondforce -d 2016-03-01T12:00:00Z -m *AM* copies only files modified after the 1st of March 2016 at 12:00, UTC time zone

This command may be useful in the following case:

- the module is stopped on the primary server and a backup of the replicated data is made (on a removable drive for instance)
- the module is stopped on the secondary server and the replicated data is restored from the backup. It may be the first start-up or the repair of the secondary server.
- the module is started on the primary server, which becomes ALONE
- the module is started on the secondary with the command safekit secondforce -d date -m *AM*, where the date is the backup date

In this case, only the files modified since the backup date are copied (each in full), instead of a full copy of all files.

| | |
| --- | --- |
| Important | In Windows, the file modification date on the secondary server is changed when the file is copied by the synchronization process. Therefore, safekit secondforce -d date -m *AM*, where date is prior to the last reintegration on this server, is of no use. |

#### 13.7.4.9 external synchronization

On the first synchronization, all replicated files are fully copied from the primary node to the secondary node.
During subsequent synchronizations, which are needed when the secondary node comes back, only the zones of the files modified on the primary node during the secondary node's downtime are copied.

When the replicated directories are voluminous, the first synchronization can take a long time, especially if the network is slow. For this reason, SafeKit > 7.3.0.11 provides a feature for synchronizing a large amount of data, to be used in conjunction with a backup tool. On the primary node, simply back up the replicated directories and set the synchronization policy to external mode. The backup is transported (using an external drive for instance) and restored on the secondary node, which is also configured to perform external synchronization. When the module starts on the secondary node, it copies only the file areas that were modified on the primary node since the backup.

External synchronization relies on the SafeKit command safekit rfssync, which must be applied on both nodes to set the synchronization policy to external. This command requires the following arguments:

- the role of the node (prim | second)
- a unique identifier (uid)

**External synchronization procedure**

The external synchronization procedure described below is the procedure to follow in the case of a cold backup of the replicated directories. In this case, the application must be stopped, and any modification of the replicated directories is prohibited until the module and the application are started in ALONE (Ready). The order of operations must be strictly adhered to.

![](safekituserguideen_fichiers/image378.jpg)

The external synchronization procedure described below is the procedure to follow in the case of a hot backup of the replicated directories. In this case, the module is ALONE (Ready); the application is started and changes to the contents of the replicated directories are allowed. The order of operations must be strictly adhered to.

![](safekituserguideen_fichiers/image380.jpg)

**safekit rfssync command**

| Command | Description |
| --- | --- |
| safekit rfssync external prim *uid* [-m *AM*] | Set the synchronization policy to external. It is identified by the value of *uid* (24 characters at most). The node is the primary one, the source for synchronizing data. |
| safekit rfssync external second *uid* [-m *AM*] | Set the synchronization policy to external. It is identified by the value of *uid* (24 characters at most). The node is the secondary one, the destination for synchronizing data. |
| safekit rfssync -d prim *uid* [-m *AM*] safekit rfssync -d second *uid* [-m *AM*] | Disable the detection of changes in the replicated directories between the cold backup/restore and the start of the module. Important: use this option with caution, since the external synchronization may not properly detect all the changes to be copied. |
| safekit rfssync full [-m *AM*] | Set the synchronization policy to full. This will copy all files in their entirety on the next synchronization. |
| safekit rfssync | Display the current synchronization policy. |

**Internals**

The synchronization policy is represented by the module’s resources usersetting.rfssyncmode, usersetting.rfssyncrole, usersetting.rfssyncuid and rfs.rfssync:

- usersetting.rfssyncmode="default" (usersetting.rfssyncrole="default", usersetting.rfssyncuid="default"): these values are associated with the standard synchronization policy, which is applied by default. It consists of copying only the modified areas of the files. When this policy cannot be applied, the modified files are copied in their entirety.
- usersetting.rfssyncmode="full" (usersetting.rfssyncrole="default", usersetting.rfssyncuid="default"): these values are associated with the full synchronization policy, which copies all files in their entirety on the next synchronization. It is applied:
  - the first time the module is started after its first configuration
  - on safekit commands (safekit second|prim fullsync ; safekit rfssync full ; safekit primforce ; safekit config ; safekit deconfig)
  - on a change of pairing for the module
- usersetting.rfssyncmode="external", usersetting.rfssyncrole="prim | second" and usersetting.rfssyncuid="uid": these values are associated with the external synchronization policy assigned with the commands safekit rfssync external prim uid and safekit rfssync external second uid. The next synchronization will apply the external synchronization policy.
- rfs.rfssync="up | down": this resource is up only when the synchronization policy defined by the previous resources can be applied.

When the synchronization policy is not the default policy, it automatically returns to the default mode after a successful synchronization.

To check the state of resources, see section 7.4.

In some cases, external synchronization cannot be applied, and the secondary node stops with an error specified in the module log. In this situation, you must do one of the following:

- complete the external synchronization procedure if it has not been carried out in its entirety on both nodes
- fully reapply the external synchronization procedure on both nodes
- revert to the full synchronization policy (safekit rfssync full command)
- apply the synchronization by date, using the date of the backup (see section 13.7.4.8). Unlike external synchronization, synchronization by date copies the files modified on the primary node in their entirety (instead of just the modified parts).
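The two `safekit rfssync external` command forms listed in this section can be assembled programmatically, for instance in a deployment script. The sketch below only builds the argument list and does not run anything; the helper name `rfssync_external_argv` is hypothetical, and it assumes that `safekit` is reachable under that name (on Linux the guide invokes it as /opt/safekit/safekit). The role values and the 24-character limit on the uid are those given in the command table.

```python
from typing import List, Optional

MAX_UID_LENGTH = 24  # the command table states "24 characters at most"


def rfssync_external_argv(
    role: str, uid: str, module: Optional[str] = None
) -> List[str]:
    """Build the argument list for `safekit rfssync external <role> <uid> [-m AM]`."""
    if role not in ("prim", "second"):
        raise ValueError("role must be 'prim' or 'second'")
    if not uid or len(uid) > MAX_UID_LENGTH:
        raise ValueError(f"uid must be 1 to {MAX_UID_LENGTH} characters long")
    argv = ["safekit", "rfssync", "external", role, uid]
    if module is not None:
        argv += ["-m", module]  # optional module name, as in [-m AM]
    return argv


if __name__ == "__main__":
    # Hypothetical uid and module name; the same uid is used on both nodes.
    print(" ".join(rfssync_external_argv("prim", "backup01", "AM")))
    print(" ".join(rfssync_external_argv("second", "backup01", "AM")))
```

Validating the role and uid before calling the command keeps a typo from being discovered only when the secondary node stops with an error.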
#### 13.7.4.10 scheduled synchronization

By default, SafeKit provides real-time file replication and automatic synchronization. On a heavily loaded server or a high-latency network, you may prefer to keep the secondary node only weakly synchronized. For this, you can use the syncat attribute to schedule the synchronization of replicated directories on the secondary node. The module must be started for this feature to be enabled. Once synchronized, the module blocks in the WAIT (NotReady) state until the next scheduled synchronization.

It is implemented with:

- the resource rfs.syncat, which is set to up on the scheduled dates and set to down after the data synchronization
- the failover rule rfs\_syncat\_wait, which blocks the module in the WAIT (NotReady) state until the rfs.syncat resource is up

If you want to force the synchronization manually, you can run the command safekit set -r rfs.syncat -v up -m *AM* while the module is in the WAIT (NotReady) state.

With syncat, you only have to configure the scheduled time for the synchronization, using the syntax of the native job scheduler: crontab in Linux and schtasks.exe in Windows (see section 13.7.3).
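As an illustration of the two scheduler syntaxes, the sketch below builds a syncat value for a daily synchronization at a given time. It is not part of SafeKit: the function name `daily_syncat` and the platform labels are hypothetical, and the output simply mirrors the format of the two daily examples given for syncat in section 13.7.3. For any other schedule, refer to the crontab and schtasks.exe documentation.

```python
def daily_syncat(hour: int, minute: int, platform: str) -> str:
    """Return a syncat value for a daily synchronization at hour:minute.

    platform: "windows" (schtasks.exe options) or "unix" (crontab fields).
    """
    if not (0 <= hour <= 23 and 0 <= minute <= 59):
        raise ValueError("hour must be 0-23 and minute 0-59")
    if platform == "windows":
        # Same shape as the guide's example: /SC DAILY /ST 00:01:00
        return f"/SC DAILY /ST {hour:02d}:{minute:02d}:00"
    if platform == "unix":
        # crontab fields: minute hour day-of-month month day-of-week
        return f"{minute:02d} {hour} * * *"
    raise ValueError("platform must be 'windows' or 'unix'")


if __name__ == "__main__":
    print(daily_syncat(0, 1, "windows"))
    print(daily_syncat(0, 1, "unix"))
```

The returned string is what would be placed in the syncat attribute; since the configuration is only a front end to the job scheduler, a value rejected by the scheduler will not be caught by this helper.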