---
canonical: https://safekit.evidian.com/wp-content/uploads/downloads_safekit/version-82/slides-en/7-farm-module-en.pptx
---

# PowerPoint Converted to Markdown

Source: https://safekit.evidian.com/wp-content/uploads/downloads_safekit/version-82/slides-en/7-farm-module-en.pptx


## Slide 1: SafeKit Farm



### Speaker notes

> These slides are timed and automatically move from one to the next after a delay. To remove this automation, go to 'Slide Show' and uncheck 'Use Timings'.
> The slides have a soundtrack represented by an audio icon on the right side of each slide. To remove the soundtrack, click on each audio icon and lower the volume to the minimum.
> I’m going to present in detail the farm module of SafeKit with its network load balancing and failover features, including how to configure the restart scripts, the internal parameters in userconfig.xml, and its operation with various state transitions according to failures.


## Slide 2

- Overview


### Speaker notes

> Let’s start with an overview.


## Slide 3: How does a farm module work?

- Network load balancing
- Automatic Failover
- Failback
- 2 nodes (or more)
- Network load balancing and failover


### Speaker notes

> Let's examine how a farm module works and the associated states and colors in the console.
> At step 1, the farm module is running in the UP-UP state. The application operates on both servers, and the virtual IP is set on the network interface of both nodes as an alias IP. A kernel module named vip handles the load balancing of client TCP sessions between both nodes.
> At step 2, the farm module has either been stopped on node 1 or node 1 has experienced a failure. As a result, all client TCP sessions are now managed by node 2.
> At step 3, the farm module is restarted on node 1, and the load balancing is restored across both nodes.


## Slide 4: Farm modules

- For a new application
- farm.safe for a new Windows application
- farm.safe for a new Linux application
- List of all SafeKit solutions
- With a free trial here
- Preconfigured for Linux
- apache_farm.safe for Apache
- Preconfigured for Windows
- iis_farm.safe for IIS
- apache_farm.safe for Apache
- Configuration files of a module named AM
- 1
- 2
- 3
- 4
- Load balancing and failover


### Speaker notes

> Let me present the various solutions that you can implement with a farm module. First, for a new application on Windows or Linux, you can use the `farm.safe` module. We offer a comprehensive list of SafeKit solutions, all available with a free trial and a quick installation guide.
> Second, our preconfigured solutions for Windows include `iis_farm.safe` for the Microsoft IIS web service, `apache_farm.safe` for the Apache web service, and more.
> Third, our preconfigured solutions for Linux include `apache_farm.safe` for the Apache web service, and more.
> Fourth, once deployed, the configuration files are available in the modules directory of SafeKit. If the module has been deployed with the name AM, you will find an AM subdirectory. Inside AM/bin, you will find the restart scripts, named `start_both` and `stop_both` for a farm module. Inside AM/conf, you will find the `userconfig.xml` file.


## Slide 5

- userconfig.xml


### Speaker notes

> Let's now examine in detail the userconfig.xml parameters that can be configured for a farm module.


## Slide 6: Overview of userconfig.xml

- `<!DOCTYPE safe>`
- `<safe>`
- `<service mode="farm">`
- `<farm>`
- `<lan name="default"/>`
- `</farm>`
- `<vip>`
- `<interface_list>`
- `<interface check="on">`
- `<virtual_interface>`
- `<virtual_addr addr="172.24.199.100" where="alias" check="on"/>`
- `</virtual_interface>`
- `</interface>`
- `</interface_list>`
- `<loadbalancing_list>`
- `<group name="FarmProto">`
- `<rule port="9010" proto="tcp" filter="on_port"/>`
- `</group>`
- `</loadbalancing_list>`
- `</vip>`
- `<user />`
- `</service>`
- `</safe>`
- heartbeat configuration for a farm
- virtual IP configuration
- load balancing rules
- module scripts activation
- Slides "Checkers":
- `<errd>`: process or service monitoring
- `<checker>`: checkers
- `<failover>`: failover rules


### Speaker notes

> Here is an overview of the userconfig.xml file for a farm module.
> First, you have the heartbeat section defining the networks through which heartbeats must pass.
> Then, you have the virtual IP configuration.
> Next, you see the definition of load balancing rules.
> Following that, you have the user tag, which enables the execution of restart scripts.
> The three other tags on the right of the slide concern checkers and are explained in the checkers slides.
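
The slide text above lost its indentation during conversion. As a minimal sketch, the same file can be reassembled and read with Python's standard `xml.etree.ElementTree`; the helper name `summarize` and the keys of the returned dictionary are illustrative choices, not part of SafeKit.

```python
import xml.etree.ElementTree as ET

# userconfig.xml reassembled from the flattened slide text.
USERCONFIG = """<!DOCTYPE safe>
<safe>
  <service mode="farm">
    <farm>
      <lan name="default"/>
    </farm>
    <vip>
      <interface_list>
        <interface check="on">
          <virtual_interface>
            <virtual_addr addr="172.24.199.100" where="alias" check="on"/>
          </virtual_interface>
        </interface>
      </interface_list>
      <loadbalancing_list>
        <group name="FarmProto">
          <rule port="9010" proto="tcp" filter="on_port"/>
        </group>
      </loadbalancing_list>
    </vip>
    <user/>
  </service>
</safe>
"""

def summarize(xml_text):
    """Collect the four parts of the file named in the speaker notes."""
    service = ET.fromstring(xml_text).find("service")
    return {
        "mode": service.get("mode"),
        # heartbeat configuration
        "heartbeat_lans": [lan.get("name") for lan in service.iterfind("farm/lan")],
        # virtual IP configuration
        "virtual_ips": [v.get("addr") for v in service.iter("virtual_addr")],
        # load balancing rules as (port, proto, filter)
        "rules": [(r.get("port"), r.get("proto"), r.get("filter"))
                  for r in service.iter("rule")],
        # <user/> enables the module scripts
        "scripts_enabled": service.find("user") is not None,
    }

summary = summarize(USERCONFIG)
print(summary)
```

The indentation in the string is a reconstruction from the tag nesting; the tags and attributes themselves are taken unchanged from the slide.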


## Slide 7

- Farm heartbeats


### Speaker notes

> Let's detail the heartbeat configuration.


## Slide 8: `<farm>` in userconfig.xml

- To set heartbeats between all nodes of the farm
- Heartbeats synchronize actions among all nodes.
- This helps distribute network traffic from clients according to load balancing rules.
- Name as defined in cluster.xml
- `<farm>`
- `<lan name="default"/>`
- `<lan name="private"/>`
- `<!-- As many <lan> as desired network connections -->`
- `</farm>`


### Speaker notes

> Heartbeats synchronize actions among all nodes to distribute network traffic from clients according to load balancing rules.
> The `farm` tag configures heartbeats in `userconfig.xml`.
> In each `lan` tag inside the `farm` tag, the `name` attribute is the name of a LAN defined in the `cluster.xml` file. Here, two heartbeats are defined, on the default and private LANs.
> Finally, you can set as many `lan` tags as you have network connections between the nodes.


## Slide 9

- Virtual IP address


### Speaker notes

> Let’s now detail the virtual IP configuration and the load balancing rules.


## Slide 10: Load balancing in the same subnet

- The vip driver is performing load balancing
- by accepting or forwarding incoming packets
- only one vip driver accepts a packet
- according to a hash function
- hash tables in all vip drivers are synchronized


### Speaker notes

> A virtual IP address, also named VIP in the figure on the right, is a third IP address coming in addition to the two physical IP addresses of the two nodes. In a farm module, the virtual IP address is set as an alias on the network interface of both nodes, which run the application.
> Both nodes must be in the same subnet so that network load balancing can work and the Ethernet MAC address can be switched transparently from one node to the other on failure (at layer 2 of the network stack).
> Clients, as presented in the example, are connected to the VIP. When the main node is node 1, the VIP is associated with the Ethernet MAC address 1 of node 1. This mapping can be viewed in the ARP cache of clients.
> The vip driver handles load balancing by either accepting incoming packets or forwarding them to node 2. Only one vip driver accepts a given packet, based on a hash function. The hash tables in all vip drivers are synchronized to ensure this process runs smoothly.
> When node 1 fails, the ARP cache of clients is updated with the mapping of the VIP to the Ethernet MAC address 2 of node 2. In this case, all clients are reconnected to the application on node 2.
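
The rule that only one vip driver accepts a given packet can be sketched as follows. This is a simplified model, not SafeKit's implementation: the slides do not describe the real hash function, so `zlib.crc32` is used as a stand-in, and the function names and the client key are hypothetical.

```python
import zlib

def owner(packet_key, live_nodes):
    """Return the one node that accepts a packet.

    crc32 is a stand-in: the slides do not describe the real hash function.
    """
    ordered = sorted(live_nodes)  # identical ordering on every node
    return ordered[zlib.crc32(packet_key.encode()) % len(ordered)]

def accepts(node, packet_key, live_nodes):
    """Decision taken independently on each node for the same packet."""
    return owner(packet_key, live_nodes) == node

key = "10.0.0.7:51324"  # hypothetical client address and port
both_up = ["node1", "node2"]

# Each node evaluates the same function on the same input,
# so exactly one of them accepts the packet.
decisions = {n: accepts(n, key, both_up) for n in both_up}
print(decisions)

# After node1 fails, node2 is the only candidate and accepts everything.
print(accepts("node2", key, ["node2"]))
```

Because every node applies the same function to the same list of live nodes, no per-packet coordination is needed; only the list of live nodes has to stay synchronized.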


## Slide 11: `<vip>` in userconfig.xml

- With the default value for optional attributes [ ]
- `<vip>`
- `<interface_list>`
- `<interface [check="on"]>`
- `<virtual_interface>`
- `<virtual_addr addr="172.24.199.100" where="alias" [check="on"]/>`
- `<!-- As many <virtual_addr> as there are virtual IP on this interface -->`
- `</virtual_interface>`
- `</interface>`
- `<!-- As many <interface> as there are interfaces with virtual address -->`
- `</interface_list>`
- `</vip>`
- Checker that detects interface failure
- Resource: intf.172.24.199.0
- Action: wait on failure
- Checker that detects duplicate VIP address conflict or removal
- Resource: ip.172.24.199.100
- Action: stopstart on error
- Name or address of the virtual IP
- Prefer an IP address to be DNS independent
- IPv4 or IPv6 address


### Speaker notes

> Virtual IP addresses are configured in the `vip` tag of `userconfig.xml` as presented in the slide.
> First, in the `interface` tag, the `check` attribute set to `on` means that a checker will detect interface failures. As shown in the text box of the slide, a resource is associated with this checker, and the action taken on interface failure is to put the module in the WAIT state.
> Then, `virtual_addr` is the tag where the address or DNS name of the virtual IP is set. It is preferable to configure an IP address to be resilient to DNS failures. You can set an IPv4 or IPv6 address here for the virtual IP.
> The value `alias` of the `where` attribute means that the virtual IP address will be set as an alias of the physical IP address on the network interface of each node.
> The `check` attribute set to `on` in the `virtual_addr` tag means that a checker will detect duplicate VIP address conflicts or removals. As shown in the text box of the slide, a resource is associated with this checker, and the action taken on error is to stop and then start the module.
> Finally, you can set several virtual IP addresses on the same network interface by replicating the `virtual_addr` tag. You can also replicate the `interface` tag to set virtual IP addresses on several network interfaces.
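
The two resource names shown on the slide (`intf.172.24.199.0` and `ip.172.24.199.100`) suggest a naming pattern based on the network address and the virtual IP address. The sketch below reproduces that pattern with Python's `ipaddress` module. Both the pattern and the `/24` prefix length are inferred from this single example and are assumptions, as is the helper name.

```python
import ipaddress

def vip_checker_resources(vip_addr, prefix_len=24):
    """Guess the checker resources for one virtual IP.

    Assumptions: the interface resource is named after the network
    address, and the subnet is a /24 (neither is stated on the slide).
    """
    network = ipaddress.ip_interface(f"{vip_addr}/{prefix_len}").network
    return {
        # interface checker: action "wait" on failure
        f"intf.{network.network_address}": "wait",
        # virtual IP checker: action "stopstart" on error
        f"ip.{vip_addr}": "stopstart",
    }

resources = vip_checker_resources("172.24.199.100")
print(resources)
```

With the slide's address and the assumed prefix, the result matches the two resource names and actions listed in the slide's text boxes.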


## Slide 12: `<loadbalancing_list>` in userconfig.xml

- With default value for optional attributes [ ]
- `<vip>`
- `<loadbalancing_list>`
- `<group name="FarmProto">`
- `<rule port="9010" proto="tcp" filter="on_port"/>`
- `<rule port="22" proto="tcp" filter="on_port"/>`
- `<rule port="80" proto="tcp" filter="on_addr" [virtual_addr="*"] />`
- `<!-- As many <rule> as necessary -->`
- `</group>`
- `</loadbalancing_list>`
- `</vip>`
- Name of the load balancing group
- resource: lbgroup.FarmProto
- returns the load, in %, taken by a node
- Load balancing rule set on
- port, protocol and virtual IP address
- all vip by default
- Assign to "on_port" for stateless application
  - no TCP session affinity
  - TCP sessions of the same client are load balanced
- Assign to "on_addr" for stateful application
  - TCP session affinity
  - one client is always connected to the same node


### Speaker notes

> The load balancing rules are configured in the `loadbalancing_list` tag of `userconfig.xml`.
> First, as shown in the text box, a resource is associated with the name of the load balancing group. This resource returns the load, in percentage, taken by a node.
> Then the load balancing rules are defined.
> For example, a load balancing rule is set on the TCP port 9010.
> This means that when clients initiate TCP connections to the port 9010, their connections will be load balanced between nodes.
> The filter `on_port` means that the filtering is done on the client TCP port.
> In this case, TCP sessions coming from the same client can be load balanced between nodes. This is the case of a stateless application with no TCP session affinity.
> In contrast, for TCP port 80, the filter is `on_addr`, which means filtering on the client IP address.
> In this case, TCP sessions coming from the same client are not load balanced between nodes. They all go to the same node. This is the case of a stateful application with TCP session affinity. Of course, TCP sessions coming from different clients can be load balanced between different nodes.
> Note that in the rule tag, you can define a specific virtual IP address, on which you want the load balancing rule to be applied. By default, the rule is applied to all virtual IP addresses defined in the vip tag. Finally, you can replicate the rule tag to define new load balancing rules.
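
The difference between the two filters comes down to which field of the connection is hashed. The following sketch illustrates this with a placeholder hash (`zlib.crc32`), since the real function is not given in the slides; `pick_node`, the node names, and the client address are hypothetical.

```python
import zlib

NODES = ["node1", "node2"]

def pick_node(filter_mode, client_addr, client_port, nodes=NODES):
    """Choose a node from the field named by the rule's filter attribute."""
    if filter_mode == "on_port":
        key = str(client_port)   # varies per TCP session
    elif filter_mode == "on_addr":
        key = client_addr        # constant for a given client
    else:
        raise ValueError(f"unknown filter: {filter_mode}")
    # crc32 is only a placeholder for the undocumented hash function.
    return nodes[zlib.crc32(key.encode()) % len(nodes)]

client = "10.0.0.7"
ports = range(50000, 50010)  # ten sessions from the same client

on_addr_nodes = {pick_node("on_addr", client, p) for p in ports}
on_port_nodes = {pick_node("on_port", client, p) for p in ports}
print(on_addr_nodes)  # a single node: session affinity
print(on_port_nodes)  # both nodes: sessions are spread
```

With `on_addr`, the key never changes for a given client, so all its sessions land on one node. With `on_port`, each new session uses a different client port and can land on either node.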


## Slide 13: Virtual IP address in different subnets

- SafeKit implements
- Healthcheck = URL /var/modules/AM/ready.txt
- OK if UP
- NOT FOUND otherwise
- VIP is defined in the load balancer
- Sends healthcheck
- To the IP addresses of all nodes
- Routes the traffic according to healthchecks
- Load-balancer with healthcheck (figure labels: VIP @, OK)

### Speaker notes

> We now consider two nodes that are in two different subnets. This is particularly the case when implementing a high availability solution in a public cloud infrastructure like Azure, AWS, or GCP. SafeKit nodes are placed in two different availability zones, which are in two different subnets.
> In this case, the vip tag must be removed from userconfig.xml, as the network load balancing at the MAC address level is no longer possible. Instead, the virtual IP must be defined at a load balancer level. The load balancer must be configured with the physical IP addresses of the two nodes, and with a health check to route the traffic to available nodes.
> SafeKit offers such a health check per module, by answering a URL with the ready.txt file. If the module is UP, the URL returns OK. If not, the URL returns NOT FOUND. Thus, the load balancer routes the traffic to the nodes where the application runs. In the URL shown in the text box of the slide, replace 'AM' with the name of the module.
> Note that the application must support this configuration with clients connected on the VIP of the load balancer and with the application receiving connections from the load balancer on the physical IP address of its node.
> There is another solution not explained in these slides, which consists of performing load balancing at the DNS level. However, if a node goes down, clients that have resolved the IP address of this node will not be rerouted. They are dependent on a new DNS lookup and also on the DNS cache expiration, which can take several hours.
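
The routing decision made by the load balancer can be sketched as below. The URL path comes from the slide; the `http` scheme, the port value, the node addresses, and the mapping of OK / NOT FOUND to HTTP status 200 / 404 are assumptions. The probe is injected as a function so that the sketch runs without a network.

```python
def health_url(node_ip, module, port):
    """Build the per-module health check URL (scheme and port are assumed)."""
    return f"http://{node_ip}:{port}/var/modules/{module}/ready.txt"

def routable_nodes(node_ips, module, port, probe):
    """Keep the nodes whose health check answers OK.

    probe(url) must return an HTTP status code; 200 is taken to mean
    "module UP" and anything else (such as 404) "not UP".
    """
    return [ip for ip in node_ips
            if probe(health_url(ip, module, port)) == 200]

# Hypothetical nodes in two subnets: the module is UP only on the first.
fake_status = {"10.0.1.10": 200, "10.0.2.10": 404}

def fake_probe(url):
    host = url.split("//")[1].split(":")[0]
    return fake_status[host]

targets = routable_nodes(list(fake_status), "AM", 9010, fake_probe)
print(targets)
```

In a real deployment the cloud load balancer performs the probing itself; the sketch only shows how the health check result translates into the set of nodes that receive traffic.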


## Slide 14

- start_both / stop_both scripts


### Speaker notes

> Let's now explain the `start_both` and `stop_both` scripts.


## Slide 15: start_both/stop_both scripts



### Speaker notes

> The `start_both` and `stop_both` scripts are used to start and stop an application. In this solution, you need to install the application with the same settings on two nodes or more. Additionally, you need to configure the clients to connect to the virtual IP address, which will allow the network load balancing of their TCP sessions as well as a failover in case of a node failure.
> Now, let's talk about automatic boot. Remove the automatic start of the application at boot; the application start is instead managed by the automatic start of the module at boot. To do this, configure the SafeKit module to start at boot by adding `boot="on"` in the `service` tag of the `userconfig.xml` file, or by using the `safekit boot` command.
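
As an illustration of the first option, the attribute can be added to the `service` tag programmatically. This is only a sketch using Python's standard `xml.etree.ElementTree` on a shortened, hypothetical configuration; the helper name `enable_boot` is not part of SafeKit.

```python
import xml.etree.ElementTree as ET

def enable_boot(xml_text):
    """Return the configuration with boot="on" set on the <service> tag."""
    root = ET.fromstring(xml_text)
    root.find("service").set("boot", "on")
    return ET.tostring(root, encoding="unicode")

# Shortened configuration used only for this example.
before = '<safe><service mode="farm"><user/></service></safe>'
after = enable_boot(before)
print(after)
```

Note that ElementTree does not write back a `<!DOCTYPE safe>` line, so for a real `userconfig.xml` the attribute is more likely to be added by hand in an editor.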


## Slide 16: Generic scripts

- Available in farm.safe


### Speaker notes

> Generic scripts are included in the `farm.safe` module, as well as in other modules, to eliminate the need for custom script creation. This significantly simplifies the integration of new applications. You only need to define a list of services in the macro named SERVICES in `userconfig.xml`. This list is then passed as an environment variable to the `start_both` and `stop_both` scripts.
> The `start_both` script starts all services in the order specified in the list, while the `stop_both` script stops all services in the reverse order. Additionally, `start_both` checks the startup of each service and stops the module if any service fails to start correctly. During module configuration, the boot startup of services is automatically set to 'Manual'. This ensures that services do not start automatically at system boot, but only when the module itself is started.
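
The ordering rules of the generic scripts can be modeled in a few lines. This is a sketch of the described behavior, not the scripts shipped in `farm.safe`: the service names are hypothetical, the SERVICES list is represented as a Python list because its actual format is not shown in the slides, and the start and stop operations are injected as functions.

```python
def start_both(services, start_service):
    """Start services in list order.

    Returns (started, failed): failed is the first service that did not
    start, or None. On failure the real script stops the module.
    """
    started = []
    for name in services:
        if not start_service(name):
            return started, name
        started.append(name)
    return started, None

def stop_both(services, stop_service):
    """Stop services in the reverse of the list order."""
    for name in reversed(services):
        stop_service(name)

SERVICES = ["database", "appserver", "webfront"]  # hypothetical names

stopped = []
stop_both(SERVICES, stopped.append)
print(stopped)  # reverse of SERVICES

# Simulate a failure of the second service at startup.
started, failed = start_both(SERVICES, lambda name: name != "appserver")
print(started, failed)
```

Starting in list order and stopping in reverse order keeps dependencies satisfied: a service listed later can rely on the earlier ones being up for its whole lifetime.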


## Slide 17: Example on Windows

- start_both.cmd
- Messages in logs
  - @echo off
  - echo "Running start_both %*"
  - net start "myservice"   /Y
  - if NOT %errorlevel% == 0 goto stop
  - :stop
  - "%SAFE%\safekit" printe "start_both failed"
  - "%SAFE%\safekit" stop -i "start_both"
- Script log
- "Running start_both WAIT UP"
- The myservice service failed to start
- Module log
- 10-20 18:28:12 … start_both failed
- 10-20 18:28:12 … Action stop called by start_both


### Speaker notes

> Let's now explain how to log messages from a restart script, either in the script log or in the module log.
> As shown in points 1 and 2 on the slide, all output messages of `start_both` go into the script log. Thus, for debugging purposes, you can write specific messages in the script log just with the echo command. More generally, you will find the outputs of service startups and stops in the script log, along with any potential error messages that can help with debugging.
> As shown in point 3 on the slide, by using the `safekit printe` command, you can log a message in the module log.
> As shown in point 4, when executing a command such as `safekit stop` with the `-i "start_both"` option, you will have a stop message in the module log, indicating that the stop action was initiated by `start_both`.


## Slide 18

- Farm state transitions


### Speaker notes

> Let’s now examine the different state transitions inside a farm module.


## Slide 19: Farm module state



### Speaker notes

> Here are the main states of a farm module.
> When the state is UP and the color is green on a node with 100% of the traffic, it means that it is the only node running the farm module. Failover is not possible in this case.
> When the state is UP and the color is green on a node with 50% of the traffic, it indicates that two nodes are running the farm module. Failover is possible in this case.
> The STOP state means that the module is not running on the node.
> When a state is in transition, it is indicated by an orange color.
> If the state is WAIT, the color is red, and the message is a failover rule name, it means that the node is waiting for a mandatory resource controlled by a checker before starting.
> Finally, if the state is ERROR, it means that the console cannot connect to the node because either the node has crashed, or there is a communication issue between the console and the node, such as a firewall issue or the web service not running on the node side.


## Slide 20: Start node1

- node1 goes from STOP to UP
- Application running on node1 (virtual IP set)
- Load share is 100%, displayed in the "lbgroup" resource
- Figure (state diagram for node1, 100%): labels start, prestart, start_both, wakeup, wait, stop, stop_both, poststop


### Speaker notes

> Let's consider the start of node 1, which goes from the STOP to UP state.
> In the figure on the left, when the start command is executed on node 1, the prestart script is first executed. Normally, the application integrated in the module should not be running on the node, but if it is, the prestart script performs a preventive stop of the application before installing the virtual IP address.
> Then, the `start_both` script is executed to start the application, and the state is UP orange.
> After the execution of the `start_both` script, the module is in the stable UP green state taking 100% of the traffic, meaning that the virtual IP is set, and the application is running.
> The module can be put in the WAIT red state if a resource is set to down by a checker.
> Now let's consider the stop of the module by an administrator or a checker. In this case, it transitions from the UP state to the STOP state after executing the `stop_both` and `poststop` scripts.
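
The start and stop sequences described in these notes can be summarized as a small transition table. This is a simplified model for illustration only: it covers just the STOP to UP and UP to STOP transitions, ignores the WAIT state and the intermediate orange states, and the names `TRANSITIONS` and `run` are hypothetical.

```python
# (current state, action) -> (scripts executed in order, resulting state)
TRANSITIONS = {
    ("STOP", "start"): (["prestart", "start_both"], "UP"),
    ("UP", "stop"): (["stop_both", "poststop"], "STOP"),
}

def run(state, action, script_log):
    """Apply an action, record the scripts it runs, return the new state."""
    scripts, new_state = TRANSITIONS[(state, action)]
    script_log.extend(scripts)
    return new_state

log = []
state = run("STOP", "start", log)   # node1 goes from STOP to UP
state = run(state, "stop", log)     # then back to STOP
print(state, log)
```

The table makes the symmetry visible: `prestart` and `poststop` frame the life of the module, while `start_both` and `stop_both` frame the life of the application.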


## Slide 21: Start node2

- Application running on both nodes (virtual IP set on both nodes)
- Load share is 50% for each node
- Figure (state diagrams for node1 and node2): labels start, prestart, start_both, wakeup, wait, stop, stop_both, poststop; load shares 100%, 50%, 50%


### Speaker notes

> In this slide, we continue the presentation with the start of node 2.
> In the figure on the left, when the start command is executed on node 2, the prestart script is first executed. Normally, the application integrated in the module should not be running on the node, but if it is, the prestart script performs a preventive stop of the application before installing the virtual IP address.
> Then, the `start_both` script is executed to start the application, and the state is UP orange.
> After the execution of the `start_both` script, the module is in the stable UP green state taking 50% of the traffic, meaning that the virtual IP is set, and the application is running.
> As shown in the figure in the middle, on node 1, the module transitions to the UP green state, taking the other 50% of the traffic.


## Slide 22: Stop or failure of node2

- Automatic failover on node1
- Application running on node1
- Load share is 100%
- Figure: node2 (UP) stops or fails; node1 load share labels 50% and 100%


### Speaker notes

> When there is a stop or a failure of node 2, node 1 stays in the UP green state but takes 100% of the traffic. Then, the application, which continues its execution on node 1, manages all TCP connections from clients on the virtual IP address.


## Slide 23: Network isolation between nodes

- Load share is 100% for both nodes
- Figure (node1 and node2): 50% / 50% with heartbeat OK, 100% / 100% with heartbeats KO, 50% / 50% again
- On network isolation, by default, each node takes 100% of the traffic
- Coherent if both nodes are in 2 datacenters: each datacenter is connected to its local node
- To avoid both nodes taking 100% of the traffic, configure a second heartbeat


### Speaker notes

> Let’s now explain the implications of network isolation between both nodes.
> When both nodes are isolated, each node takes 100% of the traffic as all heartbeats are lost.
> The system remains coherent if both nodes are in two datacenters: each datacenter is connected to the virtual IP address of its local node.
> Once the isolation is repaired, each node takes 50% of the traffic.
> Network isolation occurs when all heartbeats between node 1 and node 2 are lost.
> As long as there is a live heartbeat on at least one network, network isolation cannot occur.
> Therefore, implementing an unbreakable private network, such as a direct Ethernet link between both nodes, can avoid this situation.
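
The isolation rule and its effect on the load share can be expressed as a short sketch from the point of view of one node in a two-node farm. The function names and the dictionary representation of heartbeats are illustrative; the LAN names `default` and `private` reuse the `<farm>` example.

```python
def isolated(heartbeats):
    """A node is isolated only when every heartbeat is lost.

    heartbeats maps a LAN name to True (alive) or False (lost).
    """
    return not any(heartbeats.values())

def load_share(heartbeats, nodes_in_farm=2):
    """Percentage of traffic taken by a node, as seen from that node."""
    if isolated(heartbeats):
        return 100  # no peer is visible: the node takes all the traffic
    return 100 // nodes_in_farm

# One heartbeat lost, one alive: no isolation, 50% per node.
print(load_share({"default": False, "private": True}))
# All heartbeats lost: each node takes 100%.
print(load_share({"default": False, "private": False}))
```

With a single heartbeat, losing that one network is enough to reach the 100% / 100% situation; a second heartbeat on an independent link means both must fail at the same time.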


## Slide 24: Restart on node1

- Application stops then starts
- restart
- stop_both
- start_both
- Application is restarted on node1
- node1


### Speaker notes

> Let's now explain the restart action initiated by an administrator or a checker.
> The restart action consists of executing the `stop_both` and `start_both` scripts on a node to restart the application locally, without changing the percentage of traffic and without triggering a failover.


## Slide 25: Thank you!



### Speaker notes

> Thank you for your attention. If you have any questions or need further clarification, please feel free to ask.
