Hi, a user intends to implement a failover pair via a shared hostname. The user has northbound systems interfacing with the DMS's Web Services API. They would like advice on a recommended approach to health checks on a failover pair, where the outcome of the check establishes which DMA is active and which is standby at any given time. The end goal is to avoid the situation where northbound systems send a request to the standby and have to wait for a request timeout to realize that a DMA switchover has occurred.
For example, one idea is for a load-balancer device to perform the health check. It would periodically send a specific API call to both DMAs in the failover pair. It is expected that only the active DMA will respond, while the request will time out on the standby DMA. A relatively low-cost call such as ConnectApp() followed by GetDataMinerAgentTime() would be used. Or is there a better way of performing the health check?
Hi Bing,
The easiest way will be to retrieve the contents of /logging/Failoverstatus.txt from both agents. This will contain "Online" for the active agent and "Offline" for the backup agent (or other values while they are switching, for example "Waiting for other agent to go offline" or "Preparing to go online").
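As a rough illustration of how a health check could interpret that file, here is a minimal Python sketch. The function name and the returned labels are hypothetical, and it assumes the file holds nothing but the status text (possibly with surrounding whitespace); the exact file format should be verified on a real agent.

```python
def classify_failover_status(text):
    """Map the contents of Failoverstatus.txt to a coarse role label.

    Assumption: the file contains only the status string, possibly padded
    with whitespace or a trailing newline.
    """
    status = text.strip().lower()
    if status == "online":
        return "active"
    if status == "offline":
        return "standby"
    if not status:
        return "unknown"
    # Anything else, e.g. "Waiting for other agent to go offline" or
    # "Preparing to go online", is treated as a switchover in progress.
    return "switching"
```

A load balancer or script would then route traffic only to the agent classified as "active" and treat every other label as not ready.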
If you have a shared hostname dma.company.lan that resolves to the IP addresses { 10.0.0.1, 10.0.0.2 }, you can retrieve:
https://10.0.0.1/logging/Failoverstatus.txt
https://10.0.0.2/logging/Failoverstatus.txt
This will most likely cause a TLS warning for hostname mismatch unless you have included the IP addresses in the certificate.
This can also fail if the IIS port binding requires Server Name Indication and there is no separate binding for the IP address. In that case your health check script will need to:
- resolve the hostname to the IP addresses,
- establish a TCP/TLS connection to each IP address (and not the hostname),
- ignore the TLS hostname mismatch warning, or manually check the Subject Alternative Name list of the certificate against the hostname,
- compose an HTTP request where the Host header contains the proper hostname:
GET /logging/Failoverstatus.txt HTTP/1.1
Host: dma.company.lan
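
The steps above could be sketched in Python roughly as follows. This is only an outline under a few assumptions: the function names are made up for illustration, the agents listen on port 443, the hostname resolves to IPv4 addresses, and the file is plain UTF-8 text. Passing the shared hostname as server_hostname makes Python send it as SNI and validate the certificate against it, so the hostname mismatch does not arise even though the socket is opened to the IP address.

```python
import http.client
import socket
import ssl

STATUS_PATH = "/logging/Failoverstatus.txt"  # path taken from the answer above


def resolve_ipv4(hostname, port=443):
    """Return the distinct IPv4 addresses the shared hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, socket.AF_INET, socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})


def build_request(hostname, path=STATUS_PATH):
    """Compose the HTTP/1.1 request with the shared hostname in the Host header."""
    return (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {hostname}\r\n"
        "Connection: close\r\n"
        "\r\n"
    ).encode("ascii")


def fetch_status(ip, hostname, port=443, timeout=3.0):
    """Connect to one agent by IP while presenting the shared hostname to TLS."""
    context = ssl.create_default_context()
    with socket.create_connection((ip, port), timeout=timeout) as raw:
        # server_hostname drives both SNI and certificate hostname validation.
        with context.wrap_socket(raw, server_hostname=hostname) as tls:
            tls.sendall(build_request(hostname))
            response = http.client.HTTPResponse(tls, method="GET")
            response.begin()
            body = response.read()
            if response.status != 200:
                raise RuntimeError(f"unexpected HTTP status {response.status}")
            # Assumption: the file is UTF-8 text, possibly with a BOM.
            return body.decode("utf-8-sig", errors="replace").strip()


def check_pair(hostname):
    """Return a dict mapping each resolved IP to its status text or an error."""
    results = {}
    for ip in resolve_ipv4(hostname):
        try:
            results[ip] = fetch_status(ip, hostname)
        except (OSError, http.client.HTTPException, RuntimeError) as exc:
            results[ip] = f"unreachable: {exc}"
    return results
```

Calling check_pair("dma.company.lan") would return one entry per agent, with the active agent expected to report "Online". A short timeout keeps the check from stalling on an agent that is down.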