We have an ES cluster installed locally on our dev DMA pair. The cluster health status is coming back yellow. Looking at the indices, all the ones created by DataMiner are being created with 3 primary shards and 2 replicas, in a 2-node cluster.
Why are we getting 3 shards for each index? Surely we only need 1?
Why are the replicas set to 2 instead of 1?
As far as I'm aware these nodes were installed using the "Install Indexing Engine..." under Search & Indexing in the System Centre.
Hello,
I would not worry about the number of primary shards we configure. This is purely related to scaling; 3 is the default and should be fine.
The number of replicas is set to 2 by default. This was just a choice made by us, a 'magic' number, and you can change it. However, cluster health yellow is normal in this case. It's not bad, yellow is just fine. If you add more nodes, it will turn green, but there are no side effects apart from having fewer replicas.
I would like to point out that 2 nodes is the worst-case scenario for an Elastic cluster. A detailed explanation of why can be found here: Configuring the master nodes | DataMiner Docs.
Keeping it as a two-node cluster will result in ES instability, and it will require maintenance and reconfiguration, possibly with data loss, when it ends up in a split-brain situation.
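To illustrate why the health is yellow here: Elasticsearch never places two copies of the same shard on one node, so each shard can have at most (nodes - 1) replicas assigned. A rough sketch of that arithmetic in shell follows. The function name is hypothetical, and the sketch ignores other allocation rules such as disk watermarks or allocation awareness.

```shell
# Hypothetical helper: estimate how many replica shards stay unassigned for one index.
# Assumption: the only limit is that two copies of a shard never share a node.
unassigned_replicas() {
  primaries=$1; replicas=$2; nodes=$3
  max_assignable=$(( nodes - 1 ))   # replicas per shard that can find a node
  if [ "$replicas" -gt "$max_assignable" ]; then
    echo $(( primaries * (replicas - max_assignable) ))
  else
    echo 0
  fi
}

unassigned_replicas 3 2 2   # 3 primaries, 2 replicas, 2 nodes: prints 3
unassigned_replicas 3 1 2   # 3 primaries, 1 replica, 2 nodes: prints 0
```

With the defaults on a 2-node cluster, every index therefore keeps 3 replica shards unassigned, which is what yellow reports. With 1 replica, all copies can be placed.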
The following command updates the number of replicas in the ES configuration. Note that it applies to the indices that already exist.
curl -XPUT 'localhost:9200/_settings' -H 'Content-Type: application/json' -d '
{
  "index" : {
    "number_of_replicas" : 1
  }
}
'
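Once the setting has been changed, you can check whether the cluster has gone green. This is a small sketch, assuming Elasticsearch listens on localhost:9200 as in the command above. The helper names are made up for illustration; `_cluster/health` and `_cat/shards` are standard Elasticsearch endpoints.

```shell
# Hypothetical helper: pull the "status" value out of a cluster health response on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p'
}

# Hypothetical helper: ask a node for the cluster health and print only the status.
# The host defaults to localhost:9200 (assumption taken from the command above).
es_health() {
  curl -s "${1:-localhost:9200}/_cluster/health" | health_status
}

# Usage against a running cluster:
#   es_health                                # prints green, yellow or red
#   curl -s 'localhost:9200/_cat/shards?v'   # lists every shard copy and its state
```

Any shard copy that still shows as UNASSIGNED in the `_cat/shards` output is what keeps the status at yellow.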
FYI, NATS doesn’t play nicely with 2-node clusters, so a single hot-redundant pair doesn’t really work anymore: if the node with the lower IP goes offline, it causes the other node to crash. We’d only have a 50% chance of the cluster still running anyway.
We’ve got the number of nodes for an active cluster set to 2 to prevent the split brain issue.
Thanks Thomas,
I’ve done some more reading on primary and replica shards, and splitting each index for scaling and load balancing makes sense.
This is just our dev system, so it will only ever have a 2-node cluster. It would be nice to change the default replication to 1 to get it to go green, if you could point me in the right direction for changing this setting.