All,
We have been digging through the documentation, trying to find recommended latency values for:
1- DMA-DMA
2- Cassandra Node - Cassandra Node
3- Cassandra Seeds/Nodes- DMA/DMS
The only recommendation we have found so far is the 50 ms latency guideline for the Indexing Engine. Could someone here indicate whether Skyline has any KPIs that could be used as a guide when installing a DataMiner System? This becomes particularly important once we start supporting centralized Cassandra clusters, since a cluster could be composed of DMAs and nodes hosted in different data centers.
Thank you in advance,
Hi Rene,
For Cassandra there are two important types of communication between nodes:
- Internode communications (gossip)
This is needed so that the nodes of the Cassandra cluster can exchange state information in a scalable way.
For gossip to work well, it is important that every node has the same seed list in its configuration (the cassandra.yaml file); a sketch follows below.
Making every node a seed node is NOT recommended (the advice is three seeds per data center), but this is hard to achieve with the current Cassandra architecture.
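As an illustration, this is roughly what the seed list looks like in cassandra.yaml (the IP addresses below are placeholders, not a recommendation):

    seed_provider:
        - class_name: org.apache.cassandra.locator.SimpleSeedProvider
          parameters:
              # Comma-separated seed IPs; this list must be identical on every node.
              - seeds: "10.0.0.1,10.0.0.2,10.0.0.3"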
A node can be marked down (so that a coordinator node stops sending every query request to it).
The threshold for marking a node down is phi_convict_threshold (located in the cassandra.yaml file).
The threshold takes into account network performance, workload, and historical conditions.
In simple words: "The node is marked down if it does not respond within a certain time. This time is the average response time * phi_convict_threshold."
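For reference, the failure detector setting in cassandra.yaml looks like this; 8 is the shipped default, and higher values make the detector more tolerant of latency spikes before a node is marked down (worth verifying against the defaults of your Cassandra version):

    # Sensitivity of the failure detector: higher values tolerate more
    # network latency/jitter before a node is marked down.
    phi_convict_threshold: 8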
- Requests (e.g. read requests)
A coordinator will wait for another node to handle a request, up to a timeout in ms (settings available in the cassandra.yaml file, e.g. read_request_timeout_in_ms). This determines whether a query succeeds or fails.
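For reference, these are the main request timeouts in cassandra.yaml; the values shown are the defaults in recent Cassandra versions, so double-check them for the version you are running:

    # How long the coordinator waits for read operations to complete (ms).
    read_request_timeout_in_ms: 5000
    # How long the coordinator waits for range scans to complete (ms).
    range_request_timeout_in_ms: 10000
    # How long the coordinator waits for writes to complete (ms).
    write_request_timeout_in_ms: 2000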
If you want to make sure you are working with the latest stored data, you need to use the QUORUM consistency level on your queries, meaning that more than 50% of the nodes that hold your data (based on the replication factor) need to answer.
In the case of a Failover setup we currently build a Cassandra cluster of 2 nodes with a replication factor of 2, so if we used QUORUM we would need both nodes to be online (and not marked as down) at all times. That is why we use consistency level ONE at the moment; the arithmetic below illustrates why.
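To make that concrete, the quorum size is derived from the replication factor (RF):

    quorum = floor(RF / 2) + 1
    RF = 2  ->  quorum = 2  (QUORUM fails as soon as one of the two nodes is down)
    RF = 3  ->  quorum = 2  (QUORUM still succeeds with one node down)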
So in general you can have any latency between your Cassandra nodes; whether or not that causes problems depends on your configuration.
Hi Rene – The default settings that are configured should work in normal circumstances.
If there is a need to change these settings, I believe you can also expect some degradation in capabilities, so if possible, try to improve your network instead.

When the listen address of a node is listed as a seed, that node will start up without contacting a seed in the cluster. If you want to replace a seed node (with the same IP), you first need to remove that IP from the seed list on all the nodes before you add the new node to the Cassandra cluster. In other words, a seed is very useful in larger clusters to minimize the communication needed to bring a node online or to join/remove nodes, but we need to be careful when performing actions on the seed nodes themselves.

For the replication factor, the advice is mostly to have at least 3, because from that point on you can remain consistent with your data while coping with a node going down. To be consistent, the number of nodes you write to plus the number of nodes you read from must exceed the replication factor (R + W > N), so that every read overlaps the latest write; see the example below. From the moment your data is replicated on three nodes, you can cope with one node being down and still reach more than 50% of the replicas to stay consistent. You can have a replication factor of 3 from the moment you have 3 nodes; in that case every node will contain 100% of the tokens.
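As a concrete example of that R + W > N rule with a replication factor of 3:

    N = 3 (replication factor)
    W = 2 (write at QUORUM)
    R = 2 (read at QUORUM)
    R + W = 4 > N = 3  ->  every read overlaps the latest successful write,
                           even with one of the three nodes down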
Michiel – Thank you for your reply. A quick clarification: are all the settings you mentioned configurable by us? In other words, can we tune those values to whatever latency we learn to expect between data centers? Also, you now mention 3 seeds per data center, but I assume this will only become a reality once we move to the centralized architecture. What would be the recommendation for a system where each DMA has 3 Cassandra nodes associated with it? Would a replication factor of 3 still be convenient?