How many alarms (per second) can a DataMiner system handle?
The DataMiner Help documents the number of concurrent active alarms that a DataMiner Agent and a DataMiner System can handle. It also covers alarm storm protection and how to configure it.
The question here, however, concerns the alarm rate on a DataMiner Agent. Is there any mechanism available to predict how many alarms per second a DMA (DataMiner Agent) can ingest? It is worth mentioning that the system resources might have a direct impact on that capability.
We have an internal tool that generates an increasing number of alarms per second until we see that the queues can no longer handle the load (at this point we only measure on Cassandra, though). Afterwards the results are returned to the user. The generated alarms are "lightweight", i.e. they contain no alarm properties etc., so we get a theoretical "maximum" rate at which alarms can be generated without anything blowing up.
On an i7 2.6 GHz, 32 GB RAM, 64-bit operating system, we saw we could generate around 240 alarms/s this way.
Note that it is perfectly fine to generate alarms at a higher rate, as long as it is temporary. This number only gives the theoretical maximum rate at which alarms can be created continuously.
We use this number to check for regressions in our system, i.e. to verify that the rate at which we can generate and process alarms does not go down over time as changes are made to the code. We do not care about the value itself, only about its evolution over time.
For now this tool is only used within the Dodo squad, but we plan to make it public in the future, so you would be able to check this rate on your own machine.
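The tool itself is not public yet, but the ramp-up idea behind it can be sketched. Below is a minimal Python sketch under stated assumptions: generate_alarm(), queue_depth(), and all the thresholds are hypothetical placeholders for illustration, not the actual tool's API.

```python
import time

def generate_alarm():
    """Create one lightweight alarm (no properties). Stub for illustration."""
    pass

def queue_depth():
    """Return the current database write-queue depth. Stub for illustration."""
    return 0

MAX_QUEUE_DEPTH = 1_000  # assumed threshold beyond which the queue "can't keep up"
STEP = 10                # increase the target rate by 10 alarms/s per round
ROUND_SECONDS = 30       # hold each rate long enough for the queue to react

# Ramp up the alarm rate step by step until the write queue falls behind.
for rate in range(STEP, 1_000, STEP):
    deadline = time.monotonic() + ROUND_SECONDS
    while time.monotonic() < deadline:
        for _ in range(rate):
            generate_alarm()
        time.sleep(1)  # crude pacing: roughly 'rate' alarms per second
    if queue_depth() > MAX_QUEUE_DEPTH:
        print(f"Queues fell behind at ~{rate} alarms/s; "
              f"sustained maximum is ~{rate - STEP} alarms/s")
        break
```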
Looking at the rate, you should consider two limits:
Database speed:
On a MySQL system, the speed at which the queries are parsed will be your bottleneck.
For a Cassandra setup, the speed of your disk will be the limit: SSDs, for example, can handle a much higher volume of alarms than spinning disks.
Database volume:
Depending on where, how, and for how long you store the alarms, this could be a limit as well. On a MySQL system we cap the number of alarms, but for Cassandra we do not, since everything is based on a Time To Live (TTL); see the sketch after this point.
You have to ensure that the maintenance actions performed on Cassandra can still cope with the volume of data.
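As a small illustration of what TTL-based storage means in practice, here is a sketch using the DataStax Python driver; the keyspace, table, and column names are made up for the example and are not DataMiner's actual schema.

```python
from cassandra.cluster import Cluster  # pip install cassandra-driver

# Hypothetical keyspace/table for illustration only.
cluster = Cluster(["127.0.0.1"])
session = cluster.connect("alarms_demo")

ONE_YEAR = 365 * 24 * 3600  # retention, expressed as a TTL in seconds

# Every row written with USING TTL expires automatically, so the stored volume
# is bounded by ingest rate x retention period instead of an explicit row cap.
session.execute(
    f"INSERT INTO alarm_events (alarm_id, created, severity) "
    f"VALUES (%s, toTimestamp(now()), %s) USING TTL {ONE_YEAR}",
    ("alarm-1", "critical"),
)
```

Note that TTL-expired rows still have to be physically reclaimed during compaction, which is why the maintenance point above matters.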
To give some more specifics on this last topic:
The size of an alarm mainly depends on the number of properties you add to it, taking into account element, service, and alarm properties.
Currently an average DMA in the field has about 20 GB of timetrace data (stored for one year). So if you calculate the size of one typical alarm on your specific setup, you can determine what rate would correspond to an average system.
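As a rough, back-of-the-envelope version of that calculation (the 2 KB per-alarm size is purely an assumption; measure a typical alarm on your own setup):

```python
# What sustained alarm rate fills ~20 GB of timetrace data in one year?
RETENTION_BYTES = 20 * 1024**3       # ~20 GB, the average observed in the field
RETENTION_SECONDS = 365 * 24 * 3600  # data is stored for one year
ALARM_SIZE_BYTES = 2 * 1024          # assumed ~2 KB per alarm, properties included

rate = RETENTION_BYTES / (ALARM_SIZE_BYTES * RETENTION_SECONDS)
print(f"Sustained average rate: {rate:.2f} alarms/s")  # ~0.33 alarms/s at 2 KB/alarm
```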
If a significantly larger rate is needed, you can always look into additional external Cassandra nodes, better hardware, etc., but a detailed analysis is necessary to determine the correct solution.