Hi,
I have a correlation rule that collects alarms and an automation script that sends the alarm data as an email. It works for the active alarms, but the email also includes historical alarms from a few weeks ago that were raised and have since been cleared.
Below is the correlation rule that I have.
The rule runs the script and passes it the correlation rule ID. With that ID, the script collects the alarms using the following code:
private List<int> GetAlarmIds(string correlationRuleId)
{
    // Request the current status of the correlation rule.
    GetCorrelationStatusMessage msg = new GetCorrelationStatusMessage
    {
        ID = Guid.Parse(correlationRuleId),
    };
    GetCorrelationStatusResponse response = Engine.SLNet.SendSingleResponseMessage(msg) as GetCorrelationStatusResponse;
    var buckets = response.Status.Buckets;
    var alarmIds = new List<int>();
    for (int i = 0; i < buckets.Length; i++)
    {
        // Only buckets in TrackCollection mode are considered.
        if (buckets[i].Mode.ToString() != "TrackCollection")
        {
            continue;
        }
        for (int j = 0; j < buckets[i].TrackedTrees.Length; j++)
        {
            // The alarm ID is the part after the '/' in each tracked tree entry.
            alarmIds.Add(Convert.ToInt32(buckets[i].TrackedTrees[j].Split('/')[1]));
        }
    }
    return alarmIds;
}
Could someone please advise me what I am missing here? Please let me know if anything else is needed. Many Thanks.
Hi Paul,
This looks like a bug in the "Collect events for ... after first event" rule condition option. For every instance of the action, a tracked bucket remains in memory until the DMA is restarted or the rule is updated, whereas I would have expected these buckets to be cleaned up after the rule actions were executed.
The code in your automation script is grabbing alarm events out of all these (old) buckets.
Note: You can see all current buckets in the SLNetClientTest tool via Advanced > Correlation > Rule Status.
A possible workaround is to have your automation script look only at the most recent of these buckets (the one with the highest ActiveMatchInfoID), as this is the relevant bucket for the active occurrence.
Example code below (requires using System.Linq):
GetCorrelationStatusResponse response = Engine.SLNet.SendSingleResponseMessage(msg) as GetCorrelationStatusResponse;
var alarmIds = new List<int>();
// Keep only the most recent TrackCollection bucket (highest ActiveMatchInfoID).
var bucket = response.Status.Buckets.Where(b => b.Mode.ToString() == "TrackCollection")
    .OrderByDescending(b => b.ActiveMatchInfoID).FirstOrDefault();
if (bucket != null) // null when there is no TrackCollection bucket
{
    foreach (var tree in bucket.TrackedTrees)
    {
        alarmIds.Add(Convert.ToInt32(tree.Split('/')[1]));
    }
}
Hi Paul. An internal task has been created to work on this issue (208818).
Hi Wouter,
Could this bug be the cause of the following notices we’ve recently seen?
“Correlation rule xyz has the maximum of 10000 active buckets. Ignoring new ones.”
In this Correlation Rule we also use the “collect events for…” option.
Hi Nils,
The notice is indeed a consequence of the buckets sticking around in memory. Once the limit is reached, no new buckets can be created, which breaks the correlation rule, as it will no longer trigger.
Until a fix is available (ref 208818 as above), you can clear the in-memory buckets by editing the rule from time to time (e.g. changing only the description) or by restarting the DMA. Another option is to increase the limit via SLNetClientTest > Advanced > Options > SLNet Options > CorrelationMaxBucketsPerRule. This has little impact besides increased memory usage for the SLNet process.
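If you want to know in advance when a rule is getting close to the limit, a script could compare the current bucket count against it. The snippet below is only a rough sketch of that check, not something provided by DataMiner: the class BucketLimitCheck, the method IsNearLimit and the 80 percent threshold are hypothetical names and values made up for illustration, and the default of 10000 is simply taken from the notice quoted above.

```csharp
using System;

public static class BucketLimitCheck
{
    // Taken from the notice text; adjust this if CorrelationMaxBucketsPerRule was changed.
    public const int DefaultMaxBuckets = 10000;

    // Returns true once bucketCount has reached warnPercent percent of maxBuckets.
    // Integer arithmetic is used so the comparison is exact at the threshold.
    public static bool IsNearLimit(int bucketCount, int maxBuckets = DefaultMaxBuckets, int warnPercent = 80)
    {
        if (maxBuckets <= 0)
        {
            throw new ArgumentOutOfRangeException(nameof(maxBuckets));
        }

        if (warnPercent <= 0 || warnPercent > 100)
        {
            throw new ArgumentOutOfRangeException(nameof(warnPercent));
        }

        return bucketCount * 100L >= (long)maxBuckets * warnPercent;
    }
}
```

In an automation script, the count could come from response.Status.Buckets.Length in the status response used earlier in this thread, e.g. BucketLimitCheck.IsNearLimit(response.Status.Buckets.Length). When it returns true, that would be a good moment to edit the rule or plan a DMA restart.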
Hi Wouter,
Thanks for confirming and for the workarounds. Good to hear that this is a known issue.
Thanks Wouter, that was spot on and it fixed the problem. Is the bug with the buckets not being cleaned up going to be investigated in the future?