Question

Solved · 2.65K views · 7th July 2020 · alarm monitoring, Microsoft Platform, trending

2

Jens Vandewalle [SLC] [DevOps Enabler] · 9.54K · 1st July 2020 · 0 Comments

For which parameters should alarm monitoring and trending be enabled by default, so that the information is available when an issue (RTE, memory leak…) occurs?
Is there a place where those default templates are stored?

Jens Vandewalle [SLC] [DevOps Enabler] Selected answer as best 7th July 2020

3 Answers

1

Robin Devos [SLC] [DevOps Advocate] · 2.63K · Posted 2nd July 2020 · 0 Comments

Hi Jens

As Michiel already mentioned, alarm templates depend on the system and on personal preference. He has also already listed the parameters that are typically trended in QA.

In IOC, we also created something similar in the form of a Manual Of Procedure: MOP – Monitoring DataMiner Health

Jens Vandewalle [SLC] [DevOps Enabler] Selected answer as best 7th July 2020

score 4 · Answer 1 · 2020-07-01T11:51:09+00:00

Alarming is a matter of preference and also greatly depends on the capacity and load of a system. Therefore there are no fixed or recommended alarm templates that I’m aware of.

The trending we use for leak and issue detection during quality assurance is:

Performance page:
- Commit charge total
- Free Virtual Memory
- Total processor load
- Total threads
Task Manager:
- CPU
- Handles
- Process Pid
- Threads
- VM size
Filters for the Task Manager items above:
- SL* for all DataMiner processes
- mysql* if a MySQL database is present on the system. This could also be a Cassandra system with a MySQL database for e.g. Asset Manager.
- prunsrv* for the Cassandra database
- Elasticsearch* for Elasticsearch
Additional Task Manager filters if you also want to monitor clients on the system:
- iexplore*
- *presentationhost*
Disk info (for all disks, or filtered on the disks used for DataMiner and the databases):
- Avg. Disk sec/Transfer
- Disk Usage
- Free space
- Percent busy time
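
To check what a given set of Task Manager filters would pick up, the wildcard patterns above can be tried against a list of process names. The sketch below is only an illustration: it assumes the filters behave like simple case-insensitive shell-style wildcards, which may not match exactly how the Microsoft Platform protocol evaluates them, and the sample process names are made up for the example.

```python
from fnmatch import fnmatchcase

# Filters taken from the list above.
FILTERS = [
    "SL*",                  # DataMiner processes
    "mysql*",               # MySQL database
    "prunsrv*",             # Cassandra database
    "Elasticsearch*",       # Elasticsearch
    "iexplore*",            # client (optional)
    "*presentationhost*",   # client (optional)
]


def matching_filters(process_name, filters=FILTERS):
    """Return the filters that match a process name.

    Assumption: matching is case-insensitive and uses shell-style wildcards.
    """
    name = process_name.lower()
    return [f for f in filters if fnmatchcase(name, f.lower())]


# Illustrative process names, not taken from a real system.
sample_processes = [
    "SLNet.exe",
    "mysqld.exe",
    "prunsrv.exe",
    "PresentationHost.exe",
    "chrome.exe",
]

for process in sample_processes:
    print(process, matching_filters(process))
```

A process that matches none of the filters (chrome.exe in this sample) would not be trended, which is worth keeping in mind for other software running on the same server.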

Note that these filters only cover the DataMiner-related processes. In the past, we have occasionally seen memory leaks in other software running on the same system as the DataMiner Agent, which eventually also caused issues for the Agent because insufficient memory was available.
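
As a rough illustration of how trended values can point to a leak, the sketch below fits a straight line through exported trend samples and reports the growth rate. This is not the method used in QA, only a minimal example; the function name, the sample data and the (hours, MB) format are assumptions made for the illustration.

```python
def slope_per_hour(samples):
    """Least-squares slope of (hours, value) samples.

    A slope that stays clearly positive over a long period suggests that
    the trended value (e.g. VM size or handles) keeps growing.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_t = sum(t for t, _ in samples) / n
    mean_v = sum(v for _, v in samples) / n
    numerator = sum((t - mean_t) * (v - mean_v) for t, v in samples)
    denominator = sum((t - mean_t) ** 2 for t, _ in samples)
    if denominator == 0:
        raise ValueError("all samples have the same timestamp")
    return numerator / denominator


# Hypothetical trend export: (hours since start, VM size in MB).
vm_size = [(0, 100), (1, 110), (2, 120), (3, 130)]
print(f"Growth: {slope_per_hour(vm_size):.1f} MB/hour")
```

Applying the same check to the total memory values from the Performance page can help spot growth caused by processes that are not covered by the Task Manager filters.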

score 0 · Answer 2 · 2020-07-01T07:19:57+00:00

This template is also available as part of the SL_SystemHealthCheck protocol package, which is specifically designed to work with the protocols designed to detect memory leaks.

Which parameters should at least be monitored & trended in a Microsoft Platform Element?
