Question

Solved1.40K views16th March 2023database MySQL trending table

2

Ryan Reuss [SLC] [DevOps Member]490 20th April 2022 0 Comments

Hello Dojo,

This specific MYSQL query is from a 17GB trending table and was found in the slow query log file as taking 28min.

2022-04-04 14:29:10.714|161| Database: SLDMADB | time: 1704031ms | query:
DELETE FROM DATAAVG_40701_3217 WHERE
((iStatus < 0 AND dtFirst < ‘2021-04-03 13:36:46’) OR
(iStatus = 5 AND dtFirst < ‘2022-03-04 13:36:46’) OR
(iStatus = 60 AND dtFirst < ‘2021-04-03 13:36:46’) OR
(iStatus = 120 AND dtFirst < ‘2021-04-03 13:36:46’)) AND iId NOT IN (Followed by thousands of parameters here)

This customer had another table that was much larger than this one, but I was not able to find any record of it in these slow query log files. Is there a cut off point where Dataminer will abort a query such as the one above if it takes more than 30 min to complete and any reason this wouldn’t be logged somewhere?

Thank you for any info and kind regards,

Ryan Reuss [SLC] [DevOps Member] Selected answer as best 16th March 2023

1 Answer

score 2 · Answer 1 · 2022-04-21T17:19:36+00:00

Hi Ryan,

As far as I know, any timeouts or slow queries should be logged (SLDatabase[xxx].txt)

The 30-minute query timeout gets applied by DataMiner for MySQL queries and cannot currently be configured differently.

I guess it could very well be that cleaning the larger table is not a problem because the list of “NOT IN” ids is shorter. The slow query is probably more because of the “NOT IN” list than because of the table size.

As for the listed query with the huge “NOT IN” list: in some scenario’s the list of NOT IN values can grow infinitely over time, eventually breaking the cleanup (which then leaves a table that never gets cleaned out)

Reason is that for every trended table cell, DataMiner always keeps at least one point (the oldest) in the database to be able to display stable values properly. This can give problems for rows that only exist temporarily.

If the cleanup fails, a workaround is to make a scheduled task e.g. every month that runs the exact same query but without the NOT IN values. This will delete all the obsolete points and will reduce the list significantly, however small downside is that if you have a point that never changed in a year and actually still is present in the table in Cube this will result in an empty trend graph (or a warning that there is no trending present, depending on your version)

Example of such a query:

DELETE FROM DATAAVG_40701_3217 WHERE
((iStatus < 0 AND dtFirst < ‘2021-04-03 13:36:46’) OR
(iStatus = 5 AND dtFirst < ‘2022-03-04 13:36:46’) OR
(iStatus = 60 AND dtFirst < ‘2021-04-03 13:36:46’) OR
(iStatus = 120 AND dtFirst < ‘2021-04-03 13:36:46’))

(this is no longer a concern on Cassandra as the cleaning logic works differently)

Wouter, am I correct in assuming that when DM is restarted, each and every trended parameter will have an entry added to the database so if DM was restarted say within the last 30 days there would be no risk of any blanks in the database due to deleting the last logged point on a parameter that has not changed since that point in time? Thanks
Hi Jeff, I indeed believe that your assumption is correct and that a new data point will be written on every DMA restart (most likely also on element restart)

MYSQL/Dataminer Slow Query

1 Answer