I am looking at possibly implementing my own MySQL trending table cleaning logic to address some of the shortcomings of the standard DM cleaning logic. One idea would be to delete all the duplicate records (keeping Max(iid)) that are outside the TTL window, and then change the date timestamp of all the remaining entries to, say, TTL + a day. From that point on it would not be necessary to search the entire table contents older than the TTL, just TTL + a day or two. This would significantly reduce the time spent cleaning out the Avg tables where the TTL settings are close to the defaults: 30 days for 5-minute records and a year for hour and day records.
For background: normally, when DM attempts to clean out this table while ensuring it doesn't delete records for datapoints that have not changed, it issues a Select max(iid) for all istatus=5 values older than the TTL (30 days in this example). This requires MySQL to search the entire table contents older than 30 days, and since the table contains a year or more worth of data for the other istatus values, the searches take too long and time out, so data never gets cleaned out.
My question is: would changing the datetime stamp potentially cause issues with either MySQL itself or other DM operations? My concern is that changing the timestamp will logically put the data out of time order with respect to the primary key, the ID.
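As a rough sketch of what I have in mind (the table name, the per-datapoint key column pid, and the date column are placeholders of mine, not the actual DM schema):

```sql
-- Placeholder names throughout; not the actual DM schema.
-- Step 1: outside the TTL window, keep only the newest (MAX(iid)) istatus=5
-- row per datapoint and delete the rest.
DELETE t
FROM dataavg_x t
LEFT JOIN (
    SELECT pid, MAX(iid) AS keep_iid
    FROM dataavg_x
    WHERE istatus = 5
      AND date < NOW() - INTERVAL 30 DAY
    GROUP BY pid
) k ON t.pid = k.pid AND t.iid = k.keep_iid
WHERE t.istatus = 5
  AND t.date < NOW() - INTERVAL 30 DAY
  AND k.keep_iid IS NULL;

-- Step 2: pull the surviving rows' timestamps forward to TTL + 1 day, so
-- future passes only ever need to scan TTL + a day or two of data.
UPDATE dataavg_x
SET date = NOW() - INTERVAL 31 DAY
WHERE istatus = 5
  AND date < NOW() - INTERVAL 31 DAY;
```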
Thanks in advance for any feedback anyone can provide.
Hi Jeff,
To confirm I understand your idea:
with "duplicate max IDs", do you mean all points older than the TTL for a single parameter?
What DataMiner normally does is retrieve all parameters within a certain time window that you request in Cube and try to plot those points. However, we added an extra check in SLNet to drop points that are in the wrong timestamp order, so as long as the points remain in chronological order timestamp-wise, I believe it should work.
However, I believe your concept misses the biggest issue with the Max ID approach: it also keeps using Max IDs for rows that no longer exist. This makes the list grow very big, with a huge number of Max IDs over time, which is very harmful for performance on MySQL. So you definitely need to find a way to get rid of those.
I was thinking more along the lines of creating a scheduled task to delete ALL trend points older than the TTL,
e.g. DELETE FROM dataavg_x WHERE date < NOW() - INTERVAL 1 YEAR
Then the only thing you need to look at is how to "refresh" points in time before the TTL expires; otherwise you would have an empty graph in Cube instead of a flat line.
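For completeness, such a scheduled task could even live inside MySQL itself via the event scheduler (sketch only; the table and column names are placeholders, and the event scheduler must be enabled on the server):

```sql
-- Requires the event scheduler to be on: SET GLOBAL event_scheduler = ON;
CREATE EVENT IF NOT EXISTS purge_old_avg_trend
ON SCHEDULE EVERY 1 DAY
DO
  DELETE FROM dataavg_x
  WHERE date < NOW() - INTERVAL 1 YEAR;
```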
This is mainly your issue:
if you delete the points older than the TTL, nothing will be shown in Cube and you have an "empty" part where you no longer know the value
There are several ways to "refresh" a point; e.g. a DMA restart refreshes the value. The main issue arises if you only have real-time trending: it's very difficult to refresh a value every day if it doesn't change, except with a workaround in the driver. But if you have both RT + avg trending enabled, it will just take the average.
So basically my suggestion is very similar to your idea, except I would not change the timestamp; instead I would "refresh/rewrite" a value, e.g. once a week. My initial suspicion is that this would be less difficult and less heavy than analyzing and updating a value, although I believe your idea would work as well.
As an alternative, you could live with the fact that some older points are gone and no line is shown anymore (it highly depends on the type of parameters you monitor); then you can just delete all older points with a scheduled task.
Marlies, thank you for your response. Yes, I believe you understand my idea and the problem I am looking to solve. It is all associated with the standard DM avg trending DB cleaning logic, which keeps at least one entry outside the TTL date range so graphs do not have blank areas for datapoints that have not changed for an extended period before the current TTL point. In systems that use the default settings, where the TTL for 5-min avg is 30 days and the rest are 1 year, this is really only an issue with istatus=5 datapoints.
As I am sure you are aware, the cleaning logic uses a Select max(iid) where istatus = 5 and the date stamp is older than the 30-day TTL, which requires MySQL to search the entire table for records with a date timestamp more than 30 days old, and it does this every hour. If this search could be limited to records only a day or two older than the TTL (30 days), this would solve the issue. If I could get DM to refresh all the istatus=5 points, say, once a day at midnight (the same effect as restarting DM), then the search could be limited as stated above, but I don't have control over that. So my idea was to achieve basically the same result by just changing the timestamp of all the records that would not get deleted by the Select max(iid) logic.
My understanding is that DM uses different logic to address the blank trend graph case in Cassandra. If that is the case, can you share how that is done? That may give me other ideas and options.
Also, my plan will require me to basically disable the standard DM cleaning logic to keep it from running, because once again I can't control how DM does this, so it would continue to use the max(iid) search logic, resulting in a full database search. I can do this by changing the TTL values for the protocol type to the maximum values, which my tests show keeps DM from issuing this cleaning logic on elements of this protocol type.
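To illustrate the shape of the query I am describing (table and column names here are my assumptions, not the actual DM schema), the standard cleanup search and the bounded variant I am after would look roughly like this:

```sql
-- Rough shape of the standard hourly cleanup search (names assumed):
-- the predicate matches everything older than 30 days, so MySQL also has to
-- wade through up to a year of rows with other istatus values.
SELECT MAX(iid)
FROM dataavg_x
WHERE istatus = 5
  AND date < NOW() - INTERVAL 30 DAY;

-- The bounded variant: only look a day or two past the TTL.
SELECT MAX(iid)
FROM dataavg_x
WHERE istatus = 5
  AND date < NOW() - INTERVAL 30 DAY
  AND date >= NOW() - INTERVAL 32 DAY;
```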
Hi Jeff,
I see that this question has been inactive for a long time. Do you still need assistance with this? If not, could you select the answer to indicate that the question no longer needs follow-up?