Dear Dojo Community members,
In our DMS, we have an SRM booking system whose bookings are stored in an Elastic indexing database. The number of bookings is quite large. As users, we would like to see the total number of bookings, or the number of bookings from a certain start date onward. With GQI, we can retrieve all bookings, but because there are so many of them, this does not perform well. Is there a more performant way to get the information we want?
Can someone help us with this, please?
Hi Cian Hao,
Since DataMiner 10.4.2, the Get Bookings GQI query has been optimized specifically for counting the bookings that match the filter. To benefit from this optimization, the count aggregator should be applied to the ID column of the bookings.
Hi Joachim, I was able to convert the query in the example using the ConvertQueryToProtoJson web method, so I would expect that it can be used in the Data Aggregator. Unfortunately, I wasn’t able to test this.
Hi Cian Hao,
Yes, we at Sphinx Squad can help you with that. We developed and published a package on the Catalogue that consists of an Ad Hoc Data Source that gets the count of the bookings, directly queried on the Elastic database.
The package is available for download or deployment via this link.
An explanation of how to use it can be found in the Readme of the originating Github repository.
A screenshot is inserted inline.
In case you do not know the URL of one of your Elastic nodes, you can find it in C:\Skyline Dataminer\db.xml.
Kind regards,
Joachim
Hi Joachim,
The ad hoc data source seems to read the booking data from Elastic instead of only counting it. I believe the Elasticsearch/OpenSearch connection info is also stored in the low-code app or dashboard.
Is this data source suitable for production use?
Hi Peter,
Indeed, this data source retrieves the booking data by querying Elastic directly. The reason is that we need the data itself to be able to compare the start date and end date of each booking with the ones passed in the input arguments.
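For illustration only, the date comparison described above could look roughly like the Python sketch below. This is not the code from the package: the function name `count_bookings`, the dictionary keys `start` and `end`, and the rule that a booking must fall entirely inside the requested window are all assumptions made for the example.

```python
from datetime import datetime
from typing import Iterable, Mapping, Optional


def count_bookings(
    bookings: Iterable[Mapping[str, datetime]],
    start: Optional[datetime] = None,
    end: Optional[datetime] = None,
) -> int:
    """Count bookings that lie within the optional [start, end] window.

    Hypothetical helper: each booking is assumed to be a mapping with
    'start' and 'end' datetime values. A bound set to None is ignored.
    """
    count = 0
    for booking in bookings:
        # Skip bookings that begin before the requested start date.
        if start is not None and booking["start"] < start:
            continue
        # Skip bookings that finish after the requested end date.
        if end is not None and booking["end"] > end:
            continue
        count += 1
    return count


if __name__ == "__main__":
    # Made-up sample data, only to show how the helper is called.
    sample = [
        {"start": datetime(2023, 12, 30, 10), "end": datetime(2023, 12, 30, 12)},
        {"start": datetime(2024, 1, 5, 9), "end": datetime(2024, 1, 5, 10)},
        {"start": datetime(2024, 2, 1, 8), "end": datetime(2024, 2, 1, 18)},
    ]
    print(count_bookings(sample))
    print(count_bookings(sample, start=datetime(2024, 1, 1)))
```

The point of the sketch is that every booking has to be fetched and parsed before it can be compared, which is why this approach costs more than a pure count.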
However, if the start and end dates are not important, it could be optimized further, as you would only need a count of the index entries instead of also parsing them into Booking objects. This is a less generic use case, though, so we opted not to implement this version in a generic package provided to the community.
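As a rough sketch of that count-only variant: Elasticsearch and OpenSearch expose a `_count` endpoint that returns the number of matching documents without returning the documents themselves. The Python example below only shows the shape of such a request. The index name, the `start` field name, and the node URL are placeholders, not the names DataMiner actually uses, and whether the booking index has a date field that can be used in a range query is an assumption.

```python
import json
from typing import Optional, Tuple
from urllib import request


def build_count_request(
    base_url: str,
    index: str,
    start_field: str = "start",
    since_iso: Optional[str] = None,
) -> Tuple[str, dict]:
    """Build the URL and JSON body for a _count request.

    All names are placeholders: 'index' and 'start_field' must be replaced
    with whatever the booking index really uses.
    """
    url = f"{base_url.rstrip('/')}/{index}/_count"
    if since_iso is None:
        # No date restriction: count every document in the index.
        query = {"match_all": {}}
    else:
        # Assumed date field: count only documents starting at or after since_iso.
        query = {"range": {start_field: {"gte": since_iso}}}
    return url, {"query": query}


def fetch_count(url: str, body: dict, timeout: float = 10.0) -> int:
    """Send the request and return the 'count' value from the response."""
    req = request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with request.urlopen(req, timeout=timeout) as resp:
        return int(json.load(resp)["count"])


if __name__ == "__main__":
    # Only prints the request; call fetch_count(url, body) against a real node.
    url, body = build_count_request(
        "http://localhost:9200", "bookings", since_iso="2024-01-01T00:00:00Z"
    )
    print(url)
    print(json.dumps(body))
```

A secured cluster would also need authentication and TLS settings, which are left out here to keep the sketch short.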
This ad hoc data source is currently used in a production system, where the user wanted to create a dashboard showing the number of bookings created “all time” and since a certain date. This production DMS is connected to an Elastic cluster of three nodes, and there are now approximately 8.5 k bookings in total. The query takes some time but returns a result, which can then be shown on a dashboard.
With GQI, it is indeed possible to use the built-in “Get Bookings” option instead of this ad hoc data source, but on that specific production DMS, that request times out and cannot show a count, whereas this ad hoc data source can.
Kind regards,
Joachim
EDIT: This comment was written before I knew about the optimization introduced in DataMiner 10.4.2, so what it says applies to versions prior to 10.4.2. From 10.4.2 onwards, it is cleaner and more maintainable to use the built-in feature by aggregating on the ID column.
Thank you, Peter, for this clarification. To be honest, we did not know of this useful feature that was released in 10.4.2. Very interesting. It’s something that has its use cases in many production systems.
I also assume it can be automated by converting the GQI query into a query that can be used and scheduled via the Data Aggregator.
Kind regards,
Joachim