Hi Dojo,
I need to develop a connector which reads and writes to an excel file. I can't use Microsoft COM objects because the customer doesn't have Microsoft Office installed. I've tried multiple NuGet packages, but they need licenses to be used commercially.
Does anyone have a suggestion on how to approach this?
If I remember correctly, multiple connectors that had to work with Excel spreadsheets make use of the OpenXml SDK:
https://www.nuget.org/packages/DocumentFormat.OpenXml
https://learn.microsoft.com/en-us/office/open-xml/open-xml-sdk
https://www.c-sharpcorner.com/article/creating-excel-file-using-openxml/
Not a programing solution but if it suits your use-case you could instead read/write in the OpenDocument Spreadsheet (.ods) format using the OpenOffice SDK which will not require a separate license. Excel can be used to open documents of this format.
Hi Benjamin
I have previously used NPOI in an automation script to create a basic excel. In DIS internally I've also used NPOI for excel. This is a public NuGet package that you can then use. I've never had to read an excel, so is something you'll need to double check, but writing should not give any issues.
Thank you, Michiel. I will certainly have a look at this.
I'm not sure if this is the most straight forward solution, but an xlsx file is nothing more than a zip of different things.
Try to rename your existing xlsx file to a .zip and unzip it. Then check how the content is configured. Probably at some kind you will come across a more normal file (likely an xml) which you can write to with something like a StreamWriter, then rezip the archive and rename it to get it back to a valid xlsx.
Again, not sure if this is the easiest solution given the limitations you have, but it's one of the most basic ways.
Thank you for the answer, Toon. I will probably not do this manually, as the solution mentioned by Pedro already manipulates the XML files which are extracted from the .xlsx file.
That’s why Pedro is the expert 😉
Firstly I would say, do you really have to? What is the use case? Is this the best solution? Just being cautious, because personally I'm not a big fan of that as sometimes this can be a bit of a rabbit hole that you go down into (i.e. an excel sheet is not a database, and it is designed for humans to play around in).
Also not sure if you are referring to writing into an existing excel file with existing content, or being able to deliver data into a file that can be consumed with Excel. Because if the latter would the case, I would simply generate a CSV file with the data. That will open up just fine in Excel and people can then take that forward and do what they want to do.
Hello Ben, in our use case, we’re separating data from a rather large json object into 2 parts. One part is being frequently updated and polled so we left it in the json object. The data which is “static” (kind of) is moved to a excel file and polled less frequently, where a dataminer table and xlsx file are synchronized (changes in one reflect on the other), hence the need for writes to the file.
CSV file isn’t very readable for that purpose, so that’s why we chose excel.
Hi Benjamin – thank you for that clarification and the context of this question. I might still be missing some of the finer details, but it sounds like you are dealing with a data source which is too large to throw all the data into the element. And therefore you want to take a portion of that and store it separately (and currently you are considering an Excel file for that). Some, a portion of that, you still want to synch to a table in the element so that it is available in that element. And you would choose for an Excel file (and probably store it in the docs section of the element), probably because you want the user to have the option to see the full data set if he wants. If I understood everything correctly.
Based on that, I’m wondering if you should consider the use of a Logger Table (https://docs.dataminer.services/develop/devguide/Connector/AdvancedLoggerTables.html). This allows you to store very large amounts of data straight into the data storage of DataMiner and not hold it in memory. This sounds to me like a more robust solution as compared to storing it in a file. You would then also have the option to read back certain data from that, and use it again in the driver (either raw records or aggregated data derived from the raw data, e.g. just to name an example a count of the number of lines in there). There is also a native UI in your element that is available if you have stored data in a logger table, which allows the user to interact with that data (e.g. perform a search for specific records).
Have you looked into a Logger Table as a possible solution?
Yes, you’re right about everything, except for the data source being very large. We have around 800 rows in the table and that number is persistent. I did not know about logger tables, though. I will discuss it with my TAM. Thank you.
Hello Benjamin,
Could OleDBConnection be of any use? I've dealt with Excel files in the past using it and it suited the purpose. Not sure if it would work without having office installed.
If you need an example for this, just answer this one and I can look something up 🙂
Hi João and thank you, I’ve looked it up and it doesn’t require Office being installed. But it wouldn’t suit me well since it isn’t convenient to write queries to update only a couple of cells.
In case someone comes across the same question, I'll leave an update. I ended up using OpenXML for my case. It is well documented by Microsoft itself.
Although it's a bit of a hassle to manipulate data inside the excel, as the package uses internal structure of the excel. For example: cells can have multiple value types (integer, string, date, datetime, etc.), but a particularly interesting one is shared string, where the index of the cell from shared strings table is the actual cell value (e.g. cell A3 has the value 12, where 12 represents 12th entry in the shared strings table).
To modify a cell of type shared string, you need to browse the shared strings table manually to find the value you want to enter, and add it to the end of the table if it doesn't exist.
Besides that, the package is straightforward and easy to use.
Thank you, Pedro. I’m attempting to integrate this right now. I’ll come back with the results 🙂