Question

Solved · 1.79K views · 25th January 2024 · binary HTTP upload

3

Jelle Hoorne [SLC] [DevOps Member] 242 · 24th January 2024 · 0 Comments

I’m trying to upload a binary (mp3) file using an HTTP POST session.

The file is read using File.ReadAllBytes, and the resulting bytes are set on the session data parameter using protocol.SetParameterBinary.

To test this upload, I’ve created a dummy mp3 file containing some simple binary content (hex 02 01 00 02 00). Using this file, I can easily perform an upload via the web interface and compare the request payload with the payload sent via protocol communication.

The request via the Web Interface gives the expected result:

However, when performing the same call via the protocol, the payload data is completely different and I’m even losing a byte along the way.

I’m certain the protocol.SetParameterBinary call works correctly and that my session content parameter is filled in with the correct binary data (I’ve compared the byte count after the Set and after the Get, and the result was identical: 5 bytes).

I have a feeling SLPort is changing the content somehow, possibly by adding some extra encoding to the data. It would be nice if someone could confirm this.

I’ve implemented and performed the same HTTP upload request directly from a QAction (bypassing SLPort) and everything works perfectly, which only strengthens my SLPort suspicion.

Am I missing something, or can somebody confirm the above?

Jelle Hoorne [SLC] [DevOps Member] Selected answer as best 25th January 2024

2 Answers

2

Tom Waterbley [SLC] [DevOps Catalyst] 9.42K · Posted 25th January 2024 · 5 Comments

Hi Jelle,

When the Content-Type header doesn’t contain a charset, the data is automatically encoded as UTF-8. In this case that’s not desired, because you want to send the raw binary data without any conversion. This can be achieved by adding a charset to the Content-Type header.

Example:

<Header key="Content-Type">text/plain; charset=ISO-8859-1</Header>

More details can be found here: https://docs.dataminer.services/develop/schemadoc/Protocol/Protocol.HTTP.Session.Connection.Request.Headers.Header.html#remarks
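
To illustrate in general terms why a text-encoding step is unsafe for binary payloads, here is a minimal Python sketch. It is not a reproduction of what SLPort does internally (the thread does not describe that conversion in detail). The helper names and the `mp3_like` bytes are hypothetical and chosen only for illustration.

```python
# Illustrative only: this does NOT reproduce SLPort's internal conversion.
# It shows the general effect of pushing raw bytes through a text encoding.

def reencode_as_utf8(raw: bytes) -> bytes:
    """Treat each byte as one character (ISO-8859-1), then encode as UTF-8."""
    return raw.decode("iso-8859-1").encode("utf-8")


def roundtrip_latin1(raw: bytes) -> bytes:
    """Treat each byte as one character and write it back with the same charset."""
    return raw.decode("iso-8859-1").encode("iso-8859-1")


# Hypothetical payload containing bytes with the high bit set.
mp3_like = bytes([0xFF, 0xFB, 0x90, 0x00])

print(reencode_as_utf8(mp3_like).hex(" "))  # c3 bf c3 bb c2 90 00
print(roundtrip_latin1(mp3_like).hex(" "))  # ff fb 90 00
```

Every byte of 0x80 or higher grows into two bytes under the UTF-8 step, while the single-byte charset round trip leaves all 256 byte values untouched. The five test bytes from the question are all below 0x80, so this particular sketch would leave them unchanged; the corruption observed there must come from a different conversion.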

Jeroen Neyt [SLC] [DevOps Advocate] Posted new comment 25th January 2024

Jelle Hoorne [SLC] [DevOps Member] commented 25th January 2024

Hi Tom, awesome, this works!
However, based on the release note, does this mean that it doesn’t matter which charset I provide? As long as it’s different from UTF-8, SLPort will not do any conversion?

Tom Waterbley [SLC] [DevOps Catalyst] commented 25th January 2024

Glad to hear that it did the trick. I think you can indeed specify any charset to disable the conversion.

Laurens Moutton [SLC] [DevOps Enabler] commented 25th January 2024

Indeed, looking at the source code, I can see that no conversion is done when a charset is specified and it is not equal to “utf-8”. The only tricky part is how the receiving side handles this, as we’re indicating that a charset is being used while the payload actually consists of mp3 bytes. Glad to see that it is working in your case.

Tom Waterbley [SLC] [DevOps Catalyst] commented 25th January 2024

It indeed doesn’t really make sense in this case and could potentially cause problems on the receiving side. It would be better to have an additional flag to indicate that the data should be sent as is.

Jeroen Neyt [SLC] [DevOps Advocate] commented 25th January 2024

The IANA registration for the octet-stream subtype (https://www.iana.org/assignments/media-types/application/octet-stream) does not define a charset parameter. That makes it a lot less likely that web servers will interpret it.

I agree though that it looks odd to define it as such but to this day it is the only workaround to avoid conversion on the payload.
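
To get a feel for what a receiver sees, here is a small Python sketch that parses such a header with the standard library’s `email.message.Message`. The header value is an assumption: it combines the octet-stream subtype mentioned above with the charset from the earlier example. The `parse_content_type` helper is hypothetical. Whether a given web server acts on the charset is server-specific; the sketch only shows that the parameter is present and can be read separately from the media type.

```python
from email.message import Message


def parse_content_type(value: str):
    """Split a Content-Type header value into media type and charset parameter."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_type() returns the lowercased type/subtype;
    # get_param() returns the raw parameter value, or None if it is absent.
    return msg.get_content_type(), msg.get_param("charset")


# Assumed header value, for illustration only.
media_type, charset = parse_content_type(
    "application/octet-stream; charset=ISO-8859-1"
)
print(media_type)  # application/octet-stream
print(charset)     # ISO-8859-1
```

A receiver that dispatches on the media type alone would treat the body as opaque bytes and never look at the charset.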


score 0 · Answer 1 · 2024-01-25T09:18:57+00:00

Hi Jelle,

Internally there are two types of parameters: a double or a string. In this case a string is being used and is interpreted as such. The bytes that are sent are treated as characters, and that can cause “translation” problems when converting back and forth between (Unicode) character encodings and their bytes, which is probably what you’re seeing here. One thing you could try is making your protocol a Unicode protocol, but this is a long shot and there is no guarantee that it will work.

Regards,

Hi Laurens, thanks for the feedback. Unfortunately the protocol is already defined as ‘unicode’.

HTTP binary data upload
