We have a requirement to increase the MTU size on some of the DMAs in our cluster to support jumbo frames. This means that these agents will run with an MTU of 9000, compared to the standard 1500 on the other agents in the cluster.
However, we have concerns about what this could mean for DataMiner: since we only have one NIC team for these agents, the same NIC will carry both the jumbo frame traffic and the general SLNet communication, etc.
Questions:
- Are there any known issues having agents communicating over SLNet to each other over differing MTU sizes?
- Are there any known issues for other, less obvious, SL Processes due to higher MTU (thinking about SLDataGateway with more data being pulled at once for trending, for example)?
- Any other information we should be aware of when modifying MTU size in a cluster (whether single agent or all agents)?
Edit: Bonus Question: We have one Staging environment where this configuration is already partially in place. Whilst it doesn't have the same data rate as our Production environment (and is a smaller DMS), is there anything in particular I should be monitoring as an overall "health check"?
Hi Jack,
I don't believe this setting will have any impact on the DataMiner software. MTU is a setting at the network interface level (layer 3 and below), and our software operates in the higher layers. SLNet and SLDataGateway do indeed have connections to other machines, but the OS should handle the TCP/IP stack correctly with the settings you defined.
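To illustrate why mixed MTU sizes are normally transparent for TCP connections: each side advertises a maximum segment size (MSS) derived from its own MTU during the handshake, and neither side sends segments larger than what the peer advertised. The sketch below is only the arithmetic, assuming IPv4 and TCP headers of 20 bytes each without options; the function names are made up for illustration and are not part of DataMiner.

```python
# Rough MSS arithmetic for two hosts with different MTUs.
# Assumption: IPv4 and TCP headers without options (20 bytes each).
IPV4_HEADER = 20
TCP_HEADER = 20


def advertised_mss(mtu: int) -> int:
    """MSS a host would typically advertise for a given interface MTU."""
    return mtu - IPV4_HEADER - TCP_HEADER


def effective_mss(mtu_a: int, mtu_b: int) -> int:
    """Largest TCP segment both endpoints will accept from each other."""
    return min(advertised_mss(mtu_a), advertised_mss(mtu_b))


print(effective_mss(9000, 1500))  # jumbo agent <-> standard agent: 1460
print(effective_mss(9000, 9000))  # jumbo agent <-> jumbo agent: 8960
```

Note that this only covers the two endpoints. If a switch or router in between has a smaller MTU than both agents, the connection relies on Path MTU Discovery, which needs ICMP "fragmentation needed" messages to get through. UDP traffic has no MSS negotiation at all.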
The only process I would potentially be worried about is SLPort. This process goes quite low level in terms of communication and the packets going in and out. Then again, it uses the standard TCP or UDP connections of the OS, so I think this will also be transparent for SLPort.
In other words, there are no known issues, and we do not expect any issues when a higher MTU is defined. This setting only defines the maximum size of the packets on the wire, and the OS should handle this correctly. If this runs fine in your staging environment (connectivity is fine, all data comes in, there are no timeouts, etc.), then I believe this should be fine in production as well.
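As a possible connectivity check for staging, you could verify that full-size packets actually make it between agents without fragmentation. The sketch below only builds the Windows `ping` commands (`-f` sets Don't Fragment, `-l` sets the payload size, `-n` the number of echo requests); it does not run them. It assumes IPv4, and the host names are placeholders, not real agents.

```python
# Build "Don't Fragment" ping commands sized to exactly fill a given MTU.
# Assumption: IPv4, so 20 bytes of IP header + 8 bytes of ICMP echo header.
ICMP_OVERHEAD = 28


def df_ping_command(host: str, mtu: int, count: int = 2) -> list[str]:
    """Windows ping command whose packet exactly fills `mtu` bytes."""
    payload = mtu - ICMP_OVERHEAD
    return ["ping", "-f", "-l", str(payload), "-n", str(count), host]


# Hypothetical agent names, for illustration only.
checks = [
    ("dma-jumbo-02", 9000),     # jumbo agent -> jumbo agent
    ("dma-standard-01", 1500),  # jumbo agent -> standard agent
]

for host, mtu in checks:
    print(" ".join(df_ping_command(host, mtu)))
```

If the 8972-byte ping between two jumbo agents fails while a 1472-byte one succeeds, something on the path (NIC team, switch port, VLAN) is likely not passing jumbo frames end to end. The MTU currently applied per interface can be checked with `netsh interface ipv4 show subinterfaces`.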
Bert