For example, in this setup, NetFlow data is published to a Kafka topic, with the average traffic throughput for 70 devices being around 150 MB/s. On the DataMiner collector side, are there any limitations to be aware of?
Hello Bruno,
There are a couple of factors that can affect Kafka consumption speed:
- The most obvious one is the network speed of the server.
- The number of brokers associated with the Kafka topic configured on the producer side. The more brokers there are, the higher the throughput that is possible.
- The number of partitions in the Kafka topic. You can create multiple consumers that read from the same topic using the same group ID, which load-balances the consumption process. However, the number of consumers that can hold concurrent connections is limited by the number of partitions the producer has configured for the topic.
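The partition limit in the last point can be illustrated with a small sketch. This mimics round-robin partition assignment within a consumer group; it is not Kafka's actual assignor implementation, just a model of the rule that extra consumers beyond the partition count receive nothing:

```python
# Sketch: consumer-group load balancing is bounded by partition count.
# Round-robin model only; Kafka's real assignors are more sophisticated.

def assign_partitions(num_partitions, consumer_ids):
    """Distribute partitions across consumers sharing one group ID."""
    assignment = {cid: [] for cid in consumer_ids}
    for p in range(num_partitions):
        cid = consumer_ids[p % len(consumer_ids)]
        assignment[cid].append(p)
    return assignment

# With 4 partitions and 6 consumers, only 4 consumers get work;
# the remaining 2 hold no partitions and sit idle.
result = assign_partitions(4, [f"consumer-{i}" for i in range(6)])
idle = [cid for cid, parts in result.items() if not parts]
print(idle)  # → ['consumer-4', 'consumer-5']
```

In other words, adding consumers beyond the topic's partition count buys you nothing; the partition count set by the producer is the ceiling on parallelism.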
In a current Kafka consumer setup, we are consuming 7 MB/s. This is with a single consumer, and the Kafka topic has one broker.
We have not found any limitations on DataMiner's side, since all it is doing is opening a stream, retrieving the data, and exporting it to a file (if using the Generic Kafka Consumer). The only limitations I'm aware of, and the ones with the largest impact, are those listed above; reaching a 150 MB/s throughput would require tuning on both the consumer and the producer side.
Thanks, Gabriel. From a DataMiner perspective, what would I need to do to cope with a higher throughput, like the 150 MB/s in the example above?