Hi everyone,
I’m testing the Kafka Consumer protocol (Generic Kafka Consumer) across two DataMiner agents with identical configuration parameters, both meant to consume messages from an AWS MSK (Kafka) cluster and store the consumed JSON data in a local directory.
However, I’m facing a strange issue:
On DMA #1, the consumer connects to the Kafka brokers successfully, consumes messages, and writes the JSON files as expected.
On DMA #2, with the same configuration, it continuously logs:
```
[thrd:main]: Cluster connection already in progress: coordinator query
[thrd:main]: Not selecting any broker for cluster connection: still suppressed: no cluster connection
Error: sasl_ssl://b-1.kafkaqa...:9096/bootstrap: Connect to ipv4#10.xxx.xxx.xxx:9096 failed: Unknown error (after 21043ms in state CONNECT)
```
It never reaches a connected state or produces any JSON output file.
We checked the basics:
DNS resolution works, and we can ping the broker IPs directly from the DMA's command prompt (though ping only proves ICMP reachability; a quick TCP/TLS check is sketched after this list).
Both DMAs are using SASL/SCRAM over SSL on port 9096.
The same credentials and topic are used, and the MSK brokers are reachable from other clients (such as Lambda).
Both elements point to the same dataminer-protocol-feature-tests topic, same directory structure for file output, and identical protocol parameters.
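To rule out plain network or TLS problems outside of DataMiner, this is roughly the check I plan to run on the failing DMA host. It is only a sketch using the Python standard library; the broker hostname is a placeholder, and it assumes the host's trust store can validate the certificate chain the MSK listener presents.

```python
import socket
import ssl

# Placeholder broker endpoint - replace with the real MSK bootstrap broker.
HOST = "b-1.kafkaqa.example.amazonaws.com"
PORT = 9096

# Step 1: plain TCP connect - proves the port is open (ping/ICMP does not).
with socket.create_connection((HOST, PORT), timeout=10) as sock:
    print("TCP connect OK")

    # Step 2: TLS handshake - proves certificates and protocol versions line up.
    context = ssl.create_default_context()
    with context.wrap_socket(sock, server_hostname=HOST) as tls:
        print("TLS handshake OK:", tls.version())
        print("Broker certificate subject:", tls.getpeercert().get("subject"))
```

If step 1 fails, it points at a firewall/security-group issue on that host's path; if step 2 fails, it points at a certificate or trust-store problem local to the non-working DMA.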
What I’d like to understand
Is there any additional TCP or SSL requirement for Kafka connections beyond ICMP reachability (ping)?
Could there be a Windows certificate store or local SSL dependency that's missing or outdated on the non-working DMA?
Are there specific librdkafka configuration options or certificates that must be present per hosting agent for SASL_SSL connections to succeed?
Is there a recommended diagnostic log level or Kafka debug flag within the protocol to trace SSL/TLS handshake issues? (A minimal standalone consumer with debug flags enabled is sketched below.)
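For reference, this is the kind of minimal standalone consumer I intend to run on the failing host to capture librdkafka's own connection/handshake trace, independent of the protocol. It uses the confluent-kafka Python wrapper around librdkafka; the bootstrap address, credentials, and group id are placeholders, and the `debug` categories are standard librdkafka ones.

```python
from confluent_kafka import Consumer

# Placeholder connection details - replace with the real MSK values.
conf = {
    "bootstrap.servers": "b-1.kafkaqa.example.amazonaws.com:9096",
    "security.protocol": "SASL_SSL",
    "sasl.mechanisms": "SCRAM-SHA-512",
    "sasl.username": "my-user",
    "sasl.password": "my-password",
    "group.id": "connectivity-test",
    "auto.offset.reset": "earliest",
    # librdkafka debug categories that trace connection setup and the
    # SASL/TLS handshake; the trace goes to stderr by default.
    "debug": "broker,security,protocol",
    # Sometimes needed on hosts whose trust store is incomplete:
    # "ssl.ca.location": "C:/path/to/ca-bundle.pem",
}

consumer = Consumer(conf)
consumer.subscribe(["dataminer-protocol-feature-tests"])

try:
    for _ in range(30):  # poll for roughly 30 seconds, then give up
        msg = consumer.poll(1.0)
        if msg is None:
            continue
        if msg.error():
            print("Consumer error:", msg.error())
            continue
        print("Received:", msg.value()[:200])
        break
finally:
    consumer.close()
```

If this standalone consumer also fails on DMA #2 but works on DMA #1, the problem is host-level (network path, certificates, OpenSSL/trust store) rather than anything in the protocol or element configuration.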
Environment summary
Kafka cluster: AWS MSK (SASL/SCRAM-SHA-512, SSL, port 9096)
Protocol: Generic Kafka Consumer / Custom Skyline Kafka Consumer
DM version: (add your version, e.g. 10.4.0.0)
Hosting: Two DMAs (same version), different Windows hosts
Behavior: Works perfectly on one DMA; fails to connect on another
Any insight on what might cause this “connect failed / no cluster connection” behavior when ping and DNS are fine would be greatly appreciated!