EMQX v5: Facing error 'Server closed connection without DISCONNECT' on scaling to 3000 connections

Hi everyone,

I am using an Android-based client to publish to an EMQX broker. Details on both:
Android Client: HiveMQ v1.3.0
Broker: EMQX v5.0.9 (3 core nodes, 0 replica nodes)

After scaling to almost 3000 connections, I started seeing the following error: com.hivemq.client.mqtt.exceptions.ConnectionClosedException: Server closed connection without DISCONNECT. Does this mean the current EMQX setup cannot support more connections and we need to scale it (i.e. add replica nodes)? Or is there some other issue here?

Would be great if someone can help here. Thanks in advance.

Hi. To help us discuss this issue, could you share the hardware specifications of the server running EMQX?

In most cases, 3,000 connections is far below the limit of EMQX.

Could you also provide the background log of EMQX? It is worth checking your current max file descriptor limit as well; I suspect you may be hitting it.

Hardware: a K8s cluster of three pods (without any hard limit on CPU and memory), running on r5.4xlarge machines (16 CPU cores, 128 GB memory).
I also checked the FD limit: it is 1048576, which I think should be sufficient.
Regarding the background log, did you mean the EMQX server logs?

Hi,

Unless you’re dealing with a really pathological traffic pattern (e.g. all clients subscribe to # and simultaneously publish messages to some topic every millisecond, so input traffic is amplified 3000 times), 3000 connections is really nothing for EMQX.

Things to check:

  1. emqx.log
  2. Load balancer and ingress controller: is there any limit set there?
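
To put the pathological pattern described above in numbers, here is a rough back-of-the-envelope sketch. It assumes every one of the 3000 clients publishes once per millisecond and that every publish is delivered to all 3000 clients because they all subscribe to #. The class and method names are made up for illustration and are not part of EMQX or any client library.

```java
final class FanOutEstimate {
    // Deliveries per second when each of `clients` publishes at the given rate
    // and every publish is delivered to `subscribers` matching subscriptions.
    static long deliveriesPerSecond(long clients, long publishesPerClientPerSecond, long subscribers) {
        long inbound = Math.multiplyExact(clients, publishesPerClientPerSecond);
        return Math.multiplyExact(inbound, subscribers);
    }

    public static void main(String[] args) {
        long clients = 3000;       // connection count from this thread
        long ratePerClient = 1000; // assumption: one publish per millisecond
        System.out.println("inbound msg/s:  " + Math.multiplyExact(clients, ratePerClient));
        System.out.println("outbound msg/s: " + deliveriesPerSecond(clients, ratePerClient, clients));
    }
}
```

Under those assumptions the broker would receive 3 million messages per second and have to deliver 9 billion per second, which is why such a pattern is a problem regardless of connection count. An ordinary publish workload at 3000 connections is nowhere near that.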

Thanks for the reply.
I am getting a lot of timeout errors in the EMQX logs:

mfa: emqx_connection:terminate/2, msg: terminate, peername: <ip>:<port>, reason: {shutdown,keepalive_timeout}

We are using the HiveMQ client with a keep-alive of 60 seconds, and server-side keep-alive is disabled.
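
For context on the keepalive_timeout reason: under the MQTT specification, when a client connects with a non-zero Keep Alive, the server closes the connection if it receives no control packet from that client within one and a half times the Keep Alive interval. The sketch below only works out that deadline for the 60 second value mentioned here. The 1.5 factor is the specification's rule; the factor a given EMQX deployment actually applies is configurable, so treat it as an assumption. The class and method names are hypothetical.

```java
import java.time.Duration;

final class KeepAliveDeadline {
    // Spec rule: the server drops the connection after 1.5x the Keep Alive
    // interval without any control packet (PINGREQ, PUBLISH, ...) from the client.
    static Duration serverTimeout(int keepAliveSeconds) {
        if (keepAliveSeconds < 1 || keepAliveSeconds > 65535) {
            throw new IllegalArgumentException("Keep Alive must be 1..65535 seconds");
        }
        return Duration.ofMillis(keepAliveSeconds * 1500L);
    }

    // A Keep Alive of 0 turns the mechanism off, so the connection never times out this way.
    static boolean wouldTimeOut(int keepAliveSeconds, Duration silence) {
        if (keepAliveSeconds == 0) {
            return false;
        }
        return silence.compareTo(serverTimeout(keepAliveSeconds)) > 0;
    }

    public static void main(String[] args) {
        int keepAlive = 60; // value from this thread
        System.out.println("deadline (s): " + serverTimeout(keepAlive).getSeconds());
        System.out.println("silent for 95 s times out: "
                + wouldTimeOut(keepAlive, Duration.ofSeconds(95)));
    }
}
```

With a 60 second keep-alive the deadline is 90 seconds, so a keepalive_timeout in the log suggests the broker saw nothing from that client for longer than that. It may be worth checking whether the client's PINGREQ packets actually reach the broker in time, for example through the load balancer mentioned earlier.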