What is MQTT LWT behavior depending on how TCP disconnects?

I’m noticing some unexpected behavior on our EMQX broker (5.0.19) and am looking for some clarification. We have a large number of MQTT 3.1.1 clients that make use of last will and testament (LWT) messages. These are the messages a client sets up when it connects, and the broker is supposed to send them automatically when the client disconnects unexpectedly.

We’re seeing many cases where clients lose their connection and reconnect, and the LWT from the prior connection is not sent. It’s very likely that these particular connections are being killed silently by a network device in the path (e.g. a NAT or load balancer) that does not send a TCP reset or close to either the client or the broker when it drops them.

On an older (3.x) EMQX broker, we always saw the LWT messages from these same clients over the same network paths.

My question is what exactly triggers the LWT to be sent? Can it be sent for a new connection re-using the same client ID and forcing the old one offline? Is it only triggered by reception of a TCP reset or close flag? What is the most likely reason we’re not seeing LWT messages for clients that are reconnecting frequently? Is there some way we can get the old behavior back and have them sent on every reconnect?
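
For reference, here is a rough sketch of how I read the MQTT 3.1.1 rules on this (section 3.1.2.5 for the will, section 3.1.2.10 for keep alive). It is a simplified model of the spec text, not EMQX's implementation, and every name in it (`CloseReason`, `will_should_publish`, `keepalive_deadline`) is made up for illustration. The takeover case in particular is my interpretation: the spec says the broker must disconnect the existing client, and the will must be published when the network connection closes unless a DISCONNECT packet was received first.

```python
from enum import Enum, auto
from typing import Optional


class CloseReason(Enum):
    """Hypothetical labels for why the broker closed a connection."""
    CLIENT_DISCONNECT_PACKET = auto()  # client sent DISCONNECT before closing
    IO_ERROR = auto()                  # TCP reset/close or other network failure
    KEEPALIVE_TIMEOUT = auto()         # nothing received within the keep-alive window
    PROTOCOL_ERROR = auto()            # broker closed due to a malformed packet
    SESSION_TAKEOVER = auto()          # new CONNECT arrived with the same client ID


def will_should_publish(will_flag: bool, reason: CloseReason) -> bool:
    """Model of the 3.1.1 rule: publish the will on any close of the
    network connection, unless a DISCONNECT packet deleted it first."""
    if not will_flag:
        return False
    return reason is not CloseReason.CLIENT_DISCONNECT_PACKET


def keepalive_deadline(keep_alive_s: int) -> Optional[float]:
    """Seconds of silence after which the broker must treat the client
    as gone (1.5 x keep alive). A keep alive of 0 disables the check."""
    if keep_alive_s == 0:
        return None
    return keep_alive_s * 1.5
```

If a middlebox drops the connection silently, the broker never sees a reset or close, so under this model the only triggers left are the keep-alive timeout and the takeover by the reconnecting client.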

I eventually dug into the MQTT specs to find the relevant bits, decided EMQX was in the wrong, and opened an issue on GitHub:
