Monitoring metrics for EMQX on AWS

I’m running an EMQX 4.4.4 cluster on AWS ECS Fargate.
There were 8 instances, each with 2 vCPU and 4 GB of memory.

A few days ago, some MQTT connections kept getting disconnected, and some messages were not delivered.

The problem was resolved by increasing the number of instances from 8 to 10.

My question is this:
There were no unusual metrics in AWS CloudWatch. Average CPU usage was below 20%, and average memory usage was below 50%.
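
To illustrate why I'm not sure the averages tell the whole story, here is a minimal Python sketch. The instance names and CPU values are made up for illustration and are not my real CloudWatch data; it only shows that a cluster-wide average under 20% can coexist with one busy instance.

```python
from statistics import mean


def summarize(cpu_by_instance):
    """Return (average, maximum) CPU percent across all instances."""
    values = list(cpu_by_instance.values())
    return mean(values), max(values)


# Hypothetical per-instance CPU percentages (illustrative only, not measured).
samples = {f"emqx-{i}": 10 for i in range(1, 8)}  # seven mostly idle instances
samples["emqx-8"] = 80                            # one busy instance

avg, peak = summarize(samples)
print(f"average={avg:.2f}% max={peak}%")
```

With these made-up numbers the average stays below 20% even though one instance is at 80%, so I suspect per-instance maximums might be more useful than the average I was looking at.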

Which metrics should I monitor to decide when to increase (or decrease) the number of instances?

Or would it be better to increase the vCPU and memory per instance and run fewer instances?
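
For reference, the total capacity of each layout can be sketched as below. The 8 and 10 instance figures are from my setup; the 5 instances with 4 vCPU and 8 GB each is only a hypothetical example of "fewer, larger instances" that I have not tested.

```python
def total_capacity(instances, vcpu_each, mem_gb_each):
    """Return (total vCPU, total memory in GB) for a cluster layout."""
    return instances * vcpu_each, instances * mem_gb_each


before = total_capacity(8, 2, 4)         # original cluster
after = total_capacity(10, 2, 4)         # after scaling out
fewer_larger = total_capacity(5, 4, 8)   # hypothetical alternative layout

for name, (vcpu, mem) in [("before", before),
                          ("after", after),
                          ("fewer_larger", fewer_larger)]:
    print(f"{name}: {vcpu} vCPU, {mem} GB")
```

The hypothetical layout has the same totals as the 10-instance cluster, so what I'm really asking is whether EMQX behaves better with many small nodes or a few large ones at the same total capacity.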