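I set up a cluster with three nodes,
deployed behind an A10 load balancer.
The build was completed on May 26 and ran normally.
Starting in July, I noticed that
one of the three nodes would lose a large number of connections.
At first I thought a single node had a problem,
but later I found that all three nodes take turns losing large numbers of client connections.
For example, on NODE1 the connection count was 30,000 at 2022-07-21 01:00:00 and had dropped to 700 by 2022-07-21 01:03.
Although the connections are immediately redistributed to the other nodes, I would like to understand why this happens, especially since it recurs every few days.
Sometimes it is NODE1, sometimes NODE2.
There is no crash log.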
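One way to line up a connection drop with what the node logged is to count log lines per minute and per level. Below is a rough Python sketch, assuming each log entry starts with a `YYYY-MM-DD HH:MM:SS.mmm [level]` prefix as in the erlang.log excerpt in this post. The function name `tally_per_minute` and the shortened sample lines are only illustrative, not part of any tool.

```python
import re
from collections import Counter

# Assumed entry prefix, e.g.: 2022-07-21 00:46:14.120 [error] ...
LINE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2}\.\d{3} \[(\w+)\]")


def tally_per_minute(lines):
    """Count log entries per (minute, level).

    Continuation lines without a timestamp prefix (for example the
    process info that follows a SYSMON warning) are skipped.
    """
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line.rstrip("\r\n"))
        if match:
            counts[(match.group(1), match.group(2))] += 1
    return counts


if __name__ == "__main__":
    # Shortened stand-ins for entries like the ones in this post.
    sample = [
        "2022-07-21 00:46:14.120 [error] ... [CM] Failed to discard ...",
        "2022-07-21 00:46:14.121 [error] ... [CM] Failed to discard ...",
        "2022-07-21 01:00:25.528 [warning] [SYSMON] long_schedule warning: ...",
        "{in,{gen,do_call,4}},",
    ]
    for (minute, level), n in sorted(tally_per_minute(sample).items()):
        print(minute, level, n)
```

Running it over the full log file (for instance by passing `open(path)` as `lines`) would show whether the error bursts and the SYSMON warnings cluster around the minutes when the connection count falls.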
The erlang.log output is as follows:
808379539B">>@118.150.186.188:46046 [CM] Failed to discard <44195.15772.2200>: {'EXIT',{noproc,{gen_server,call,[<44195.15772.2200>,discard,infinity]}}}
2022-07-21 00:46:14.120 [error] <<"****">>@&&&&:36502 [CM] Failed to discard <44195.22456.2173>: {'EXIT',{noproc,{gen_server,call,[<44195.22456.2173>,discard,infinity]}}}
2022-07-21 00:46:14.121 [error] <<"&&&&&">>@&&&:46326 [CM] Failed to discard <44195.4094.2382>: {'EXIT',{noproc,{gen_server,call,[<44195.4094.2382>,discard,infinity]}}}
2022-07-21 00:46:14.122 [error] <<"&&&">>@&&&:34116 [CM] Failed to discard <44195.26341.2382>: {'EXIT',{noproc,{gen_server,call,[<44195.26341.2382>,discard,infinity]}}}
2022-07-21 00:46:14.123 [error] <<"&&&&">>@&&:41760 [CM] Failed to discard <44195.19416.2381>: {'EXIT',{noproc,{gen_server,call,[<44195.19416.2381>,discard,infinity]}}}
2022-07-21 00:46:14.124 [error] <<"&&&">>@&&&:46482 [CM] Failed to discard <44195.2464.2383>: {'EXIT',{noproc,{gen_server,call,[<44195.2464.2383>,discard,infinity]}}}
2022-07-21 00:46:14.125 [error] <<"&">>@&&&&:46670 [CM] Failed to discard <44195.25227.2381>: {'EXIT',{noproc,{gen_server,call,[<44195.25227.2381>,discard,infinity]}}}
2022-07-21 01:00:25.528 [warning] [SYSMON] long_schedule warning: pid = <0.5242.4734>, info: [{timeout,276},
{in,{gen,do_call,4}},
{out,{gen,do_call,4}}]
[{initial_call,{proc_lib,init_p,5}},
{current_function,{gen,do_call,4}},
{registered_name,[]},
{status,running},
{message_queue_len,0},
{group_leader,<0.2201.0>},
{priority,normal},
{trap_exit,false},
{reductions,4743},
{last_calls,false},
{catchlevel,3},
{trace,0},
{suspending,[]},
{sequential_trace_token,[]},
{error_handler,error_handler},
{memory,68244},
{total_heap_size,8370},
{heap_size,4185},
{stack_size,50},
{min_heap_size,233}]
[os_mon] cpu supervisor port (cpu_sup): Erlang has closed
2022-07-21 01:00:33.590 [warning] [SYSMON] long_schedule warning: pid = <0.12685.4732>, info: [{timeout,258},
{in,
{inet_tcp_dist,
do_accept,7}},
{out,{gen,do_call,4}}]
undefined
2022-07-21 01:02:57.231 [warning] <<"HE SAME CLINETid">>@¥¥¥:34432 [Channel] The PUBCOMP PacketId 4 is not found
2022-07-21 01:02:57.593 [warning] <<"THE SAME CLINETid">>@¥¥¥:34432 [Channel] The PUBCOMP PacketId 5 is not found
2022-07-21 01:03:59.846 [warning] <<"HE SAME CLINETid">>@¥¥:34432 [Channel] The PUBCOMP PacketId 6 is not found
2022-07-21 01:04:07.227 [warning] <<"HE SAME CLINETid">>@$$$:34432 [Channel] The PUBCOMP PacketId 7 is not found
2022-07-21 01:04:07.315 [warning] <<"HE SAME CLINETid">>@$$$:34432 [Channel] The PUBCOMP PacketId 8 is not found
2022-07-21 01:04:09.235 [warning] <<"HE SAME CLINETid">>@203.190.23.123:34432 [Channel] The PUBCOMP PacketId 9 is not found
2022-07-21 01:04:09.633 [warning] <<"HE SAME CLINETid">>@203.190.23.123:34432 [Channel] The PUBCOMP PacketId 10 is not found
2022-07-21 01:04:09.633 [warning] <<"HE SAME CLINETid">>@203.190.23.123:34432 [Channel] The PUBCOMP PacketId 11 is not found
2022-07-21 01:04:13.827 [warning] <<"HE SAME CLINETid">>@123.110.239.76:39794 [Channel] The PUBCOMP PacketId 1 is not found
2022-07-21 01:04:13.117 [warning] [SYSMON] long_schedule warning: pid = <0.2288.0>, info: [{timeout,1225},
{in,{gen_server,loop,7}},
{out,{gen_server,loop,7}}]
[{initial_call,{proc_lib,init_p,5}},
{current_function,{emqx_misc,drain_down,2}},
{registered_name,emqx_cm},
{status,running},
{message_queue_len,405},
{group_leader,<0.2228.0>},