Hi, I’ve set up cert-manager and the EMQX Kubernetes Operator version 2.1.1 with its CRDs; the EMQX image is 5.0.23. The Kubernetes version is 1.22 and the storage class is gp3. The initial cluster was formed with 3 core nodes (no replicants) and looks good. The API version used for the EMQX resource is v2alpha1.
I would like to expose only the WSS port via an AWS Network Load Balancer. The information I have found so far is scarce. I tried to expose it via the EMQX definition and also via a separate ingress NLB. The certificate is generated via AWS Certificate Manager. I’ve created a user in the built-in database with “Publish&Subscribe” authorization on topic “test” (I also tried making it a superuser). The NLB would expose port 8084 (TLS) and port 18083 for the dashboard.
Testing via mqtt client always shows “Unable to connect. Reason: ‘handshake timed out after 10000ms’”. Trying via the dashboard websocket client says “… is Disconnected”.
Command used with the mqtt client:
mqtt publish --topic test --message Hello --host nlbaddress.eu-central-1.amazonaws.com --port 8084 -ws -u username -pw password
For the dashboard websocket client, I left the default values and only updated the address, set the port to 8084, and added the username and password.
Raising the log level to debug shows nothing relevant, only many messages like this one, which seem cluster-related:
“2023-04-25T15:37:58.620833+00:00 [info] msg: terminate, mfa: emqx_connection:terminate/2, line: 666, peername: 10.33.22.27:12288, reason: {shutdown,tcp_closed”
What I want to achieve:
- Expose the WSS port to devices on 8084 via the NLB, preferably on path /mqtt
- Make use of SSL termination of the Network Load Balancer
- Expose the dashboard on 18083
- (Optional) Get metrics via Prometheus and monitor
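For reference, a minimal sketch of how the first two goals might map onto the v2alpha1 resource. It is not a verified fix and rests on two assumptions: that the AWS Load Balancer Controller annotations need to sit on the Service generated by the operator (under listenersServiceTemplate.metadata) rather than on the EMQX object itself, and that with TLS terminated at the NLB the backend on 8083 should be a plain WebSocket listener (listeners.ws.default) rather than a raw TCP one. All other fields are omitted, and the certificate ARN stays elided.

```yaml
apiVersion: apps.emqx.io/v2alpha1
kind: EMQX
metadata:
  name: emqx
  namespace: emqx-operator-system
spec:
  image: emqx:5.0.23
  bootstrapConfig: |
    # Assumption: plain WebSocket listener behind the NLB, which terminates TLS.
    listeners.ws.default {
      bind = "0.0.0.0:8083"
      websocket.mqtt_path = "/mqtt"
    }
  listenersServiceTemplate:
    metadata:
      annotations:
        # Assumption: the controller reads these from the generated Service.
        service.beta.kubernetes.io/aws-load-balancer-type: "external"
        service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
        service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
        service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:.....certificate here.......
        service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
        service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "8084"
    spec:
      type: LoadBalancer
      ports:
        - name: "wss"
          protocol: TCP
          port: 8084       # TLS terminated here by the NLB
          targetPort: 8083 # plain WebSocket inside the cluster
```

With this shape a client would be expected to connect over wss:// on port 8084 with path /mqtt. Exposing the dashboard on 18083 is not covered by the sketch.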
The setup is as follows:
apiVersion: apps.emqx.io/v2alpha1
kind: EMQX
metadata:
  name: emqx
  namespace: emqx-operator-system
  labels:
    app.kubernetes.io/name: emqx
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-type: "external"
    service.beta.kubernetes.io/aws-load-balancer-nlb-target-type: "ip"
    service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing
    service.beta.kubernetes.io/aws-load-balancer-attributes: load_balancing.cross_zone.enabled=true
    service.beta.kubernetes.io/aws-load-balancer-target-group-attributes: preserve_client_ip.enabled=true
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: arn:aws:acm:.....certificate here.......
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: tcp
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: 8084,wss
spec:
  image: emqx:5.0.23
  imagePullPolicy: IfNotPresent
  bootstrapConfig: |
    node {
      cookie = emqxsecretcookie
      data_dir = "data"
      etc_dir = "etc"
    }
    cluster {
      discovery_strategy = dns
      dns {
        record_type = srv
        name:"emqx-headless.emqx-operator-system.svc.cluster.local"
      }
    }
    dashboard {
      listeners.http {
        bind: 18083
      }
      default_username: "admin"
      default_password: "somepasshere"
    }
    listeners.tcp.default {
      bind = "0.0.0.0:8083"
      max_connections = 1024000
    }
    sysmon.vm.long_schedule = disabled
  coreTemplate:
    metadata:
      name: emqx-core
      labels:
        apps.emqx.io/instance: emqx
        apps.emqx.io/db-role: core
    spec:
      volumeClaimTemplates:
        storageClassName: gp3-store
        resources:
          requests:
            storage: 1Gi
        accessModes:
          - ReadWriteOnce
      replicas: 3
      command:
        - "/usr/bin/docker-entrypoint.sh"
      args:
        - "/opt/emqx/bin/emqx"
        - "foreground"
      ports:
        - containerPort: 8083
      podSecurityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        fsGroupChangePolicy: Always
      containerSecurityContext:
        runAsUser: 1000
        runAsGroup: 1000
      livenessProbe:
        httpGet:
          path: /status
          port: 18083
        initialDelaySeconds: 60
        periodSeconds: 30
        failureThreshold: 3
      readinessProbe:
        httpGet:
          path: /status
          port: 18083
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 12
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh","-c","emqx ctl cluster leave"]
  replicantTemplate:
    metadata:
      name: emqx-replicant
      labels:
        apps.emqx.io/instance: emqx
        apps.emqx.io/db-role: replicant
    spec:
      replicas: 0
      command:
        - "/usr/bin/docker-entrypoint.sh"
      args:
        - "/opt/emqx/bin/emqx"
        - "foreground"
      ports:
        - containerPort: 1883
      podSecurityContext:
        runAsUser: 1000
        runAsGroup: 1000
        fsGroup: 1000
        fsGroupChangePolicy: Always
        supplementalGroups:
          - 1000
      containerSecurityContext:
        runAsNonRoot: true
        runAsUser: 1000
        runAsGroup: 1000
      livenessProbe:
        httpGet:
          path: /status
          port: 18083
        initialDelaySeconds: 60
        periodSeconds: 30
        failureThreshold: 10
      readinessProbe:
        httpGet:
          path: /status
          port: 18083
        initialDelaySeconds: 10
        periodSeconds: 5
        failureThreshold: 30
      lifecycle:
        preStop:
          exec:
            command: ["/bin/sh","-c","emqx ctl cluster leave"]
  dashboardServiceTemplate:
    metadata:
      name: emqx-dashboard
    spec:
      selector:
        apps.emqx.io/db-role: core
      ports:
        - name: "dashboard-listeners-http-bind"
          protocol: TCP
          port: 18083
          targetPort: 18083
  listenersServiceTemplate:
    spec:
      type: LoadBalancer
      ports:
        - name: "wss"
          protocol: TCP
          port: 8084
          targetPort: 8083
The targets appear as healthy and there is no IP restriction on the NLB (open to 0.0.0.0). What could be the reason the handshake times out with the mqtt client and the dashboard websocket client gets disconnected? Did I get something wrong? Thanks!