SigNoz Deployment Issue in Docker Swarm with NFS Share
TLDR: sati encountered issues setting up SigNoz in docker-swarm with an NFS share, resulting in an otel-collector crash loop and an inaccessible GUI. Nick experienced a similar problem and shared a workaround.
Mar 10, 2023 (6 months ago)
I have a problem with setting up SigNoz in docker-swarm with more than 1 node. (The OS is Ubuntu 22.04.1 everywhere.)
My setup: 3 manager nodes and 1 worker (I started with 1 manager + 1 worker, but the effect was the same so I tried with more managers) plus 1 NFS host that exports the same resource to every swarm node; every swarm host has the NFS share mounted as the /data folder (so the signoz repo dir is /data/signoz/[...], visible on every node). Everything is in the same network, every host can ping/connect to the others, and the firewall is disabled on every host.
Steps I do:
1. installed docker, docker-compose, initiated docker swarm on manager1 (without any additional flags), joined other swarm nodes as manager2, manager3 and worker1
2. on manager1: git clone signoz repo into mounted /data dir
3. applied changes to docker-compose.yaml (disabled the hotrod app, added the syslog port to the otel-collector service) and otel-collector-config.yaml (disabled collecting docker container logs and enabled syslog), all according to the SigNoz documentation for hotrod, docker container logs and syslog
4. docker stack deploy -c /data/signoz/deploy/docker-swarm/clickhouse-setup/docker-compose.yaml signoz
In this setup:
- otel-collector keeps crashing in a loop with error:
application run finished with error: cannot build pipelines: failed to create "clickhouselogsexporter" exporter, in pipeline "logs": cannot configure clickhouse logs exporter: clickhouse Migrate failed to run, error: migration failed in line 0: RENAME TABLE IF EXISTS signoz_logs.logs_atrribute_keys TO signoz_logs.logs_attribute_keys on CLUSTER cluster; (details: code: 57, message: There was an error on [clickhouse:9000]: Code: 57. DB::Exception: Table signoz_logs.logs_attribute_keys already exists. (TABLE_ALREADY_EXISTS) (version 18.104.22.168 (official build)))
- if somehow otel-collector didn't crash and managed to start properly I cannot access the GUI - it shows a blank page with Loading icon and then after a timeout there's "404 not found"
But if I disable the NFS share on every node aside from manager1 (so the node where I deploy SigNoz), everything seems to work correctly. But then if I disable the network on manager1, the services are in Shutdown status until I restore the network connection to manager1, and this often ends in a state where otel-collector starts to crash in a loop with the same error as above.
I don't have any idea what I'm doing wrong here and I'm probably missing something very simple/easy in this setup :(
Mar 19, 2023 (6 months ago)
chi-mon-clickhouse-cluster-1-0-0.chi-mon-clickhouse-cluster-1-0.platform.svc.cluster.local :) show tables from signoz_logs;

SHOW TABLES FROM signoz_logs

Query id: 40102f01-0443-4845-ac09-ec1f405b442a

┌─name────────────────────────────┐
│ atrribute_keys_float64_final_mv │
│ atrribute_keys_int64_final_mv   │
│ atrribute_keys_string_final_mv  │
│ attribute_keys_float64_final_mv │
│ attribute_keys_int64_final_mv   │
│ attribute_keys_string_final_mv  │
│ distributed_logs                │
│ distributed_logs_atrribute_keys │
│ distributed_logs_attribute_keys │
│ distributed_logs_resource_keys  │
│ distributed_usage               │
│ logs                            │
│ logs_atrribute_keys             │
│ logs_attribute_keys             │
│ logs_resource_keys              │
│ resource_keys_string_final_mv   │
│ schema_migrations               │
│ usage                           │
└─────────────────────────────────┘

18 rows in set. Elapsed: 0.002 sec.
I ended up running
DROP TABLE signoz.logs_atrribute_keys
to progress past the error.
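For anyone hitting the same thing: the listing above contains both the misspelled `atrribute` tables and their corrected `attribute` counterparts, which is presumably why the `RENAME TABLE` migration fails with TABLE_ALREADY_EXISTS. A minimal sketch for spotting those collisions from a `SHOW TABLES` result could look like this. The helper name `find_rename_conflicts` is hypothetical (it is not part of SigNoz or ClickHouse), and the table list is simply copied from the output above.

```python
# Sketch only: finds tables whose misspelled name ("atrribute") and corrected
# name ("attribute") both exist, i.e. cases where a RENAME would collide.
# find_rename_conflicts is a hypothetical helper, not a SigNoz/ClickHouse API.

MISSPELLED = "atrribute"
CORRECTED = "attribute"


def find_rename_conflicts(tables):
    """Return (old, new) pairs where both the old and the renamed table exist."""
    existing = set(tables)
    conflicts = []
    for name in sorted(existing):
        if MISSPELLED in name:
            fixed = name.replace(MISSPELLED, CORRECTED)
            if fixed in existing:
                conflicts.append((name, fixed))
    return conflicts


# Table names copied from the SHOW TABLES FROM signoz_logs output in this thread.
tables = [
    "atrribute_keys_float64_final_mv",
    "atrribute_keys_int64_final_mv",
    "atrribute_keys_string_final_mv",
    "attribute_keys_float64_final_mv",
    "attribute_keys_int64_final_mv",
    "attribute_keys_string_final_mv",
    "distributed_logs",
    "distributed_logs_atrribute_keys",
    "distributed_logs_attribute_keys",
    "distributed_logs_resource_keys",
    "distributed_usage",
    "logs",
    "logs_atrribute_keys",
    "logs_attribute_keys",
    "logs_resource_keys",
    "resource_keys_string_final_mv",
    "schema_migrations",
    "usage",
]

conflicts = find_rename_conflicts(tables)
for old, new in conflicts:
    print(f"{old} -> {new} (target already exists)")
```

This only reports the collisions; it does not touch the database. Whether dropping the old tables is safe depends on whether they still hold data you need, so check that before running any `DROP TABLE`.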
Issues with SigNoz Setup and Data Persistence in AKS
Vaibhavi experienced issues setting up SigNoz in AKS, and faced data persistence issues after installation. Srikanth provided guidance on ClickHouse version compatibility and resource requirements, helping Vaibhavi troubleshoot and resolve the issue.
Troubleshooting and Adding Log Files to SigNoz POC
Noor requested help with incorporating log files into their SigNoz POC. Working with vishal-signoz and nitya-signoz, they managed to complete the setup and resolve their issues.
Dashboard Load Issues and Possible Solutions
Al experiences dashboard loading issues since updating to `0.18.1`. Srikanth believes the issue is not version related and suggests examining queries, memory resources, and server distribution for improvements.