Scale and Optimization Problem in the Production Move of Signoz

TLDR Simran had problems scaling SigNoz in production and tried a distributed ClickHouse setup with multiple shards and multiple replicas, but faced issues. Prashant confirmed that multiple ClickHouse replicas are not supported, and suggested multiple ClickHouse shards plus multiple replicas of the SigNoz Otel-Collector as a solution.

Simran
Sat, 09 Sep 2023 10:40:16 UTC

<#C01HWQ1R0BC|support> Recently we moved Signoz to production but faced issues with scale. Hence we tried to move to a distributed setup with 3 shards and 2 replicas. But I can see that multiple replicas with shards is not supported, as mentioned here - Wanted to check if this is the same with the current version of Signoz as well? Plus, any leads towards optimizing Signoz's ClickHouse for a setup of around 600k to 1 million spans per minute will be helpful. Based on what we are seeing, the Signoz collector is being throttled when writing to ClickHouse.

Prashant
Sat, 09 Sep 2023 18:36:02 UTC

As mentioned in the shared docs, multiple replicas of the ClickHouse cluster are not supported. Only multiple shards are.

Prashant
Sat, 09 Sep 2023 18:37:24 UTC

Adding multiple shards of ClickHouse (spread across multiple nodes) and multiple replicas of the SigNoz Otel-Collector should help handle the high load.
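
For a rough sense of what that load means per component, here is a minimal back-of-the-envelope sketch in Python. It is not from the thread: it assumes spans are spread evenly across shards and collector replicas, which real traffic may not do. The 3 shards come from the setup described above; the collector replica count of 4 and the helper name `per_unit_rate` are hypothetical, chosen only for illustration.

```python
def per_unit_rate(spans_per_minute: int, units: int) -> float:
    """Average spans per second handled by each unit, assuming an even split."""
    if units < 1:
        raise ValueError("units must be at least 1")
    return spans_per_minute / 60 / units


SHARDS = 3              # from the setup described in the thread
COLLECTOR_REPLICAS = 4  # hypothetical value, not from the thread

# Lower and upper bounds of the load mentioned in the question.
for spans_per_minute in (600_000, 1_000_000):
    total = per_unit_rate(spans_per_minute, 1)
    per_shard = per_unit_rate(spans_per_minute, SHARDS)
    per_collector = per_unit_rate(spans_per_minute, COLLECTOR_REPLICAS)
    print(
        f"{spans_per_minute} spans/min: "
        f"{total:.0f} spans/s total, "
        f"{per_shard:.0f} spans/s per shard, "
        f"{per_collector:.0f} spans/s per collector replica"
    )
```

These are averages only. Bursts, batch sizes, and uneven sharding all affect the actual write pressure on ClickHouse, so treat the numbers as a starting point for sizing rather than a guarantee.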