#support

SigNoz Deployment Problem After Upgrade

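TLDR: After Al upgraded SigNoz via Helm to chart signoz-0.18.1 (app version 0.22.0), ClickHouse no longer had the cluster named 'cluster' and returned error Code: 170. Rolling back to signoz-0.17.0 restored the deployment. Prashant could not reproduce the issue and pointed Al to chart signoz-0.19.1 and the migration steps. Al's upgrade to signoz-0.19.1 then succeeded.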

Solved
Jul 11, 2023 (5 months ago)
Al
09:11 PM
Hi everyone, I've run into a problem with my SigNoz deployment after upgrading to 0.22.0. Everything is installed using Helm.

REVISION        UPDATED                         STATUS          CHART           APP VERSION     DESCRIPTION                            
11              Wed Jul  5 18:52:30 2023        superseded      signoz-0.17.0   0.21.0          Upgrade complete
12              Tue Jul 11 15:28:08 2023        deployed        signoz-0.18.1   0.22.0          Upgrade complete                           

I am suddenly receiving this error: <Error> TCPHandler: Code: 170. DB::Exception: Requested cluster 'cluster' not found.

Looking at clickhouse:

 :) select cluster from system.clusters

SELECT cluster
FROM system.clusters

Query id: d822fc0f-67b6-4157-97a4-d7dde022c6cd

┌─cluster─────────────────────────────────────────┐
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_localhost               │
│ test_cluster_two_shards_localhost               │
│ test_shard_localhost                            │
│ test_shard_localhost_secure                     │
│ test_unavailable_shard                          │
│ test_unavailable_shard                          │
└─────────────────────────────────────────────────┘


The PVC is mounted:

Filesystem                Size      Used Available Use% Mounted on
/dev/sdd                503.8G     96.2G    407.6G  19% /var/lib/clickhouse

Attaching file with log snippets.


I would appreciate some help, as I am unsure how to recover from this. Thanks!
Jul 12, 2023 (5 months ago)
Prashant
05:40 AM
Are you using an external ClickHouse with SigNoz?
05:41 AM
There should be a cluster named 'cluster', which seems to be missing in your case.
Al
01:48 PM
Hi Prashant, I am not using an external ClickHouse. I am using the ClickHouse instance deployed with SigNoz.

It was working with version 0.21:

┌─cluster─────────────────────────────────────────┐
│ all-replicated                                  │
│ all-sharded                                     │
│ cluster                                         │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_localhost               │
│ test_cluster_two_shards_localhost               │
│ test_shard_localhost                            │
│ test_shard_localhost_secure                     │
│ test_unavailable_shard                          │
│ test_unavailable_shard                          │
└─────────────────────────────────────────────────┘


but after upgrading to v0.22 only the test_* clusters are available.
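
A minimal sketch of checking for the expected cluster from a shell, assuming the cluster names are fed in one per line. The helper name has_cluster is hypothetical and not part of SigNoz or ClickHouse; the namespace and pod name in the commented usage line are placeholders that will differ per installation.

```shell
# Hypothetical helper: reads cluster names on stdin, one per line, and
# succeeds only if the name passed as $1 appears as a whole line.
has_cluster() {
  grep -Fxq -- "$1"
}

# Assumed usage (namespace and pod name are placeholders):
#   kubectl exec -n <namespace> <clickhouse-pod> -- \
#     clickhouse-client -q "SELECT DISTINCT cluster FROM system.clusters" \
#     | has_cluster cluster && echo "found" || echo "missing"
```

Whole-line matching (-x) matters here because names such as test_cluster_two_shards contain the substring "cluster" and would otherwise produce a false positive.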
Jul 13, 2023 (4 months ago)
Al
03:10 PM
Running helm rollback on the signoz release to the previous revision (v0.21.0) restored ClickHouse to working condition. I'm not sure what happened during the upgrade.

Perhaps I can attempt upgrading to v0.22.0 again and see if it was a transient issue.
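
The rollback described above can be sketched as follows, assuming a release named signoz. The wrapper name rollback_release is hypothetical; the revision number and namespace must come from your own helm history output.

```shell
# Hypothetical wrapper around two standard Helm commands.
rollback_release() {
  release="$1"
  revision="$2"
  namespace="$3"
  # Show the revision list first so the target revision can be confirmed.
  helm history "$release" --namespace "$namespace" || return 1
  # Roll the release back to the chosen revision.
  helm rollback "$release" "$revision" --namespace "$namespace"
}
```

For example, `rollback_release signoz 11 <namespace>` would target revision 11, the signoz-0.17.0 revision in the history listed earlier in this thread.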
Prashant
03:27 PM
I tested with the latest chart and v0.22:

SELECT cluster
FROM system.clusters

Query id: 70c515db-ac46-4738-8518-3bf84757e645

┌─cluster─────────────────────────────────────────┐
│ all-replicated                                  │
│ all-sharded                                     │
│ cluster                                         │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_one_shard_three_replicas_localhost │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards                         │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_internal_replication    │
│ test_cluster_two_shards_localhost               │
│ test_cluster_two_shards_localhost               │
│ test_shard_localhost                            │
│ test_shard_localhost_secure                     │
│ test_unavailable_shard                          │
│ test_unavailable_shard                          │
└─────────────────────────────────────────────────┘
I am not able to reproduce the issue.
03:28 PM
Did you update the Helm repository prior to upgrading?
helm repo update
Al
03:50 PM
Yes. At the time of the upgrade, chart version 0.18.1 was the latest:

signoz/signoz           0.18.1          0.22.0          SigNoz Observability Platform Helm Chart

Resulting in:

1. You have just deployed SigNoz cluster:

- frontend version: '0.22.0'
- query-service version: '0.22.0'
- alertmanager version: '0.23.1'
- otel-collector version: '0.79.2'
- otel-collector-metrics version: '0.79.2'
06:36 PM
I'll attempt the upgrade again and report back.
Jul 14, 2023 (4 months ago)
Prashant
08:22 AM
Al, yes, that would be much appreciated.
Jul 18, 2023 (4 months ago)
Al
08:11 PM
Prashant, I just attempted upgrading to chart 0.18.2 and the same issue occurred: ClickHouse was unable to load 'cluster'.


REVISION        UPDATED                         STATUS          CHART           APP VERSION     DESCRIPTION
14              Tue Jul 18 19:10:43 2023        superseded      signoz-0.18.2   0.22.0          Upgrade complete

The following log entry seems relevant:

2023.07.18 19:19:59.669624 [ 240 ] {} <Error> DDLWorker: Cannot parse DDL task query-0000025438: Cannot parse query or obtain cluster info. Will try to send error status: 371
Code: 371. DB::Exception: DDL task query-0000025438 contains current host chi-signoz-clickhouse-cluster-0-0:9000 in cluster cluster, but there is no such cluster here. (INCONSISTENT_CLUSTER_DEFINITION) (version 22.8.8.3 (official build))
2023.07.18 19:19:59.681107 [ 244 ] {} <Information> DDLWorker: Task query-0000025438 is outdated, deleting it
2023.07.18 19:19:59.684970 [ 240 ] {} <Error> DDLWorker: Cannot parse DDL task query-0000025439: Cannot parse query or obtain cluster info. Will try to send error status: 371


I have rolled back to chart signoz-0.17.0 (app version 0.21.0), and the SigNoz deployment is functional again.
Prashant
09:12 PM
Al, the latest version is signoz-0.19.1.

Also, you will need to run the migration steps: https://signoz.io/docs/operate/migration/upgrade-0.23/


Jul 19, 2023 (4 months ago)
Al
06:25 PM
Hi Prashant, the upgrade to signoz-0.19.1 completed successfully!
