Resolving SigNoz Query Service Error

TLDR Einav hit a missing `signoz_traces.distributed_span_attributes_keys` table in their SigNoz deployment, which prevented trace data from showing in the UI. Srikanth guided them to restart the otel collector and drop the `schema_migrations` table so the ClickHouse migrations could run again, which resolved the issue.

Photo of Einav
Einav
Tue, 15 Aug 2023 13:06:36 UTC

Hi everybody, I'm getting the following error from the SigNoz query service: `ERROR clickhouseReader/reader.go:4562 code: 60, message: Table signoz_traces.distributed_span_attributes_keys doesn't exist` and no logs or traces are showing in the UI. I've restarted all SigNoz components (including ClickHouse) and I don't get this error anymore. The table is still missing, but now I do see data (logs only) in the UI:

```
chi-signoz-clickhouse-cluster-0-0-0.chi-signoz-clickhouse-cluster-0-0.datricks-services.svc.cluster.local :) select count(*) from signoz_traces.distributed_span_attributes_keys

SELECT count(*)
FROM signoz_traces.distributed_span_attributes_keys

Query id: 4b96bcab-ff86-441c-96d6-7ec911acd07e

0 rows in set. Elapsed: 0.018 sec.

Received exception from server (version 22.8.8):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table signoz_traces.distributed_span_attributes_keys doesn't exist. (UNKNOWN_TABLE)
```

When trying to access traces it gives the same error: `ERROR clickhouseReader/reader.go:4562 code: 60, message: Table signoz_traces.distributed_span_attributes_keys doesn't exist`. SigNoz version 0.25.5, deployed on k8s using the Helm chart. Any ideas? The table IS missing from ClickHouse..
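For reference, the same check can be run from inside the cluster with `kubectl exec`. This is only a sketch: the ClickHouse pod name and the `datricks-services` namespace are taken from the output in this thread and may differ in other deployments.

```bash
# Sketch: inspect the traces database from inside the ClickHouse pod.
# Pod/namespace names are assumptions taken from this thread.
kubectl -n datricks-services exec -it chi-signoz-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "SHOW TABLES FROM signoz_traces"

# Prints 1 if the table the query service complains about exists, 0 otherwise.
kubectl -n datricks-services exec -it chi-signoz-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "EXISTS TABLE signoz_traces.distributed_span_attributes_keys"
```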

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 01:04:19 UTC

Can you confirm if the signoz collector was running when this happened?

Photo of Einav
Einav
Wed, 16 Aug 2023 06:01:44 UTC

yes, it was

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 07:21:26 UTC

The collector runs the migrations. If a migration fails, it restarts. If the collector is running, then it shouldn't have been an issue. Are you still facing the issue? What tables exist in the `signoz_traces` db?

Photo of Einav
Einav
Wed, 16 Aug 2023 07:26:01 UTC

These are the pods running:

```
chi-signoz-clickhouse-cluster-0-0-0                 1/1     Running   0               15h
signoz-alertmanager-0                               1/1     Running   0               7h15m
signoz-clickhouse-operator-56fc8d9b5f-cbw96         2/2     Running   0               15h
signoz-frontend-6b7977b5fb-hr22s                    1/1     Running   0               7h15m
signoz-k8s-infra-otel-agent-bndkv                   1/1     Running   0               14h
signoz-k8s-infra-otel-agent-txt8c                   1/1     Running   0               18h
signoz-k8s-infra-otel-agent-wz99w                   1/1     Running   0               7h15m
signoz-k8s-infra-otel-deployment-7bb879d497-rb7sk   1/1     Running   0               14h
signoz-otel-collector-5bfd4dd769-bj7p8              1/1     Running   3 (7h12m ago)   7h15m
signoz-otel-collector-metrics-7846489b9-bn4zc       1/1     Running   0               11m
signoz-query-service-0                              1/1     Running   0               7h15m
signoz-zookeeper-0                                  1/1     Running   0               7h15m
```

These are the tables in ClickHouse:

```
SHOW TABLES FROM signoz_traces

Query id: 82636ec2-bd1c-4326-85c1-712201e7c36c

┌─name─────────────────────────────────────────┐
│ dependency_graph_minutes                      │
│ dependency_graph_minutes_messaging_calls_mv   │
│ dependency_graph_minutes_service_calls_mv     │
│ distributed_dependency_graph_minutes          │
│ distributed_durationSort                      │
│ distributed_signoz_error_index_v2             │
│ distributed_signoz_index_v2                   │
│ distributed_signoz_spans                      │
│ distributed_top_level_operations              │
│ distributed_usage                             │
│ distributed_usage_explorer                    │
│ durationSort                                  │
│ durationSortMV                                │
│ root_operations                               │
│ schema_migrations                             │
│ signoz_error_index                            │
│ signoz_error_index_v2                         │
│ signoz_index                                  │
│ signoz_index_v2                               │
│ signoz_spans                                  │
│ sub_root_operations                           │
│ top_level_operations                          │
│ usage                                         │
│ usage_explorer                                │
│ usage_explorer_mv                             │
└───────────────────────────────────────────────┘
```
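The collector pod above shows 3 restarts, which could be consistent with the restart-on-failed-migration behaviour described earlier. A hedged way to check what the previous container run logged, assuming the pod and namespace names from this thread:

```bash
# Sketch: look at the previous run of the collector for migration errors.
# Pod/namespace names are assumptions taken from this thread.
kubectl -n datricks-services logs signoz-otel-collector-5bfd4dd769-bj7p8 --previous | grep -iE "migrat|error"

# Why did the container last terminate?
kubectl -n datricks-services describe pod signoz-otel-collector-5bfd4dd769-bj7p8 | grep -A5 "Last State"
```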

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 07:27:57 UTC

That’s unusual. Can you stop and start the signoz-otel-collector again?
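A minimal sketch of stopping and starting the collector on Kubernetes, assuming the Deployment is named `signoz-otel-collector` (matching the pod name in the listing above):

```bash
# Sketch: restart the collector so it re-runs its startup migrations.
kubectl -n datricks-services rollout restart deployment/signoz-otel-collector

# Or stop it completely and bring it back:
kubectl -n datricks-services scale deployment/signoz-otel-collector --replicas=0
kubectl -n datricks-services scale deployment/signoz-otel-collector --replicas=1
```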

Photo of Einav
Einav
Wed, 16 Aug 2023 07:31:56 UTC

Just did, see the following logs:

```
2023-08-16T07:28:26.390Z info service/telemetry.go:104 Setting up own telemetry...
2023-08-16T07:28:26.391Z info service/telemetry.go:127 Serving Prometheus metrics {"address": "0.0.0.0:8888", "level": "Basic"}
2023-08-16T07:28:26.391Z info [email protected]/exporter.go:275 Stability level of component is undefined {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces"}
2023-08-16T07:28:26.600Z info clickhousetracesexporter/clickhouse_factory.go:141 View does not exist, skipping patch {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "table": "dependency_graph_minutes_db_calls_mv"}
2023-08-16T07:28:26.600Z info clickhousetracesexporter/clickhouse_factory.go:115 Running migrations from path: {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "test": "/migrations"}
2023-08-16T07:28:26.611Z info clickhousetracesexporter/clickhouse_factory.go:127 Clickhouse Migrate finished {"kind": "exporter", "data_type": "traces", "name": "clickhousetraces", "error": "Dirty database version 19. Fix and force version."}
2023-08-16T07:28:27.507Z info [email protected]/exporter.go:275 Stability level of component is undefined {"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite"}
time="2023-08-16T07:28:27Z" level=info msg="Executing:\nCREATE DATABASE IF NOT EXISTS signoz_metrics ON CLUSTER cluster\n" component=clickhouse
time="2023-08-16T07:28:27Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.samples_v2 ON CLUSTER cluster (\n\t\t\tmetric_name LowCardinality(String),\n\t\t\tfingerprint UInt64 Codec(DoubleDelta, LZ4),\n\t\t\ttimestamp_ms Int64 Codec(DoubleDelta, LZ4),\n\t\t\tvalue Float64 Codec(Gorilla, LZ4)\n\t\t)\n\t\tENGINE = MergeTree\n\t\t\tPARTITION BY toDate(timestamp_ms / 1000)\n\t\t\tORDER BY (metric_name, fingerprint, timestamp_ms)\n\t\t\tTTL toDateTime(timestamp_ms/1000) + INTERVAL 2592000 SECOND DELETE;\n" component=clickhouse
time="2023-08-16T07:28:27Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.distributed_samples_v2 ON CLUSTER cluster AS signoz_metrics.samples_v2 ENGINE = Distributed(\"cluster\", \"signoz_metrics\", samples_v2, cityHash64(metric_name, fingerprint));\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.samples_v2 ON CLUSTER cluster MODIFY SETTING ttl_only_drop_parts = 1;\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nSET allow_experimental_object_type = 1\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.time_series_v2 ON CLUSTER cluster(\n\t\t\tmetric_name LowCardinality(String),\n\t\t\tfingerprint UInt64 Codec(DoubleDelta, LZ4),\n\t\t\ttimestamp_ms Int64 Codec(DoubleDelta, LZ4),\n\t\t\tlabels String Codec(ZSTD(5))\n\t\t)\n\t\tENGINE = ReplacingMergeTree\n\t\t\tPARTITION BY toDate(timestamp_ms / 1000)\n\t\t\tORDER BY (metric_name, fingerprint)\n\t\t\tTTL toDateTime(timestamp_ms/1000) + INTERVAL 2592000 SECOND DELETE;\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.distributed_time_series_v2 ON CLUSTER cluster AS signoz_metrics.time_series_v2 ENGINE = Distributed(\"cluster\", signoz_metrics, time_series_v2, cityHash64(metric_name, fingerprint));\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.time_series_v2 ON CLUSTER cluster DROP COLUMN IF EXISTS labels_object\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.distributed_time_series_v2 ON CLUSTER cluster DROP COLUMN IF EXISTS labels_object\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.time_series_v2 ON CLUSTER cluster MODIFY SETTING ttl_only_drop_parts = 1;\n" component=clickhouse
time="2023-08-16T07:28:28Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.time_series_v2 ON CLUSTER cluster ADD COLUMN IF NOT EXISTS temporality LowCardinality(String) DEFAULT 'Unspecified' CODEC(ZSTD(5))\n" component=clickhouse
time="2023-08-16T07:28:29Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.distributed_time_series_v2 ON CLUSTER cluster ADD COLUMN IF NOT EXISTS temporality LowCardinality(String) DEFAULT 'Unspecified' CODEC(ZSTD(5))\n" component=clickhouse
time="2023-08-16T07:28:29Z" level=info msg="Executing:\nALTER TABLE signoz_metrics.time_series_v2 ON CLUSTER cluster ADD INDEX IF NOT EXISTS temporality_index temporality TYPE SET(3) GRANULARITY 1\n" component=clickhouse
time="2023-08-16T07:28:29Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.time_series_v3 ON CLUSTER cluster (\n\t\t\tenv LowCardinality(String) DEFAULT 'default',\n\t\t\ttemporality LowCardinality(String) DEFAULT 'Unspecified',\n\t\t\tmetric_name LowCardinality(String),\n\t\t\tfingerprint UInt64 CODEC(Delta, ZSTD),\n\t\t\ttimestamp_ms Int64 CODEC(Delta, ZSTD),\n\t\t\tlabels String CODEC(ZSTD(5))\n\t\t)\n\t\tENGINE = ReplacingMergeTree\n\t\t\tPARTITION BY toDate(timestamp_ms / 1000)\n\t\t\tORDER BY (env, temporality, metric_name, fingerprint);\n" component=clickhouse
time="2023-08-16T07:28:29Z" level=info msg="Executing:\nCREATE TABLE IF NOT EXISTS signoz_metrics.distributed_time_series_v3 ON CLUSTER cluster AS signoz_metrics.time_series_v3 ENGINE = Distributed(\"cluster\", signoz_metrics, time_series_v3, cityHash64(env, temporality, metric_name, fingerprint));\n" component=clickhouse
time="2023-08-16T07:28:29Z" level=info msg="Shard count changed from 0 to 1. Resetting time series map." component=clickhouse
2023-08-16T07:28:30.485Z info clickhouselogsexporter/exporter.go:455 Running migrations from path: {"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter", "test": "/logsmigrations"}
2023-08-16T07:28:30.547Z info clickhouselogsexporter/exporter.go:469 Clickhouse Migrate finished {"kind": "exporter", "data_type": "logs", "name": "clickhouselogsexporter"}
2023-08-16T07:28:30.867Z info processor/processor.go:289 Development component. May change in the future. {"kind": "processor", "name": "logstransform/internal", "pipeline": "logs"}
2023-08-16T07:28:30.868Z info kube/client.go:97 k8s filtering {"kind": "processor", "name": "k8sattributes", "pipeline": "metrics/internal", "labelSelector": "", "fieldSelector": "spec.nodeName=ip-10-2-3-81.eu-west-1.compute.internal"}
2023-08-16T07:28:30.868Z warn filesystemscraper/factory.go:60 No `root_path` config set when running in docker environment, will report container filesystem stats. See {"kind": "receiver", "name": "hostmetrics", "data_type": "metrics"}
2023-08-16T07:28:30.868Z info signozspanmetricsprocessor/processor.go:141 Building signozspanmetricsprocessor {"kind": "processor", "name": "signozspanmetrics/prometheus", "pipeline": "traces"}
2023-08-16T07:28:30.887Z info service/service.go:131 Starting signoz-otel-collector... {"Version": "latest", "NumCPU": 8}
2023-08-16T07:28:30.887Z info extensions/extensions.go:30 Starting extensions...
```

Photo of Einav
Einav
Wed, 16 Aug 2023 07:32:16 UTC

```
2023-08-16T07:28:45.562Z error exporterhelper/queued_retry.go:391 Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "Permanent error: unsupported metric type; Permanent error: unsupported metric type", "errorCauses": [{"error": "Permanent error: unsupported metric type"}, {"error": "Permanent error: unsupported metric type"}], "dropped_items": 512}
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:391
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/metrics.go:125
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:195
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/internal/bounded_memory_queue.go:47
2023-08-16T07:29:00.648Z error exporterhelper/queued_retry.go:391 Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "Permanent error: unsupported metric type; Permanent error: unsupported metric type", "errorCauses": [{"error": "Permanent error: unsupported metric type"}, {"error": "Permanent error: unsupported metric type"}], "dropped_items": 231}
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:391
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/metrics.go:125
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:195
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/internal/bounded_memory_queue.go:47
2023-08-16T07:29:45.440Z error exporterhelper/queued_retry.go:391 Exporting failed. The error is not retryable. Dropping data. {"kind": "exporter", "data_type": "metrics", "name": "clickhousemetricswrite", "error": "Permanent error: unsupported metric type; Permanent error: unsupported metric type", "errorCauses": [{"error": "Permanent error: unsupported metric type"}, {"error": "Permanent error: unsupported metric type"}], "dropped_items": 467}
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:391
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/metrics.go:125
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/queued_retry.go:195
/home/runner/go/pkg/mod/go.opentelemetry.io/collector/[email protected]/exporterhelper/internal/bounded_memory_queue.go:47
```

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 07:35:21 UTC

The migration finished successfully, and the tables should exist. You mentioned you are using `0.25.5`, but the collector does not appear to be running `0.79.0`, the collector version for that SigNoz release.

Photo of Einav
Einav
Wed, 16 Aug 2023 07:37:33 UTC

The collector image used:

```
Image:
```

Helm deployment:

```
signoz   my-namespace   14   2023-08-16 06:46:27.692361301 +0000 UTC   deployed   signoz-0.21.5   0.25.5
```
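The `Image:` field above came through empty; a hedged way to read the exact collector image tag straight from the pod spec, assuming the pod and namespace names from the earlier listing:

```bash
# Sketch: print the image(s) of the collector pod.
# Pod/namespace names are assumptions taken from this thread.
kubectl -n datricks-services get pod signoz-otel-collector-5bfd4dd769-bj7p8 \
  -o jsonpath='{.spec.containers[*].image}{"\n"}'
```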

Photo of Einav
Einav
Wed, 16 Aug 2023 08:00:30 UTC

The table is still missing:

```
SHOW TABLES FROM signoz_traces

Query id: 87b09c81-c120-45a6-8059-9d33ba5c62d1

┌─name─────────────────────────────────────────┐
│ dependency_graph_minutes                      │
│ dependency_graph_minutes_messaging_calls_mv   │
│ dependency_graph_minutes_service_calls_mv     │
│ distributed_dependency_graph_minutes          │
│ distributed_durationSort                      │
│ distributed_signoz_error_index_v2             │
│ distributed_signoz_index_v2                   │
│ distributed_signoz_spans                      │
│ distributed_top_level_operations              │
│ distributed_usage                             │
│ distributed_usage_explorer                    │
│ durationSort                                  │
│ durationSortMV                                │
│ root_operations                               │
│ schema_migrations                             │
│ signoz_error_index                            │
│ signoz_error_index_v2                         │
│ signoz_index                                  │
│ signoz_index_v2                               │
│ signoz_spans                                  │
│ sub_root_operations                           │
│ top_level_operations                          │
│ usage                                         │
│ usage_explorer                                │
│ usage_explorer_mv                             │
└───────────────────────────────────────────────┘

25 rows in set. Elapsed: 0.001 sec.

chi-signoz-clickhouse-cluster-0-0-0.chi-signoz-clickhouse-cluster-0-0.datricks-services.svc.cluster.local :) select * from signoz_traces.distributed_span_attributes_keys

SELECT *
FROM signoz_traces.distributed_span_attributes_keys

Query id: 53855644-1b3b-4e9c-886b-11968d353afc

0 rows in set. Elapsed: 0.017 sec.

Received exception from server (version 22.8.8):
Code: 60. DB::Exception: Received from localhost:9000. DB::Exception: Table signoz_traces.distributed_span_attributes_keys doesn't exist. (UNKNOWN_TABLE)
```

Photo of Einav
Einav
Wed, 16 Aug 2023 08:01:04 UTC

Anything else I can check? Do you need any other logs/deployment details, such as container versions?

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 08:14:45 UTC

Can you drop the `schema_migrations` table and try again?
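For anyone hitting the same state, a minimal sketch of this fix, assuming the pod, namespace, and Deployment names from this thread. Dropping `schema_migrations` only discards the migration bookkeeping; on the next collector start the migrations run again (the ones visible in the startup log above are written as `CREATE ... IF NOT EXISTS`), so existing tables are left alone and missing ones get created:

```bash
# Sketch: drop the traces migration-state table, then restart the collector so it
# re-runs the migrations on startup. Names are assumptions taken from this thread.
kubectl -n datricks-services exec -it chi-signoz-clickhouse-cluster-0-0-0 -- \
  clickhouse-client --query "DROP TABLE IF EXISTS signoz_traces.schema_migrations"

kubectl -n datricks-services rollout restart deployment/signoz-otel-collector
```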

Photo of Einav
Einav
Wed, 16 Aug 2023 08:19:03 UTC

It worked! :partying_face:

```
SELECT count(*)
FROM signoz_traces.distributed_span_attributes_keys

Query id: 73237327-ff72-46b5-bc82-a2ca8912b399

┌─count()─┐
│     204 │
└─────────┘

1 row in set. Elapsed: 0.001 sec.
```

Photo of Einav
Einav
Wed, 16 Aug 2023 08:19:29 UTC

How did it happen anyway? This is an odd workaround.

Photo of Srikanth
Srikanth
Wed, 16 Aug 2023 09:21:30 UTC

not sure

Photo of Einav
Einav
Wed, 16 Aug 2023 10:21:08 UTC

Alright, thanks for your help!