Troubleshooting K8s Deployment and Suggested Fields

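TLDR: saajid's k8s resource fields (such as `k8s_container_name`) were not showing up as suggested fields in a k8s deployment. nitya-signoz shared troubleshooting steps and suggested dropping the `signoz_logs` database; a full redeployment resolved the issue.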

saajid
Wed, 08 Mar 2023 18:21:53 UTC

Hi, the docs state that the k8s deployment automatically parses container_name and other relevant k8s fields, and I can see them within the log record. Do I have to take another step to make it a "suggested field", or should I be able to query it as is? Side question: are there any k8s Helm examples for modifying the collector, specifically for JSON parsing? Thanks in advance.

nitya-signoz
Thu, 09 Mar 2023 07:18:02 UTC

I don’t see k8s_container_name in the selected or interesting fields. If it’s not present there, you won’t be able to query it. Are your logs getting ingested? Also, where is your k8s cluster running? Can you provide some more info?

saajid
Thu, 09 Mar 2023 14:31:17 UTC

Yep, all the logs are coming through as expected, and I can see `resources_string.k8s_container_name` in the log record. I'm running on EKS on AWS.

nitya-signoz
Thu, 09 Mar 2023 15:37:52 UTC

If that's the case, then these should also appear as fields on the left side of the logs page. Can you share the output of the following commands?
```
kubectl exec -n platform -it chi-my-release-clickhouse-cluster-0-0-0 -- sh
clickhouse client
use signoz_logs;
select distinct name from distributed_logs_resource_keys;
```
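The same check can probably be run without an interactive shell. The sketch below only prints the one-line command so it can be reviewed first. The namespace and pod name are copied from the commands above and will differ for other releases. Using `--query` with a database-qualified table name, instead of `use signoz_logs;`, is an assumption about how to condense the steps, not something stated in the thread.

```shell
#!/usr/bin/env bash
# Namespace and pod name are taken from the commands above; adjust for your release.
NAMESPACE="platform"
CH_POD="chi-my-release-clickhouse-cluster-0-0-0"

# Query listing the resource keys known to the logs schema.
# The table is qualified with the database name so no separate "use" step is needed.
resource_keys_query() {
  echo "SELECT DISTINCT name FROM signoz_logs.distributed_logs_resource_keys"
}

# Print the full kubectl command instead of running it.
resource_keys_command() {
  printf 'kubectl exec -n %s %s -- clickhouse client --query "%s"\n' \
    "$NAMESPACE" "$CH_POD" "$(resource_keys_query)"
}

resource_keys_command
```

Once the printed command looks right for your cluster, it can be pasted into a terminal and run as is.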

saajid
Thu, 09 Mar 2023 15:40:03 UTC

```
SELECT DISTINCT name FROM distributed_logs_resource_keys

Query id: 02a3aa7e-bcc4-4b13-a527-0223530f3dd4

┌─name─────────────┐
│ host_name        │
│ os_type          │
│ signoz_component │
└──────────────────┘
```

saajid
Thu, 09 Mar 2023 15:42:04 UTC

I've deployed the Helm chart to another environment with the same setup, and there the field shows up and is queryable as expected... Is it possible that a redeployment would solve it?

nitya-signoz
Thu, 09 Mar 2023 15:42:24 UTC

One more question: how many ClickHouse shards do you have?

saajid
Thu, 09 Mar 2023 15:46:23 UTC

It should just be the one.

nitya-signoz
Thu, 09 Mar 2023 15:47:56 UTC

Got it. Then, if your log data is not important, you can stop the otel collector pod, delete the `signoz_logs` database, and restart the collectors. Otherwise, we can schedule a call to fix this. The steps are the same as before: open the ClickHouse client and run `drop database signoz_logs`.
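For reference, the sequence described above could be sketched as the script below. It only prints the commands, so nothing is changed until they are reviewed and run by hand. The namespace and ClickHouse pod name come from the earlier command in this thread. The collector deployment name and the replica count are hypothetical placeholders that are not given in the thread. Dropping the database deletes all stored logs.

```shell
#!/usr/bin/env bash
NAMESPACE="platform"                                     # from the earlier command
CH_POD="chi-my-release-clickhouse-cluster-0-0-0"         # from the earlier command
COLLECTOR="deployment/my-release-signoz-otel-collector"  # ASSUMPTION: hypothetical name
REPLICAS=1                                               # ASSUMPTION: original replica count

# Print the three steps in order: stop the collector, drop the logs
# database, then start the collector again.
reset_plan() {
  printf 'kubectl scale -n %s %s --replicas=0\n' "$NAMESPACE" "$COLLECTOR"
  printf 'kubectl exec -n %s %s -- clickhouse client --query "DROP DATABASE signoz_logs"\n' \
    "$NAMESPACE" "$CH_POD"
  printf 'kubectl scale -n %s %s --replicas=%s\n' "$NAMESPACE" "$COLLECTOR" "$REPLICAS"
}

reset_plan
```

Check the real workload names with `kubectl get deployments -n platform` before running anything, since they depend on the Helm release name.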

saajid
Thu, 09 Mar 2023 15:48:50 UTC

Awesome, thank you. I'll give it a go and report back.

saajid
Thu, 09 Mar 2023 19:00:38 UTC

Reporting back: deleting the database broke things, and SigNoz was unable to recover gracefully. There were no logs, and it was unable to get new logs even after rebooting all pods. I did, however, take everything down and redeploy, and things now seem to be working like the other environment. Thanks for your time.

nitya-signoz
Fri, 10 Mar 2023 04:39:32 UTC

That’s strange. Did you stop the collectors before deleting the database? And by "broke", can you share what exactly happened?

saajid
Fri, 10 Mar 2023 14:31:29 UTC

Yep, I stopped everything but the ClickHouse pod. And by "broke" I mean there were no logs after starting them back up, and no errors in either the agents or the collectors.