#support

Issues with SigNoz on DigitalOcean k8s After Upgrade

TLDR Thomas experienced issues with SigNoz on DigitalOcean k8s after upgrading to v0.19. Prashant identified a PR to address the issue and provided a solution.

Solved
May 23, 2023 (4 months ago)
Thomas
01:10 AM
Hey! I'm experiencing some issues with signoz running on DigitalOcean k8s. Yesterday I woke up, checked my monitoring dashboard, and saw no data. I checked the k8s dashboard and found some failed services (it said something like not enough memory - 1181 mb used out of 100 mb) and decided that I needed to update from v0.18 to v0.19 anyway, so I did an upgrade. Some services are pending, and my-release-signoz-query-service is failing with Readiness probe failed: Get "http://10.244.1.34:8080/api/v1/version": dial tcp 10.244.1.34:8080: connect: connection refused. 1) how can I fix it? (if you need more details, just ask 🙂) 2) how can I ensure the infra doesn't crash in the future? Thank you.
Prashant
02:52 PM
Hi Thomas 👋

The pod states, as well as the logs of the failing pods, would be helpful.
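A minimal sketch of how the requested pod states and logs could be gathered. The namespace `platform` and the pod name `my-release-signoz-query-service-0` are taken from the `kubectl` command Thomas runs later in this thread; adjust both for your own install. The helper `diag_commands` is a hypothetical name used only for illustration. It prints the commands rather than running them, so they can be reviewed first.

```shell
#!/bin/sh
# Namespace and pod name come from the kubectl command used later in this
# thread; override them through the environment for a different release.
NAMESPACE="${NAMESPACE:-platform}"
POD="${POD:-my-release-signoz-query-service-0}"

# Hypothetical helper: prints the diagnostic commands instead of running them.
# Pipe the output to `sh` to execute them against the cluster.
diag_commands() {
  # State of every pod in the namespace (Pending, CrashLoopBackOff, ...)
  echo "kubectl -n $NAMESPACE get pods -o wide"
  # Events for the failing pod, including readiness probe failures
  echo "kubectl -n $NAMESPACE describe pod $POD"
  # Logs of the previous (crashed) container instance
  echo "kubectl -n $NAMESPACE logs $POD --previous"
}

diag_commands
```

Running `diag_commands | sh` executes the three commands in order. `--previous` is useful here because a crash-looping container is restarted, and the panic is in the log of the instance that exited.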
Prashant
02:52 PM
Also, did you run the migration script after upgrading to v0.19?
https://signoz.io/docs/operate/migration/upgrade-0.19/
Thomas
07:52 PM
I can't run the migration script since the service it targets is dead:
yakuhito@catstation:/tmp$ kubectl -n platform logs pod/my-release-signoz-query-service-0
2023-05-23T19:50:24.790Z    INFO    version/version.go:43    

SigNoz version   : v0.19.0
Commit SHA-1     : 6e8be3f
Commit timestamp : 2023-05-20T18:20:50Z
Branch           : HEAD
Go version       : go1.18.10

For SigNoz Official Documentation,  visit https://signoz.io/docs
For SigNoz Community Slack,         visit http://signoz.io/slack
For discussions about SigNoz,       visit https://community.signoz.io

Check SigNoz Github repo for license details.
Copyright 2022 SigNoz


2023-05-23T19:50:24.790Z    WARN    query-service/main.go:61    No JWT secret key is specified.
main.main
    /go/src/github.com/signoz/signoz/ee/query-service/main.go:61
runtime.main
    /usr/local/go/src/runtime/proc.go:250
2023-05-23T19:50:26.023Z    INFO    license/manager.go:127    No active license found, defaulting to basic plan
2023-05-23T19:50:26.025Z    INFO    app/server.go:117    Using ClickHouse as datastore ...
ts=2023-05-23T19:50:26.032Z caller=query_logger.go:113 level=error component=activeQueryTracker msg="Failed to create directory for logging active queries"
ts=2023-05-23T19:50:26.034Z caller=engine.go:349 level=debug component="query engine" msg="Lookback delta is zero, setting to default value" value=5m0s
ts=2023-05-23T19:50:26.035Z caller=reader.go:364 level=info msg="Loading configuration file" filename=/root/config/prometheus.yml
ts=2023-05-23T19:50:26.036Z caller=query_logger.go:113 level=error msg="Failed to create directory for logging active queries"
ts=2023-05-23T19:50:26.039Z caller=engine.go:349 level=debug component="promql evaluator" msg="Lookback delta is zero, setting to default value" value=5m0s
2023-05-23T19:50:26.052Z    INFO    alertManager/notifier.go:94    Starting notifier with alert manager:[]
2023-05-23T19:50:26.052Z    INFO    app/server.go:587    rules manager is ready
ts=2023-05-23T19:50:26.053Z caller=reader.go:381 level=info msg="Completed loading of configuration file" filename=/root/config/prometheus.yml
2023-05-23T19:50:26.075Z    DEBUG    rules/apiParams.go:86    postable rule(parsed):%!(EXTRA *rules.PostableRule=&{Exception EXCEPTIONS_BASED_ALERT  threshold_rule 300000000000 0 {"op":"1","target":5,"matchType":"4"} map[details:https://monitoring.fireacademy.io/exceptions severity:warning] map[description:This alert is fired when the defined metric (current value: {{$value}}) crosses the threshold ({{$threshold}}) summary:The rule threshold is set to {{$threshold}}, and the observed metric value is {{$value}}] false https://monitoring.fireacademy.io/alerts/new []  })
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x28 pc=0x1046d20]

goroutine 1 [running]:
, 0x0}, {0x0, 0x0}, {0x0, 0x0}, {0x0, 0x0}, 0x0, 0x0, ...}, ...)
    /go/src/github.com/signoz/signoz/pkg/query-service/rules/apiParams.go:114 +0x4e0

    /go/src/github.com/signoz/signoz/pkg/query-service/rules/apiParams.go:63
?, 0xc000209100?, 0x6f1?})
    /go/src/github.com/signoz/signoz/pkg/query-service/rules/apiParams.go:59 +0x65

    /go/src/github.com/signoz/signoz/pkg/query-service/rules/manager.go:156 +0x16a

    /go/src/github.com/signoz/signoz/pkg/query-service/rules/manager.go:130 +0x25

    /go/src/github.com/signoz/signoz/ee/query-service/app/server.go:454 +0x93
main.main()
    /go/src/github.com/signoz/signoz/ee/query-service/main.go:71 +0x574
Prashant
07:58 PM
Srikanth can you please look into this?
Prashant
07:59 PM
Should the migration script have been executed prior to the upgrade for K8s?
Thomas
08:17 PM
The link you provided suggests the order is upgrade -> migration: https://signoz.io/docs/operate/migration/upgrade-0.19/#first-upgrade-to-v019
Prashant
11:18 PM
Thomas This is addressed by this PR: https://github.com/SigNoz/charts/pull/233
Prashant
11:29 PM
You can either wait for the patch release after the PR above gets merged.
Prashant
11:31 PM
Or you can clone and upgrade locally from the specific git branch.
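A rough sketch of what the clone-and-upgrade route might look like. The thread does not name the branch, so this uses GitHub's `pull/<number>/head` ref for PR 233 instead. Several values are assumptions: the release name `my-release` is inferred from the pod name, the namespace `platform` comes from the `kubectl` command above, and `charts/signoz` is assumed to be the chart's path inside the repository. The helper `upgrade_commands` is a hypothetical name; it prints the commands so they can be checked before anything touches the cluster.

```shell
#!/bin/sh
# Assumptions, not confirmed in the thread: release name, namespace, and the
# chart path charts/signoz inside the SigNoz/charts repository.
RELEASE="${RELEASE:-my-release}"
NAMESPACE="${NAMESPACE:-platform}"
PR_NUMBER="${PR_NUMBER:-233}"

# Hypothetical helper: prints the steps instead of running them.
upgrade_commands() {
  echo "git clone https://github.com/SigNoz/charts.git"
  echo "cd charts"
  # GitHub exposes every pull request as refs/pull/<number>/head
  echo "git fetch origin pull/$PR_NUMBER/head:pr-$PR_NUMBER"
  echo "git checkout pr-$PR_NUMBER"
  # Fetch the chart's dependencies before installing from the local path
  echo "helm dependency update charts/signoz"
  # Upgrade the existing release from the local chart directory
  echo "helm -n $NAMESPACE upgrade $RELEASE charts/signoz"
}

upgrade_commands
```

If the release was installed with custom values, the same values file would normally be passed to `helm upgrade` with `-f`. Once the PR is merged and a patch release is published, upgrading from the regular chart repository is the simpler path.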
Thomas
11:42 PM
Ok, thank you!
May 24, 2023 (4 months ago)
Prashant
10:41 AM
Thomas Do let us know if the issue has been resolved.