#support

Debugging JSON Parsing and Logging Issue with Otel-Collector

TLDR: Hannes was struggling to debug JSON log conversion and attribute parsing in the otel-collector. nitya-signoz provided guidance and processor updates that eventually resolved the issue.

Jul 21, 2023
Hannes
10:57 AM
I'm having a hard time trying to figure out how to debug an issue with logs. I'm trying to convert our logging output to JSON and then using a json_parser in the otel-collector to convert it into attributes. An example from the log file is:
{"severity_text":"INFO","time":"2023-07-21T12:46:34.541+02:00","message":"Completed 200 OK in 49ms (Views: 40.5ms | ActiveRecord: 5.0ms | Allocations: 124929)\n\n","operation":"Admin::DepartmentsController#index","span_id":"628ed9fd4fe1ecb0","trace_id":"67b2a6b62c7f983c3801d14923c017ac"}

In SigNoz it shows the body as:

```
{
  "severity_text": "INFO",
  "time": "2023-07-21T12:46:34.541+02:00",
  "message": "Completed 200 OK in 49ms (Views: 40.5ms | ActiveRecord: 5.0ms | Allocations: 124929)",
  "operation": "Admin::DepartmentsController#index",
  "span_id": "628ed9fd4fe1ecb0",
  "trace_id": "67b2a6b62c7f983c3801d14923c017ac"
}
```

but none of the attributes are parsed out:
[Image: the log entry in SigNoz, showing the JSON body with no attributes parsed out]
Hannes
10:58 AM
I have the following processors defined:
```
processors:
  logstransform/internal:
    operators:
      - type: json_parser
      - type: trace_parser
        if: '"trace_id" in attributes or "span_id" in attributes'
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
        output: remove_trace_id
      - type: trace_parser
        if: '"traceId" in attributes or "spanId" in attributes'
        trace_id:
          parse_from: attributes.traceId
        span_id:
          parse_from: attributes.spanId
        output: remove_traceId
      - id: remove_traceId
        type: remove
        if: '"traceId" in attributes'
        field: attributes.traceId
        output: remove_spanId
      - id: remove_spanId
        type: remove
        if: '"spanId" in attributes'
        field: attributes.spanId
      - id: remove_trace_id
        type: remove
        if: '"trace_id" in attributes'
        field: attributes.trace_id
        output: remove_span_id
      - id: remove_span_id
        type: remove
        if: '"span_id" in attributes'
        field: attributes.span_id
```

Hannes
10:58 AM
I'd love to know how I can debug this to understand if something is failing with the parsing
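A generic way to see exactly what the collector emits is the logging exporter (a sketch, not something suggested in this thread; the config shared later below already carries a commented-out `# logging: {}` entry). Adding it alongside the real exporter prints every exported log record, with its parsed attributes, to the collector's own stdout, which makes it easy to see whether an operator ran:

```
exporters:
  logging:
    verbosity: detailed

service:
  pipelines:
    logs:
      receivers: [otlp, syslog]
      processors: [logstransform/internal, batch]
      exporters: [clickhouselogsexporter, logging]
```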
Ankit
11:13 AM
nitya-signoz please check this out
nitya-signoz
12:07 PM
Are you using the processor in your logs pipeline? Can you share the entire otel-collector config?
Hannes
12:08 PM
sure
Hannes
12:09 PM
```
receivers:
  syslog:
    tcp:
      listen_address: "0.0.0.0:54527"
    protocol: rfc3164
    location: UTC
  filelog/dockercontainers:
    include: [  "/var/lib/docker/containers/*/*.log" ]
    start_at: end
    include_file_path: true
    include_file_name: false
    operators:
    - type: json_parser
      id: parser-docker
      output: extract_metadata_from_filepath
      timestamp:
        parse_from: attributes.time
        layout: '%Y-%m-%dT%H:%M:%S.%LZ'
    - type: regex_parser
      id: extract_metadata_from_filepath
      regex: '^.*containers/(?P<container_id>[^_]+)/.*log$'
      parse_from: attributes["log.file.path"]
      output: parse_body
    - type: move
      id: parse_body
      from: attributes.log
      to: body
      output: time
    - type: remove
      id: time
      field: attributes.time
  opencensus:
    endpoint: 0.0.0.0:55678
  otlp/spanmetrics:
    protocols:
      grpc:
        endpoint: localhost:12345
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
      http:
        endpoint: 0.0.0.0:4318
  jaeger:
    protocols:
      grpc:
        endpoint: 0.0.0.0:14250
      thrift_http:
        endpoint: 0.0.0.0:14268
      # thrift_compact:
      #   endpoint: 0.0.0.0:6831
      # thrift_binary:
      #   endpoint: 0.0.0.0:6832
  hostmetrics:
    collection_interval: 30s
    scrapers:
      cpu: {}
      load: {}
      memory: {}
      disk: {}
      filesystem: {}
      network: {}
  prometheus:
    config:
      global:
        scrape_interval: 60s
      scrape_configs:
        # otel-collector internal metrics
        - job_name: otel-collector
          static_configs:
          - targets:
              - localhost:8888
            labels:
              job_name: otel-collector


processors:
  logstransform/internal:
    operators:
      - type: json_parser
      - type: trace_parser
        if: '"trace_id" in attributes or "span_id" in attributes'
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
        output: remove_trace_id
      - type: trace_parser
        if: '"traceId" in attributes or "spanId" in attributes'
        trace_id:
          parse_from: attributes.traceId
        span_id:
          parse_from: attributes.spanId
        output: remove_traceId
      - id: remove_traceId
        type: remove
        if: '"traceId" in attributes'
        field: attributes.traceId
        output: remove_spanId
      - id: remove_spanId
        type: remove
        if: '"spanId" in attributes'
        field: attributes.spanId
      - id: remove_trace_id
        type: remove
        if: '"trace_id" in attributes'
        field: attributes.trace_id
        output: remove_span_id
      - id: remove_span_id
        type: remove
        if: '"span_id" in attributes'
        field: attributes.span_id
  batch:
    send_batch_size: 10000
    send_batch_max_size: 11000
    timeout: 10s
  signozspanmetrics/prometheus:
    metrics_exporter: prometheus
    latency_histogram_buckets: [100us, 1ms, 2ms, 6ms, 10ms, 50ms, 100ms, 250ms, 500ms, 1000ms, 1400ms, 2000ms, 5s, 10s, 20s, 40s, 60s ]
    dimensions_cache_size: 100000
    dimensions:
      - name: service.namespace
        default: default
      - name: deployment.environment
        default: default
      # This is added to ensure the uniqueness of the timeseries
      # Otherwise, identical timeseries produced by multiple replicas of
      # collectors result in incorrect APM metrics
      - name: 'signoz.collector.id'
  # memory_limiter:
  #   # 80% of maximum memory up to 2G
  #   limit_mib: 1500
  #   # 25% of limit up to 2G
  #   spike_limit_mib: 512
  #   check_interval: 5s
  #
  #   # 50% of the maximum memory
  #   limit_percentage: 50
  #   # 20% of max memory usage spike expected
  #   spike_limit_percentage: 20
  # queued_retry:
  #   num_workers: 4
  #   queue_size: 100
  #   retry_on_failure: true
  resourcedetection:
    # Using OTEL_RESOURCE_ATTRIBUTES envvar, env detector adds custom labels.
    detectors: [env, system] # include ec2 for AWS, gcp for GCP and azure for Azure.
    timeout: 2s

extensions:
  health_check:
    endpoint: 0.0.0.0:13133
  zpages:
    endpoint: 0.0.0.0:55679
  pprof:
    endpoint: 0.0.0.0:1777

exporters:
  clickhousetraces:
    datasource: 
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    low_cardinal_exception_grouping: ${LOW_CARDINAL_EXCEPTION_GROUPING}
  clickhousemetricswrite:
    endpoint: 
    resource_to_telemetry_conversion:
      enabled: true
  clickhousemetricswrite/prometheus:
    endpoint: 
  prometheus:
    endpoint: 0.0.0.0:8889
  # logging: {}

  clickhouselogsexporter:
    dsn: 
    docker_multi_node_cluster: ${DOCKER_MULTI_NODE_CLUSTER}
    timeout: 5s
    sending_queue:
      queue_size: 100
    retry_on_failure:
      enabled: true
      initial_interval: 5s
      max_interval: 30s
      max_elapsed_time: 300s

service:
  telemetry:
    metrics:
      address: 0.0.0.0:8888
  extensions:
    - health_check
    - zpages
    - pprof
  pipelines:
    traces:
      receivers: [jaeger, otlp]
      processors: [signozspanmetrics/prometheus, batch]
      exporters: [clickhousetraces]
    metrics:
      receivers: [otlp]
      processors: [batch]
      exporters: [clickhousemetricswrite]
    metrics/generic:
      receivers: [hostmetrics]
      processors: [resourcedetection, batch]
      exporters: [clickhousemetricswrite]
    metrics/prometheus:
      receivers: [prometheus]
      processors: [batch]
      exporters: [clickhousemetricswrite/prometheus]
    metrics/spanmetrics:
      receivers: [otlp/spanmetrics]
      exporters: [prometheus]
    logs:
      receivers: [otlp, syslog]
      processors: [logstransform/internal, batch]
      exporters: [clickhouselogsexporter]
```
Hannes
12:09 PM
I don't think that's the whole file
Hannes
12:09 PM
it is, my bad 🙂
nitya-signoz
12:11 PM
If I'm not wrong, you are ingesting these logs through the syslog receiver, right?
Hannes
12:12 PM
yeah, I'm happy to change that though. I was just trying to find the simplest way to make it work. I have a Rails app, which is easy to log to the system log folders, and rsyslog is already available to use
Hannes
12:12 PM
it just felt like the easiest way 🙂 but I'm happy to do it differently
nitya-signoz
12:21 PM
Hi, please add this processor:

```
logstransform/json_parser:
    operators:
      - type: json_parser
        parse_from: body
        parse_to: attributes
        if: "body matches '^{.*}$'"
```

and add it to the pipelines:

```
      processors: [logstransform/json_parser, logstransform/internal, batch]
```
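The `if` guard here uses the collector's expression language: `body matches '^{.*}$'` is a regex check, so the json_parser only runs on bodies that look like a single JSON object and plain-text lines pass through untouched. A small sketch of the same pattern, with illustrative bodies as comments:

```
operators:
  - type: json_parser
    parse_from: body
    parse_to: attributes
    # runs for a body like:    {"severity_text":"INFO","message":"ok"}
    # skipped for a body like: Jul 21 12:46:34 myhost rails[123]: plain text line
    if: "body matches '^{.*}$'"
```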

nitya-signoz
12:22 PM
The execution of the internal processor's operators is chained, so the json_parser there was not being executed
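For context, a sketch of how that chaining works (operator ids here are illustrative, not from the thread): each operator passes its entry to the next one in the list unless an explicit `output` redirects the flow to a named operator, so an operator that is never the target of the flow never runs.

```
operators:
  - type: trace_parser
    output: remove_trace_id   # jumps straight to the operator with id remove_trace_id
  - type: json_parser         # never receives the entry, so it never runs
  - id: remove_trace_id
    type: remove
    field: attributes.trace_id
```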
Hannes
12:32 PM
I get a different result now, still not quite what I would've hoped for, but it might just be progress
Hannes
12:32 PM
[Image: the log entry in SigNoz, now with fields parsed under attributes_string]
Hannes
12:32 PM
Why are things like message/operation now going into the attributes_string key?
nitya-signoz
12:33 PM
trace_id should be parsed.
nitya-signoz
12:33 PM
Is logstransform/internal still present in the pipeline?
Hannes
12:34 PM
```
 processors:
+  logstransform/json_parser:
+    operators:
+      - type: json_parser
+        parse_from: body
+        parse_to: attributes
+        if: "body matches '^{.*}$'"
   logstransform/internal:
     operators:
+      - type: json_parser
       - type: trace_parser
         if: '"trace_id" in attributes or "span_id" in attributes'
         trace_id:
@@ -205,6 +217,6 @@ service:
       receivers: [otlp/spanmetrics]
       exporters: [prometheus]
     logs:
-      receivers: [otlp, filelog/dockercontainers]
-      processors: [logstransform/internal, batch]
-      exporters: [clickhouselogsexporter]
\ No newline at end of file
+      receivers: [otlp, syslog]
+      processors: [logstransform/json_parser, logstransform/internal, batch]
+      exporters: [clickhouselogsexporter]
```
Hannes
12:34 PM
that's the diff atm
nitya-signoz
12:34 PM
> Why are things like message/operation now going into the attributes_string key?
It just indicates that it's an attribute. You can query it directly by the key.
Hannes
12:34 PM
kewl kewl
nitya-signoz
12:35 PM
For the sample log you gave me, everything is getting parsed as expected. Can you check if you see any kind of error in the otel-collector logs?
nitya-signoz
12:40 PM
Able to reproduce it, give me a few mins, will get back
nitya-signoz
12:44 PM
Please update both of your processors to this:

```
logstransform/json_parser:
    operators:
      - type: json_parser
        parse_from: body
        parse_to: attributes
        if: "body matches '^{.*}$'"
      - type: severity_parser
        if: "'severity_text' in attributes"
        parse_from: attributes.severity_text
        output: remove_severity_text
      - type: remove
        id: remove_severity_text
        if: "'severity_text' in attributes"
        field: attributes.severity_text
logstransform/internal:
    operators:
      - type: trace_parser
        if: '"trace_id" in attributes or "span_id" in attributes'
        trace_id:
          parse_from: attributes.trace_id
        span_id:
          parse_from: attributes.span_id
        output: remove_trace_id
      - id: remove_trace_id
        type: remove
        if: '"trace_id" in attributes'
        field: attributes.trace_id
        output: remove_span_id
      - id: remove_span_id
        type: remove
        if: '"span_id" in attributes'
        field: attributes.span_id
```

And let me know if it works
Hannes
12:45 PM
{"log":"2023-07-21T12:36:12.142Z\u0009error\[email protected]/processor.go:144\u0009processor encountered an issue with the pipeline\u0009{\"kind\": \"processor\", \"name\": \"logstransform/jsonparser\", \"pipeline\": \"logs\", \"error\": \"Unmarshal: there are bytes left after unmarshal, error found in #10 byte of ...|a532fc0\\\"}{\\\"severity|..., bigger context ...|5\\\",\\\"trace_id\\\":\\\"c55dbf84015b06af9aae5d2faa532fc0\\\"}{\\\"severity_text\\\":\\\"DEBUG\\\",\\\"time\\\":\\\"2023-07-21T12:01:4|...\"}\n","stream":"stderr","time":"2023-07-21T12:36:12.143530708Z"}
Hannes
12:45 PM
found that
Hannes
12:45 PM
will update them now
Hannes
12:48 PM
now I get trace ids and span ids 🙂
nitya-signoz
12:48 PM
This error is different; there are some logs which the json parser is failing to parse. They will be ingested directly without parsing. We need to find them and then check what's wrong with the format.
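For reference, the "bytes left after unmarshal" error quoted above is what json_parser reports when a body holds more than one JSON document, i.e. two log entries fused together without a newline, as in this minimal illustration (invented example, matching the shape visible in the error's context):

```
{"message":"first entry"}{"message":"second entry"}
```

The parser expects the body to be exactly one JSON object, so fused lines like this fail the parse and are ingested as-is.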
nitya-signoz
12:49 PM
Great.
Hannes
12:49 PM
kewl kewl
Hannes
12:49 PM
[Image: logs in SigNoz with trace ids and span ids parsed]
nitya-signoz
12:49 PM
Awesome
Hannes
12:49 PM
Thank you very much for your help nitya-signoz
nitya-signoz
12:49 PM
Hannes, are you using our new explorer page or the old one?
Hannes
12:50 PM
it's still the old one 🙂 I'll try the new one out now that it's working
nitya-signoz
12:52 PM
Sure. Since some duplicate attributes might have been created, please run these SQL statements on ClickHouse so that you don't face any issues:

```
docker exec -it clickhouse /bin/bash

clickhouse client

truncate table logs_attribute_keys;

truncate table tag_attributes;
```

We will be working on making this possible through the UI, but for now please run the above commands
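Equivalently, the two statements can be run without an interactive shell, assuming the container is named clickhouse as above (a sketch; clickhouse-client is the same binary invoked as `clickhouse client` in the interactive steps):

```
docker exec clickhouse clickhouse-client -q "truncate table logs_attribute_keys"
docker exec clickhouse clickhouse-client -q "truncate table tag_attributes"
```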
Hannes
02:14 PM
will do, thanks nitya-signoz
