Using a Kafka Broker for Telemetry Data to Ensure Fault Tolerance

TLDR sarthak asks about using a Kafka broker with ClickHouse for fault tolerance. Srikanth and Ankit discuss scale and the feasibility of the data flow. sarthak also asks for a circuit-breaking mechanism, which Ankit explains is already in place.

sarthak
Wed, 17 May 2023 04:02:32 UTC

Hello everyone, is it recommended to use a Kafka broker as the receiver of telemetry data for export to ClickHouse, instead of standard gRPC, when we need to handle high scale and stay fault tolerant, so that data is not lost if storage/ClickHouse fails?

Srikanth
Thu, 18 May 2023 01:50:01 UTC

How much scale are we talking about? It might be overkill for regular users. Adding a queue alone doesn't guarantee that data loss is prevented, since the exporter will eventually drop the data when ClickHouse is not reachable.

Ankit
Thu, 18 May 2023 04:07:08 UTC

> It might be overkill for regular users.
Correct. I think the data flow would look like otel-collector => kafka => clickhouse, so we expect Kafka to handle bursts in traffic and downtime of ClickHouse.
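
A minimal sketch of the buffering idea in that flow, assuming a toy in-memory log in place of Kafka and a toy store in place of ClickHouse. `DurableLog`, `FlakyStore`, and their methods are hypothetical names for illustration only, not part of any Kafka, ClickHouse, or SigNoz API. The point it shows: the consumer advances its offset only after a successful write, so records produced during an outage are delivered once the store is back.

```python
class DurableLog:
    """Toy stand-in for a Kafka topic: append-only records plus a committed offset."""

    def __init__(self):
        self.records = []
        self.committed = 0  # index of the next record the consumer still has to write

    def produce(self, record):
        # The collector side only appends; it does not care whether the store is up.
        self.records.append(record)

    def consume(self, write, max_batch=100):
        # Read from the committed offset and advance it only if the write succeeds.
        batch = self.records[self.committed:self.committed + max_batch]
        if not batch:
            return 0
        try:
            write(batch)
        except ConnectionError:
            return 0  # offset unchanged, so the same batch is retried on the next poll
        self.committed += len(batch)
        return len(batch)


class FlakyStore:
    """Toy stand-in for ClickHouse that can be switched off."""

    def __init__(self):
        self.up = True
        self.rows = []

    def write(self, batch):
        if not self.up:
            raise ConnectionError("store unreachable")
        self.rows.extend(batch)


log = DurableLog()
store = FlakyStore()

store.up = False                      # simulate ClickHouse downtime
for i in range(5):
    log.produce(i)                    # telemetry keeps arriving during the outage
during_outage = log.consume(store.write)

store.up = True                       # store recovers
after_recovery = log.consume(store.write)

print(during_outage, after_recovery, store.rows)  # 0 5 [0, 1, 2, 3, 4]
```

In this toy model nothing is lost because the log is unbounded. A real broker keeps data only within its configured retention, so an outage longer than that would still lose data.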

Srikanth
Thu, 18 May 2023 04:55:13 UTC

They mentioned they want to use Kafka as the receiver, in place of the gRPC OTLP receiver, and then export the data to ClickHouse.

sarthak
Thu, 18 May 2023 08:52:04 UTC

OK, so is there a way to implement a circuit-breaking mechanism at the microservice level while keeping the transport mechanism as it is (gRPC), something that can be passed along with the other env variables, so that the source service does not go down when the SigNoz backend is completely down? The service keeps sending telemetry events to SigNoz continuously. I just ran into this while testing on my basic setup.

Ankit
Thu, 18 May 2023 14:05:20 UTC

The source service starts dropping telemetry data if SigNoz is down. This should not affect the application, other than needing more memory to hold a batch, after which it starts dropping data. The service will also log that it is unable to send data to SigNoz, but the application should work just fine.
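
The behaviour described above can be sketched as a bounded in-memory buffer, assuming a simplified model rather than the actual OpenTelemetry SDK code. `BoundedSpanQueue`, `backend_down`, and the queue size of 3 are hypothetical and chosen for illustration. The application thread only ever enqueues; once the buffer is full, new items are counted as dropped instead of blocking or raising.

```python
from collections import deque


class BoundedSpanQueue:
    """Toy model of a bounded telemetry buffer (not the real SDK class)."""

    def __init__(self, max_queue_size):
        self.max_queue_size = max_queue_size
        self.queue = deque()
        self.dropped = 0

    def enqueue(self, item):
        # Called on the application's hot path: never blocks, never raises.
        if len(self.queue) >= self.max_queue_size:
            self.dropped += 1  # buffer is full, so the new item is discarded
            return False
        self.queue.append(item)
        return True

    def flush(self, export):
        # Try to export everything buffered; on failure, log and keep the items.
        batch = list(self.queue)
        if not batch:
            return True
        try:
            export(batch)
        except ConnectionError as exc:
            print(f"export failed, keeping {len(batch)} items buffered: {exc}")
            return False
        self.queue.clear()
        return True


def backend_down(batch):
    # Hypothetical exporter that behaves as if the backend were unreachable.
    raise ConnectionError("backend unreachable")


q = BoundedSpanQueue(max_queue_size=3)
for i in range(5):
    q.enqueue(f"span-{i}")            # the application keeps running regardless
ok = q.flush(backend_down)

print(len(q.queue), q.dropped, ok)    # 3 2 False
```

The memory cost is capped by the queue size. In OpenTelemetry SDKs that follow the specification, the comparable limit for the batch span processor is the `OTEL_BSP_MAX_QUEUE_SIZE` environment variable; whether and how a given SDK retains a failed batch differs from this sketch.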