TLDR sarthak asks about configuring S3 data retention and whether the ClickHouse documentation is enough. vishal-signoz shares guide links and confirms that ClickHouse will query S3 data, that SigNoz handles everything, and that using EC2 with vertical scaling is fine for their use case.
You can set time-based retention, such as moving data to S3 after a set period, by following this doc:
Even on S3, the performance drop is around 30%, so you can query that data like normal data, just somewhat slower.
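For readers who want to see what a time-based move rule looks like at the ClickHouse level, here is a minimal sketch that only builds the SQL text for a `TTL ... TO VOLUME` rule. The table name, column name, and volume name are hypothetical placeholders, not SigNoz's actual schema, and the sketch assumes a storage policy with an S3-backed volume is already defined and attached to the table. In a SigNoz deployment the retention settings are expected to manage this for you.

```python
def build_move_ttl(table: str, time_column: str, days: int, volume: str) -> str:
    """Build an ALTER TABLE statement that moves parts older than `days` to `volume`.

    All names passed in are placeholders chosen by the caller; nothing here
    talks to a ClickHouse server.
    """
    if days <= 0:
        raise ValueError("days must be positive")
    return (
        f"ALTER TABLE {table} "
        f"MODIFY TTL toDateTime({time_column}) + INTERVAL {days} DAY "
        f"TO VOLUME '{volume}'"
    )


# Hypothetical table, column, and volume names; replace with your own.
statement = build_move_ttl("my_db.my_table", "timestamp", 7, "s3")
print(statement)
```

The rule only relocates data parts to the other volume. The table keeps the same name and schema, which is why queries do not have to change.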
Thanks Vishal for the response. So you are saying that if I configure data to move to S3 after 7 days, ClickHouse will then automatically query the old data from S3? I also had a doubt about whether I would need to write some code or queries using the S3 table engine to get visuals from the data stored in S3.
> so you are saying that if I configure data to move to S3 after 7 days, ClickHouse will then automatically query the old data from S3

Yes
thanks bro for the clarification
> I also had a doubt about whether I would need to write some code or queries using the S3 table engine to get visuals from the data stored in S3

No, SigNoz handles everything for you
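If you want to confirm for yourself where the data sits after a move rule kicks in, one option is to look at ClickHouse's `system.parts` table, which records the disk each active part lives on. The sketch below only builds that query as a string. The database and table names are hypothetical placeholders, and running it requires a ClickHouse client of your choice.

```python
def build_disk_usage_query(database: str, table: str) -> str:
    """Build a query that summarizes active parts per disk for one table.

    The database and table names are supplied by the caller; this function
    does not connect to ClickHouse.
    """
    return (
        "SELECT disk_name, count() AS parts, sum(bytes_on_disk) AS bytes "
        "FROM system.parts "
        f"WHERE active AND database = '{database}' AND table = '{table}' "
        "GROUP BY disk_name"
    )


# Hypothetical names; replace with the database and table you want to inspect.
query = build_disk_usage_query("my_db", "my_table")
print(query)
```

Parts listed under the S3-backed disk are still read through an ordinary `SELECT` on the same table, so no S3 table engine query is involved.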
Actually, I want to deeply understand how ClickHouse works and its storage pipeline. That's why I'm exploring, so that I can handle whatever issues come up during the production experiment.
So according to you, the official ClickHouse documentation is enough, right?
Yes, you can follow this guide to connect to ClickHouse:
okay
One more thing I think I should discuss: I'm thinking of using EC2 over Elastic Load Balancing for replication, but keeping it vertically scalable rather than sharding, based on my estimate of the current scale I need to handle. Would that be a good option in terms of best practices?
yes it should be fine
Yeah, actually I'm deploying it to production as an experiment for application performance analysis.
So I was exploring some open source APMs, and I found this one suitable in terms of features, use case coverage, and technical architecture.
So I started reading more about observability, OpenTelemetry, and SigNoz in detail, so that I can set up the complete infrastructure in a scalable way and remove the dependency on SaaS-based APMs.
Now I understand how it works. I have deployed it on staging and am now trying to get a proper grip on all the components, including the best deployment strategy.
sarthak
Wed, 03 May 2023 09:46:04 UTC
Hi, is there a way, or some configuration tag in clickhouse-storage.xml, to write a rule that moves data to S3 based on a time period (e.g. 1 month) and empties the ClickHouse tables, so that they can keep ingesting data periodically without any space issues?