How to use ClickHouse to store event data for 100M+ users aka the PostHog Event Mansion

  • Added on 6 Aug 2024
  • Join the PostHog engineering team to find out what makes ClickHouse so great, what makes it just as painful, and how we turned ClickHouse into the event mansion for our open source product analytics platform.
    Follow our journey online:
    Twitter: / posthog
    LinkedIn: / posthog
    Website: posthog.com
    GitHub: github.com/posthog
  • Science & Technology

Comments • 7

  • @kobykilimnik4442 1 year ago +1

    To get results without duplicates from the ReplacingMergeTree table, you need to append FINAL to the SELECT, not only rely on the OPTIMIZE call.
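
The point in the comment above can be illustrated with a small Python model. This is only a sketch of the semantics, not ClickHouse code: the `Row` type, the `parts` list, and both helper functions are hypothetical names made up for the illustration. It assumes a ReplacingMergeTree-style table where rows sharing a sorting key are collapsed to the highest version, and shows why a plain read can still return duplicates from parts that have not been merged yet, while a FINAL-style read collapses them at query time.

```python
from typing import NamedTuple


class Row(NamedTuple):
    key: str      # sorting key that deduplication is based on
    version: int  # version column; the highest version wins
    value: str


def select_plain(parts):
    """Model of a plain SELECT: every row from every part, duplicates included."""
    return [row for part in parts for row in part]


def select_final(parts):
    """Model of SELECT ... FINAL: keep only the latest row per key at read time."""
    latest = {}
    for row in select_plain(parts):
        current = latest.get(row.key)
        # On equal versions the later-inserted row replaces the earlier one.
        if current is None or row.version >= current.version:
            latest[row.key] = row
    return sorted(latest.values())


# Two unmerged parts: "user-1" was written twice with different versions.
parts = [
    [Row("user-1", 1, "free"), Row("user-2", 1, "free")],
    [Row("user-1", 2, "paid")],
]

plain = select_plain(parts)
final = select_final(parts)
print(len(plain), "rows without FINAL;", len(final), "rows with FINAL")
```

An OPTIMIZE call only helps for the parts that exist when it runs; any insert that arrives afterwards creates a new unmerged part, which is why the query-time modifier is still needed for guaranteed deduplicated reads.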

  • @dn5426 2 years ago +2

    Wish we could over-provision our ClickHouse setup like that and not worry about query performance or OOMs ._.

  • @mayankpant5471 3 years ago +6

    Really nice video, covering most ClickHouse features in one place. I have a question: how do you handle schema evolution of an event, or a new event being introduced into the system? Do you flatten the events and store them all in one big table, or store each event type in a separate table with all of its attributes flattened?
    This presentation will also come in handy for quickly going through ClickHouse features rather than reading the whole documentation. Can you please attach the link to the presentation too? 😌

    • @stefankoning 3 years ago +1

      Interesting questions indeed. @PostHog, any comments? And could you share the presentation?

    • @JamesGreenhillFalcon 2 years ago

      Thank you! We had a lot of fun putting it together. Schema evolution is an interesting problem for which we haven't settled on a final solution yet. Currently all events go into one large events table, and we pull values out of the properties JSON into materialized columns based on query patterns. This actually works super well. We have some other optimizations in the pipeline for making things even faster and avoiding joins altogether in some instances.
      Slides here: docs.google.com/presentation/d/1i4QSK4DWHhccrIds-XLvgTvFY3Mu_Eiwg_XVOhYuL3E/edit?usp=sharing
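
The approach described in the reply above (one wide events table, with frequently queried values pulled out of the properties JSON into their own columns) can be sketched in Python. This is a minimal illustration under stated assumptions, not PostHog's implementation: `MATERIALIZED_KEYS`, the `mat_` column prefix, and the sample event are hypothetical. In ClickHouse itself this would typically be a MATERIALIZED column computed by a JSON extraction function such as `JSONExtractString`; the exact DDL is not given in the source.

```python
import json

# Hypothetical set of "hot" property keys, chosen from observed query patterns.
MATERIALIZED_KEYS = ["$browser", "plan"]


def column_name(key):
    """Derive a column name from a property key (the naming scheme is an assumption)."""
    return "mat_" + key.lstrip("$")


def materialize(event):
    """Return the event row with hot properties copied into dedicated columns.

    The raw JSON string is kept as-is, so properties that were not
    materialized remain queryable (more slowly) through JSON parsing.
    """
    props = json.loads(event["properties"])
    row = dict(event)
    for key in MATERIALIZED_KEYS:
        value = props.get(key)
        # A missing property becomes an empty string rather than an error.
        row[column_name(key)] = "" if value is None else str(value)
    return row


event = {
    "event": "pageview",
    "distinct_id": "u1",
    "properties": json.dumps({"$browser": "Firefox", "plan": "paid", "referrer": "news"}),
}
row = materialize(event)
print(row["mat_browser"], row["mat_plan"])
```

The design trade-off is that new event types or properties need no schema change at all, while a filter on a materialized column can read one narrow column instead of parsing the JSON blob for every row.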

  • @muhammadsaefulramdan2691 10 months ago

    Why does the size of the ClickHouse Docker volume keep increasing?

  • @anuraggautam77 1 year ago

    I hope this message finds you well. My name is Anurag Gautam, and I am reaching out to you for some guidance regarding PostHog integration with our own hosted ClickHouse database.
    As we continue to explore the capabilities of PostHog for our organization, we have encountered some challenges with integrating it with our own hosted ClickHouse database. I am reaching out to seek your expertise and guidance on this matter.
    If possible, I would greatly appreciate it if you could direct me to any tutorial articles or code snippets that could help us integrate PostHog with our own hosted ClickHouse database. Any assistance or guidance you could provide would be greatly appreciated.
    Thank you for your time and consideration. I look forward to hearing back from you soon.