"In the Land of the Sizing, the One-Partition Kafka Topic is King" by Ricardo Ferreira

  • Added 6 Sep 2024
  • Every technology has a key concept that people struggle to understand. With databases, it is which join to use when fetching data from multiple tables. With containers, it is which storage type to use for a given workload. With Apache Kafka, the absolute winner is how many partitions to set for a topic. Sizing the number of partitions correctly is a hot topic for Kafka practitioners, since getting it wrong affects other aspects of the system, such as consistency, concurrency, and durability. Worse, it also affects how much load Kafka can handle. This talk will peel back the concept of partitions and, using a what-if presentation style, explain what happens to the other aspects of the system for a given number of partitions. It will highlight the consequences of poor decisions and whether you will be able to recover from them.
    Ricardo Ferreira
    Senior Developer Advocate
    @riferrei
    Ricardo is a Senior Developer Advocate at AWS, working in the developer relations team for North America. With 20+ years of experience, he may have learned a thing or two about distributed systems, messaging, fast data analytics, databases, and observability. Before AWS, he worked for software vendors like Elastic, Confluent, and Oracle. Ricardo is well known for his remarkable ability to explain complex topics. He cunningly breaks them down into bite-sized pieces until anyone can understand. When not working, he loves barbecuing in his backyard with his family and friends, where he finally gets the chance to talk about anything unrelated to computers. He currently lives in North Carolina, USA, with his wife and son. Follow Ricardo on Twitter: @riferrei

Comments • 7

  • @carlosbenicio1244 1 year ago +4

    Wow, this is by far the best presentation I've seen about Kafka partitions. Very well explained.

  • @espetinhodekafka 1 year ago +1

    Really Nice!

  • @GrigorySapunov 1 year ago +1

    Great talk, thank you Ricardo!

  • @jackgenewtf 1 year ago +1

    Can someone help me understand the maths that led to the information at 9:32? By my calculations: 250 MB/hr -> 1 GB (or 1 segment) every 4 hours -> 16 GB (or 16 segments) in 64 hours.
    At 2 file handles per segment, I count 32 file handles, not 1024?

    • @riferrei 1 year ago +1

      That was a great catch, Jack Leow. 👏🏻 That quiz got messed up because of a mistake in the previous slide. Where it says "writes data every hour" it should say "writes data every minute."
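
The arithmetic in this thread can be checked with a short sketch. It assumes, as the question does, that a 1 GB segment holds 1000 MB and that each segment costs 2 file handles. The function name `file_handles` and its parameters are hypothetical, chosen only for illustration. It does not try to reproduce the slide's figure of 1024, because the slide's full inputs are not quoted in this thread.

```python
def file_handles(mb_per_hour, hours, segment_mb=1000, handles_per_segment=2):
    """Count file handles held by the full log segments written in `hours`.

    Assumptions (taken from the question, not from the slide itself):
    one segment is 1 GB counted as 1000 MB, and each segment needs 2 handles.
    """
    total_mb = mb_per_hour * hours
    full_segments = total_mb // segment_mb
    return full_segments * handles_per_segment


# Slide as originally written: 250 MB every hour, over 64 hours.
hourly = file_handles(250, 64)

# Corrected reading from the reply: 250 MB every minute, over 64 hours.
per_minute = file_handles(250 * 60, 64)

print(hourly, per_minute)
```

Under these assumptions the hourly reading gives 32 file handles, which matches the count in the question, while the per-minute reading gives a count 60 times larger.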

  • @IDisposable 1 year ago +1

    For the slide at 14:09 czcams.com/video/fMISi0mJ51g/video.html the class is actually CooperativeStickyAssignor (the slide is missing an o)

    • @riferrei 1 year ago

      Thank you Marc. I will get the slides fixed. 👍🏻