Diving into Kafka Internals with David Jacot
- Published: 22. 06. 2024
- In this video I talk to David Jacot, who works as a Staff Software Engineer at @Confluent and has been a long-time Kafka user, committer, and PMC member. We covered how Kafka works internally, in great depth.
We use Kafka for various use cases and it works great, but going one level below the abstraction to truly understand the protocols, techniques, and algorithms involved is a fun ride.
Chapters:
00:00 Kafka Internals with David Jacot
03:33 Defining Kafka
05:16 Kafka Architecture(s)
11:39 Write Path - Producer sending data
18:35 How does replication work?
25:47 How do we track replication progress?
30:42 Failure Modes: Leader fails
38:18 Consumers: Push vs Pull
40:54 Consumers: How does fetch work?
49:03 Consuming number of bytes vs records
50:50 Optimising consumption
01:00:21 Offset management and choosing partitions
01:09:10 Ending notes
I hope you like this episode and, more importantly, that you learnt some of the amazing techniques Kafka uses to ensure durability, low latency, simplicity, and scalability in its architecture.
Do give this episode a like and share it with your network. Also, please subscribe to the channel for more content like this.
Other playlists:
Realtime streaming systems: • Realtime Streaming Sys...
Software Engineering: • Software Engineering
Distributed systems and databases: • Distributed Systems an...
Modern databases: • Modern Databases
Other episodes:
KsqlDB: • KsqlDb with Matthias Sax
Exactly once semantics: • Kafka - Exactly once s...
David's Linkedin: / davidjacot
our website: www.geeknarrator.com
Cheers,
The GeekNarrator
Great discussion. Pleasant to see how David answers each question in detail and with a lot of patience. And, as always, Kaivalya asks and goes deep into the internals step by step. Apart from the technical aspects, there is great learning here about how to discuss things in depth.
Thanks a lot Rajesh 🙏🏻
Great discussion. Really got to know the internals within such a short time! Please keep these deep dives coming!
Thanks 🙏🏻😊
Thanks for sharing this kind of discussion!
Hey, this was quite great. Loved the way you ask questions step by step. Would really love more videos on topics and internals like these. Thanks to both of you, things were quite clear!!
Thank you :) yes more videos coming up.
One of the rarest discussions on Kafka. Thanks to both of you for it.
Thank you Shyam 🙏🏻😀
Thanks!! Great discussion!! Nice subtitles!! Bring lots more discussions like this!
Thanks 🙏🏻
This was a great video, thanks a lot
thanks for running this; it would have been nice to dive deeper into one of the topics. A lot of what you covered sounded more like a user guide than a discussion of internals.
Thanks for the feedback. Yeah, it's always challenging. This was more of a “catch-all” type of episode. I have done some specific deep dives on exactly-once semantics and KsqlDB. Let me know if there are specific topics you are interested in.
This was a great video I must say!
Thanks 🙏🏻 😊
Thanks, Kaivalya, for podcasts like this
Thanks 🙏🏻😀
Awesome Content! Subscribed in a heartbeat, pun intended.
Thank you 🙏🏻😀
Gold content
Thank you 🙏🏻😊
Nice talk and a great initiative.
A question about the write path, where the broker writes to the page cache: does that mean that if the leader node fails, the data in the OS cache will be lost? And does that mean the storage devices being used should have power supplies that can at least let the kernel flush all its relevant caches to the devices?
Great discussion. One quick question: is the KRaft controller responsible for storing metadata, like ZooKeeper was, or is it just a component that implements a modified version of the Raft algorithm and is responsible for group coordination?
Thanks for the question. KRaft follows an event-based model, which means any state change is already replicated to the nodes, and so is the metadata. This avoids having to load metadata from something like ZooKeeper, because the metadata is already there. For example: when leadership changes, the new leader already has the committed metadata records, so there is no need to fetch them from ZooKeeper.
Long story short: yes, KRaft also takes care of the metadata, but it isn't so much a metadata store as a way to make metadata available through an event-based model. I hope this helps.
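The event-based model described above can be sketched roughly like this. This is a minimal, hypothetical illustration (the record shapes and names are made up, not Kafka's actual metadata record format): every node replays the same committed log records in order, so each one derives the full metadata state locally instead of fetching it from an external store.

```python
class MetadataImage:
    """In-memory metadata state, derived purely by replaying committed log records."""

    def __init__(self):
        self.partition_leaders = {}  # topic-partition name -> leader broker id

    def replay(self, record):
        # Each committed record describes exactly one state change.
        if record["type"] == "LeaderChange":
            self.partition_leaders[record["partition"]] = record["leader"]

# A replicated log of committed metadata records (hypothetical content).
# Every node applies these in the same order, so a newly elected leader
# already holds the current metadata without asking anyone.
log = [
    {"type": "LeaderChange", "partition": "orders-0", "leader": 1},
    {"type": "LeaderChange", "partition": "orders-0", "leader": 3},
]

image = MetadataImage()
for record in log:
    image.replay(record)

print(image.partition_leaders)
```

The key design point: the log itself is the source of truth, and the in-memory "image" is just a cache that any node can rebuild deterministically.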
@@TheGeekNarrator Thanks for the clarification.
Can you please publish the diagrams shown in this podcast? Also, do you create these diagrams using PlantUML?
Great discussion. Although, I don't know if it's just me, but the word-by-word highlighting in the subtitles is a little distracting. Without the highlighting it would be just as effective, that's what I feel.
Thanks for the feedback. It's a bit tricky, because some folks have asked me to add subtitles, as the accent can sometimes be a problem for some viewers. YouTube's auto-generated subtitles are not great, so I make the effort to add them myself. I can ask folks and see what can be done :)
@@TheGeekNarrator Thanks for replying. I liked the subtitles you added; I meant the highlighting of the actual word in the subtitle as you speak it.
But hats off to your effort of adding these subtitles explicitly; it surely seems like a lot of work. 👍