"Turning the database inside out with Apache Samza" by Martin Kleppmann
- Added 20 Sep 2014
- Databases are global, shared, mutable state. That's the way it has been since the 1960s, and no amount of NoSQL has changed that. However, most self-respecting developers have got rid of mutable global variables in their code long ago. So why do we tolerate databases as they are?
A more promising model, used in some systems, is to think of a database as an always-growing collection of immutable facts. You can query it at some point in time - but that's still old, imperative style thinking. A more fruitful approach is to take the streams of facts as they come in, and functionally process them in real-time.
This talk introduces Apache Samza, a distributed stream processing framework developed at LinkedIn. At first it looks like yet another tool for computing real-time analytics, but it's more than that. Really it's a surreptitious attempt to take the database architecture we know, and turn it inside out.
At its core is a distributed, durable commit log, implemented by Apache Kafka. Layered on top are simple but powerful tools for joining streams and managing large amounts of data reliably.
What do we have to gain from turning the database inside out? Simpler code, better scalability, better robustness, lower latency, and more flexibility for doing interesting things with data. After this talk, you'll see the architecture of your own applications in a completely new light.
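As a rough illustration of the "always-growing collection of immutable facts" idea described above, here is a minimal Python sketch. It is not Samza or Kafka code: the `Fact` shape, the follow/unfollow events, and the function names are all hypothetical, and an in-memory list stands in for the durable log. The point is only that the queryable state is a pure fold over the log rather than something mutated in place.

```python
from dataclasses import dataclass
from functools import reduce
from typing import Dict, FrozenSet, Iterable


@dataclass(frozen=True)
class Fact:
    """One immutable event in the log (hypothetical shape, for illustration)."""
    user: str
    action: str  # "follow" or "unfollow"
    target: str


View = Dict[str, FrozenSet[str]]


def apply_fact(view: View, fact: Fact) -> View:
    """Return a new view with one fact folded in; the input view is not mutated."""
    current = view.get(fact.user, frozenset())
    if fact.action == "follow":
        updated = current | {fact.target}
    elif fact.action == "unfollow":
        updated = current - {fact.target}
    else:
        return view  # unknown facts are ignored in this sketch
    return {**view, fact.user: updated}


def materialize(log: Iterable[Fact]) -> View:
    """Derive the read-side view by folding over the log from the beginning."""
    return reduce(apply_fact, log, {})


# An append-only list stands in for the durable commit log (assumption).
log = [
    Fact("alice", "follow", "bob"),
    Fact("alice", "follow", "carol"),
    Fact("alice", "unfollow", "bob"),
]
view = materialize(log)
```

Because the view is derived, it can be discarded and rebuilt from the log at any time, and replaying only a prefix of the log gives the state as of that earlier point.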
Speaker: Martin Kleppmann @martinkl
Martin is a committer on Apache Samza (a distributed stream processing framework), a software engineer at LinkedIn, and an author at O'Reilly (currently writing a book on designing data-intensive applications). He invented the infamous "LinkedIn Intro" email proxy. Previously he co-founded and sold two startups, Rapportive and Go Test It. He is based in Cambridge, UK. - Science & Technology
This is the most important talk of the century about all kinds of future computer logic
2021 still awesome! Thank you!
2017 and this is still incredibly interesting. Thank you.
Amazing speaker, explaining a difficult concept with simplicity. 2024, still interesting!
Martin Kleppmann The God of Distributed Systems! Thanks Strange Loop for sharing this
Still Relevant! Awesome content Martin :)
Brilliant talk. I disagree only with the comment "Kill REST APIs", but do agree with reducing the focus on request/response systems. Req/res are implementation details of HTTP; REST can work over websockets. REST is a concept for building distributed systems, and it is in no way limited to APIs or HTTP (req/res). That being said, most "REST" APIs are implemented incorrectly, since they lack hypermedia controls in their message structure.
Excellent presentation.
I really enjoy this kind of thinking. Thanks for the talk.
Amazing! Thanks!
really digging these hand written slides
good presentation. very interesting.
I like Martin Kleppmann, he's a very bright person and a good teacher. I disagree though with his statement "Kill REST". In this talk he proposes to use streams over REST but imo this is all use case dependent. Also the idea of a stream is not so new, publish/subscribe communication flow is pretty widely used already, just think about web sockets. Think about an application that doesn't need to be updated about any CRUD operations within the DB in real time (like 95% of applications). Would you still introduce a complex stream based backend over simple REST?
Does anyone know what software was used to draw these slides? Would be great for university
thanks folks
iPad Pro with pen would suffice :)
MQTT is a publish/subscribe server with open source JavaScript web socket libraries that's been around for a long time. I used it in the public safety sector for officers to subscribe to streams published by the dispatch center. I guess I'm fuzzy on how this differs from that other than fuzzing the lines between the MQTT server and the database it may ride on.
CQRS?
+maverick88NL Totally! However, in the upcoming period, I bet that many people will feel uncomfortable switching from CRUD to CQRS. The worst issue I encountered was the essential separation of the write model from the read model, especially how to fit everything together with a specific technology. Implementation of the CQRS pattern can be ridiculous sometimes. :)
I wonder what Martin Kleppmann thinks of relay and graphql
more like redux ...
So Event Sourcing then?
"Martin Kleppmann - Event Sourcing and Stream Processing at Scale" - czcams.com/video/avi-TZI9t2I/video.html
24:22 "Kappa" Architecture. :)
I wonder if this guy knows Datomic. I think it's exactly what he wants :)
+Matúš Lešťan He mentions this IN THE VIDEO...
+Matúš Lešťan He compares to Datomic at 43:05
+Matúš Lešťan Yes, he has mentioned Datomic in his book Designing Data-Intensive Applications, 2nd chapter.
If he wants it, chances are he probably built it.
I read the phrase "When a client reads from a materialized view, it can keep the network connection open." from Martin's book "Making sense..." and wondered where that was coming from. How does a materialized view offer such a feature?
Kafka Streams seems to have killed Samza
databases are so 1970's
Turing machines are so 1940's
Pants are so 1800's
Wheels are so 4000 BC