Netflix Data Engineering Tech Talks - The Netflix Data Engineering Stack
- Added on 28 July 2024
- Chris Stephens (Data Engineer, Content & Studio) and Pedro Duarte (Software Engineer, Consolidated Logging) walk engineers new to Netflix through the building blocks of the Netflix Data Engineering stack. Learn more about how batch and streaming data pipelines are built at Netflix.
#netflix
#datascience
#dataengineering
#etl
#bigdata
Great presentation! Thank you for sharing this!
4:56 - why use Iceberg instead of Delta Lake and Hudi?
8:26 - how do data engineers verify data quality? Isn't that the business office's or data science team's responsibility?
10:08 - DE isn't always told when data source context changes/is updated. 100% true.
12:33 - sometimes = always, in my experience. :)
13:42 - why use Python? Perhaps due to the schedules? Wouldn't Scala be faster?
17:55 - Janitor sounds like an incredibly helpful tool!
What is "High play starts" in the example for context-specific audits @ 11:30?
4:15 - what's go table standards?
Where can I download this slide?
Will they open source Maestro, like Airbnb did with Airflow?
It would be interesting to know why the workflow scheduler is named 'Maestro'. 🤔
Usually used for an eminent composer, conductor, or teacher of music. Given their business segment, it seems to fit.
@zknarc ah, got it! Perfect naming then. Thanks 🙏
Looks like an Airflow implementation
Perhaps for a service that orchestrates the workflow, just like a maestro orchestrates an orchestra, Maestro sounds like a perfect name
Hello Netflix, I would like to be hired by you guys. I am a slightly below average software engineer. Maybe you guys could let me sweep the floors or something? I can be a FAANG janitor! Would just ask for maybe like $11 an hour plus a salad from the cafeteria.
I look forward to hearing back from you guys.