Realtime Streaming with Data Lakehouse - End to End Data Engineering Project
- Published 19 Jun 2024
- In this video you will learn to design, implement, and maintain secure, scalable, and cost-effective lakehouse architectures leveraging Apache Spark, Apache Kafka, Apache Flink, Delta Lake, AWS, and open-source tools. Unlock your data's full potential through advanced analytics and machine learning.
Part 1: • Building Data Lakehous...
FULL DATA ENGINEERING COURSE AVAILABLE:
sh.datamasterylab.com/costsaver
Like this video?
Support us: / @codewithyu
Timestamps:
0:00 Setting up Kafka Broker in KRaft Mode
21:30 Setting up Minio
35:30 Producing data into Kafka
39:10 Acquiring Secret and Access Key for S3
59:00 Creating S3 Bucket Event Listener for Lakehouse
1:05:53 Data Preview and Results
1:07:42 Outro
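The producing step (35:30) can be sketched as a minimal Python producer. The event schema, topic name, and broker address below are illustrative assumptions, not taken from the video; the `kafka-python` package and a broker running in KRaft mode on `localhost:9092` are assumed.

```python
import json
import time

# Hypothetical event schema; field names are assumptions for illustration.
def make_event(event_id: int, symbol: str, price: float) -> bytes:
    """Serialize one event as UTF-8 JSON, ready to send to Kafka."""
    record = {
        "event_id": event_id,
        "symbol": symbol,
        "price": price,
        "ts": int(time.time() * 1000),  # event time in epoch milliseconds
    }
    return json.dumps(record).encode("utf-8")

if __name__ == "__main__":
    # Requires `pip install kafka-python` and a broker on localhost:9092.
    from kafka import KafkaProducer

    producer = KafkaProducer(bootstrap_servers="localhost:9092")
    # Topic name "lakehouse-events" is an assumption.
    producer.send("lakehouse-events", make_event(1, "AAPL", 189.5))
    producer.flush()
```

Keeping serialization in a small pure function makes the payload format easy to test without a running broker.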
Resources:
YouTube Source Code:
buymeacoffee.com/yusuf.ganiyu...
🌟 Please LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
👦🏻 My Linkedin: / yusuf-ganiyu-b90140107
🚀 X(Twitter): x.com/YusufOGaniyu
📝 Medium: / yusuf.ganiyu
Hashtags:
#dataengineering #bigdata #dataanalytics #realtimeanalytics #streaming #datalakehouse #datalake #datawarehouse #dataintegration #datatransformation #datagovernance #datasecurity #apachespark #apachekafka #apacheflink #deltalake #aws #opensource #dataingestion #structureddata #unstructureddata #semi-structureddata #dataanalysis #advancedanalytics #dataarchitecture #costoptimization #cloudcomputing #awscloud - Science & Technology
incredible! thanks for sharing
You’re welcome!
Your projects are incredible.. congrats.
Thanks!
Thank you, man! When are you planning to release the remaining portion, the data consumption with Spark and Flink? Or if you have already done it, can you share the link here?
Thanks for all these wonderful projects, you're the best! But could you please make a video for a project where the data sources are log files or CSVs on our local machine? That would mean a lot to me, especially with Airflow, Kafka, and Spark.
Hi Yusuf,
Please, I have a question concerning the architecture of this project.
Can I load the data into a Cassandra database directly from Kafka, without relying on Spark or Flink?
Thanks in advance.
Hi there!
To get your data into Cassandra, you'll need some sort of connector that dumps data into Cassandra, as Cassandra can't pull data from Kafka on its own.
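In production this connector role is usually filled by a Kafka Connect sink, but a hand-rolled consumer that dumps records into Cassandra might look like the sketch below. The keyspace, table, topic, and field names are assumptions for illustration; the `kafka-python` and `cassandra-driver` packages plus running Kafka and Cassandra instances are assumed.

```python
import json

# Maps a raw Kafka message value to a row dict for a hypothetical
# Cassandra table events(event_id, symbol, price).
def to_row(value: bytes) -> dict:
    evt = json.loads(value.decode("utf-8"))
    return {
        "event_id": evt["event_id"],
        "symbol": evt["symbol"],
        "price": evt["price"],
    }

if __name__ == "__main__":
    # Requires `pip install kafka-python cassandra-driver`.
    from kafka import KafkaConsumer
    from cassandra.cluster import Cluster

    # Keyspace "lakehouse" and topic "lakehouse-events" are assumptions.
    session = Cluster(["localhost"]).connect("lakehouse")
    insert = session.prepare(
        "INSERT INTO events (event_id, symbol, price) VALUES (?, ?, ?)"
    )
    for msg in KafkaConsumer("lakehouse-events",
                             bootstrap_servers="localhost:9092"):
        row = to_row(msg.value)
        session.execute(insert, (row["event_id"], row["symbol"], row["price"]))
```

The pure `to_row` mapping keeps the transform testable; the loop itself is just consume-map-insert.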
Hey man, thanks for sharing your knowledge. Do you have the codes used in the video to share?
Yes, sure!
@@CodeWithYu Can you share with us? On Github maybe.
Hello sir, could you please make an end-to-end project with Apache Hudi?
The explanations in between your videos are great for understanding what's going on. Could you please make a roadmap to become a data engineer? Thank you in advance.
Surely… I’ll add this to the pipeline
@@CodeWithYu tysm man