Building Data Lakehouse from Scratch - End to End Data Engineering Project

  • Added 22 May 2024
  • In this video you will learn to design, implement, and maintain secure, scalable, and cost-effective lakehouse architectures using Apache Spark, Apache Kafka, Apache Flink, Delta Lake, AWS, and open-source tools. Unlock data's full potential through advanced analytics and machine learning.
    Part 1: • Saving 95% Cloud Cost ...
    FULL DATA ENGINEERING COURSE AVAILABLE:
    sh.datamasterylab.com/costsaver
    Like this video?
    Support us: / @codewithyu
    Timestamps:
    0:00 Setting up Kafka Broker in KRaft Mode
    21:30 Setting up Minio
    35:30 Producing data into Kafka
    39:10 Acquiring Secret and Access Key for S3
    59:00 Creating S3 Bucket Event Listener for Lakehouse
    1:05:53 Data Preview and Results
    1:07:42 Outro
    Resources:
    CZcams Source Code:
    buymeacoffee.com/yusuf.ganiyu...
    🌟 Please LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
    👦🏻 My Linkedin: / yusuf-ganiyu-b90140107
    🚀 X(Twitter): x.com/YusufOGaniyu
    📝 Medium: / yusuf.ganiyu
    Hashtags:
    #dataengineering #bigdata #dataanalytics #realtimeanalytics #streaming #datalakehouse #datalake #datawarehouse #dataintegration #datatransformation #datagovernance #datasecurity #apachespark #apachekafka #apacheflink #deltalake #aws #opensource #dataingestion #structureddata #unstructureddata #semi-structureddata #dataanalysis #advancedanalytics #dataarchitecture #costoptimization #cloudcomputing #awscloud
  • Science & Technology

Comments • 10

  • @makarimmuhammad2843 (1 hour ago)

    incredible! thanks for sharing

  • @rafaelg8238 (9 days ago)

    Your projects are incredible. Congrats!

  • @ultrainstinct6715 (10 days ago)

    Hi Yusuf,
    I have a question about the architecture of this project.
    Can I load the data onto a Cassandra database directly from Kafka, without relying on Spark or Flink?
    Thanks in advance.

    • @CodeWithYu (9 days ago)

      Hi there!
      To get your data into Cassandra, you'll need some sort of connector that writes the data into Cassandra, because Cassandra can't pull data from Kafka on its own.
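
A minimal sketch of what such a connector does, assuming JSON-encoded message values. It is not code from the video: `run_sink`, `messages`, and `write_row` are hypothetical names, the iterable stands in for a Kafka consumer, and the callable stands in for a Cassandra insert. In practice a Kafka Connect sink connector or a small consumer service would usually play this role.

```python
import json
from typing import Any, Callable, Dict, Iterable


def run_sink(
    messages: Iterable[bytes],
    write_row: Callable[[Dict[str, Any]], None],
) -> int:
    """Push each message value into the sink and return the number of rows written.

    `messages` is a stand-in for values polled from a Kafka topic.
    `write_row` is a stand-in for an INSERT executed against Cassandra.
    Both are assumptions made for illustration only.
    """
    written = 0
    for raw in messages:
        try:
            row = json.loads(raw)
        except ValueError:
            # Malformed JSON or invalid UTF-8: skip the record instead of crashing.
            continue
        if not isinstance(row, dict):
            # Only JSON objects map cleanly onto table columns.
            continue
        write_row(row)
        written += 1
    return written


if __name__ == "__main__":
    # A list of bytes plays the topic; a plain list plays the Cassandra table.
    fake_topic = [b'{"id": 1, "name": "a"}', b"not json", b'{"id": 2, "name": "b"}']
    fake_table = []
    count = run_sink(fake_topic, fake_table.append)
    print(count, fake_table)
```

The point of the sketch is the direction of data flow: the connector pulls from Kafka and pushes into Cassandra, so the database never has to know about the broker.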

  • @manoharlakshmana6171 (2 days ago)

    Hello sir, can you please make an end-to-end project with Apache Hudi?

  • @Fullon2 (10 days ago)

    Hey man, thanks for sharing your knowledge. Could you share the code used in the video?

    • @CodeWithYu (3 days ago)

      Yes, sure!

    • @Fullon2 (3 days ago)

      @CodeWithYu Can you share it with us? On GitHub, maybe.