Realtime Streaming with Data Lakehouse - End to End Data Engineering Project

  • Published 19 Jun 2024
  • In this video you will learn to design, implement, and maintain secure, scalable, and cost-effective lakehouse architectures leveraging Apache Spark, Apache Kafka, Apache Flink, Delta Lake, AWS, and open-source tools, unlocking data's full potential through advanced analytics and machine learning.
    Part 1: • Building Data Lakehous...
    FULL DATA ENGINEERING COURSE AVAILABLE:
    sh.datamasterylab.com/costsaver
    Like this video?
    Support us: / @codewithyu
    Timestamps:
    0:00 Setting up Kafka Broker in KRaft Mode
    21:30 Setting up MinIO
    35:30 Producing data into Kafka (see the producer sketch after this description)
    39:10 Acquiring Secret and Access Key for S3
    59:00 Creating S3 Bucket Event Listener for Lakehouse
    1:05:53 Data Preview and Results
    1:07:42 Outro
    Resources:
    Source Code:
    buymeacoffee.com/yusuf.ganiyu...
    🌟 Please LIKE ❤️ and SUBSCRIBE for more AMAZING content! 🌟
    👦🏻 My LinkedIn: / yusuf-ganiyu-b90140107
    🚀 X(Twitter): x.com/YusufOGaniyu
    📝 Medium: / yusuf.ganiyu
    Hashtags:
    #dataengineering #bigdata #dataanalytics #realtimeanalytics #streaming #datalakehouse #datalake #datawarehouse #dataintegration #datatransformation #datagovernance #datasecurity #apachespark #apachekafka #apacheflink #deltalake #aws #opensource #dataingestion #structureddata #unstructureddata #semistructureddata #dataanalysis #advancedanalytics #dataarchitecture #costoptimization #cloudcomputing #awscloud
  • Science & Technology
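
The timestamps above outline the pipeline end to end: a Kafka broker in KRaft mode, MinIO as the S3-compatible object store, a producer writing events into Kafka, S3 credentials, and a bucket event listener. As a rough illustration of the "Producing data into Kafka" step only, here is a minimal Python producer sketch; the broker address (localhost:9092), topic name (transactions), and record fields are illustrative assumptions, not taken from the video.

```python
# Minimal producer sketch (assumptions: broker at localhost:9092, topic "transactions";
# neither is confirmed by the video -- adjust to your own setup).
import json
import random
import time

from confluent_kafka import Producer  # pip install confluent-kafka

producer = Producer({"bootstrap.servers": "localhost:9092"})

def delivery_report(err, msg):
    # Invoked once per message to confirm delivery or surface an error.
    if err is not None:
        print(f"Delivery failed: {err}")
    else:
        print(f"Delivered to {msg.topic()}[{msg.partition()}] @ offset {msg.offset()}")

for i in range(10):
    record = {
        "transaction_id": i,
        "amount": round(random.uniform(5, 500), 2),
        "ts": int(time.time() * 1000),
    }
    producer.produce("transactions", key=str(i), value=json.dumps(record),
                     callback=delivery_report)
    producer.poll(0)  # serve delivery callbacks without blocking

producer.flush()  # block until all buffered messages are delivered
```

From here, the video's later steps (acquiring S3 keys and the bucket event listener) are about landing these records in MinIO and reacting to new objects; the exact sink used in the video is not reproduced here.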

Comments • 16

  • @makarimmuhammad2843
    @makarimmuhammad2843 27 days ago +1

    Incredible! Thanks for sharing.

  • @rafaelg8238
    @rafaelg8238 a month ago

    Your projects are incredible, congrats.

  • @pankajchandravanshi8712

    Thank you, man! When are you planning to release the remaining part, the data consumption with Spark and Flink? Or if you have already done it, can you share the link here?

  • @Abbou-yo6gi
    @Abbou-yo6gi 27 days ago

    Thanks for all these wonderful projects, you're the best. But please, can you make a video for a project where the data source is log files or CSV or something on our local machine? That would mean a lot to me, especially with Airflow, Kafka, and Spark.

  • @ultrainstinct6715
    @ultrainstinct6715 a month ago

    Hi Yusuf,
    Please, I have a question concerning the architecture of this project.
    Can I load the data onto a Cassandra database directly from Kafka, without relying on Spark or Flink?
    Thanks in advance.

    • @CodeWithYu
      @CodeWithYu a month ago

      Hi there!
      To have your data in Cassandra, you'll need some sort of connector that dumps data into Cassandra, as Cassandra can't pull data from Kafka by itself (see the connector sketch after this thread).
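
A hedged illustration of the reply above: one common way to "dump" Kafka topics into Cassandra without Spark or Flink is a Kafka Connect sink connector such as the DataStax Apache Kafka Connector. The sketch below registers such a connector through the Kafka Connect REST API; the Connect URL, topic, keyspace/table names, and mapping are illustrative assumptions (not from the video), and config key names can vary by connector version, so check the connector's documentation.

```python
# Sketch only: register a Cassandra sink connector via the Kafka Connect REST API.
# Assumptions (not from the video): Connect at localhost:8083, topic "transactions",
# target table lakehouse.transactions, DataStax Apache Kafka Connector installed.
import json

import requests

connector = {
    "name": "cassandra-sink",
    "config": {
        "connector.class": "com.datastax.oss.kafka.sink.CassandraSinkConnector",
        "tasks.max": "1",
        "topics": "transactions",
        "contactPoints": "localhost",            # Cassandra contact point(s)
        "loadBalancing.localDc": "datacenter1",  # local datacenter name
        # Map Kafka record fields to columns of keyspace.table lakehouse.transactions.
        "topic.transactions.lakehouse.transactions.mapping":
            "transaction_id=value.transaction_id, amount=value.amount, ts=value.ts",
    },
}

resp = requests.post(
    "http://localhost:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()  # raise if Connect rejects the connector config
print(resp.json())
```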

  • @Fullon2
    @Fullon2 a month ago

    Hey man, thanks for sharing your knowledge. Do you have the code used in the video to share?

    • @CodeWithYu
      @CodeWithYu a month ago +1

      Yes, sure!

    • @Fullon2
      @Fullon2 a month ago

      @CodeWithYu Can you share it with us? On GitHub, maybe.

  • @manoharlakshmana6171
    @manoharlakshmana6171 29 days ago

    Hello sir, can you please make an end-to-end project with Apache Hudi?

  • @Faire-rs7ph
    @Faire-rs7ph 22 days ago +1

    The explanations in between your videos are great for understanding what's going on. Could you please make a roadmap to become a data engineer? Thank you in advance.