Spark and Machine Learning on Kubernetes in AWS - Hands on webinar

  • Added 12 September 2024
  • Link to channel for other videos and to subscribe - / aiengineeringlife
    You can find code used in this webinar here - github.com/pra...
    In this webinar we will learn how to run Spark on Kubernetes for machine learning workloads. The webinar covers:
    Why Spark on Kubernetes?
    AWS Kubernetes Services Overview
    Hands-on Demo
    Facebook Prophet Model on Spark
    Dependency Management
    Creating EKS Cluster
    Building Containers
    Running Spark on Kubernetes

Comments • 12

  • @AIEngineeringLife  3 years ago +3

    Code used in this video can be found here - github.com/pradeep-misra/spark-k8s

  • @vaibhavtarange2515  3 years ago +2

    Nice demo.
    I have some questions:
    1) While the job is running, can we see the Spark UI?
    2) Can we submit the job using Airflow and get its status?
    3) If we delete the EKS cluster, are the logs kept or erased? If they are erased, is there any way to recover them? For example, if a job fails in a production pipeline and I want to find the reason for the failure and debug it, how can I?
    4) How do we check the memory utilisation of the Spark job?

    • @AIEngineeringLife  3 years ago +1

      Vaibhav, you can use kubectl port forwarding to reach the Spark UI. I think you can submit from Airflow via SparkSubmitOperator, though I have not tried it.
      Best practice is to store app-specific logs outside the cluster so you can debug after the cluster is gone. You can add a volume mount on k8s specifically for logs.
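A minimal sketch of the two ideas in the reply above, written as Python helpers that only build command lines (nothing is executed). It is not taken from the webinar code. The function names, the API server URL, image name, bucket, and pod name are all hypothetical placeholders. Note also that it swaps in a related technique: instead of the volume mount mentioned in the reply, it points Spark's event log at storage outside the cluster, which assumes the container image can write to that location (for example, an S3 connector is present).

```python
import shlex


def spark_submit_args(api_server, image, app_path, log_dir, namespace="default"):
    """Build a spark-submit argument list for Kubernetes cluster mode.

    All argument values are supplied by the caller; the ones used in the
    example at the bottom are placeholders, not real endpoints.
    """
    conf = {
        "spark.kubernetes.container.image": image,
        "spark.kubernetes.namespace": namespace,
        # Event logs written outside the cluster survive cluster deletion.
        "spark.eventLog.enabled": "true",
        "spark.eventLog.dir": log_dir,
    }
    args = [
        "spark-submit",
        "--master", f"k8s://{api_server}",
        "--deploy-mode", "cluster",
    ]
    for key, value in conf.items():
        args += ["--conf", f"{key}={value}"]
    args.append(app_path)
    return args


def spark_ui_port_forward(driver_pod, namespace="default", port=4040):
    """Build the kubectl command that forwards the driver's UI port locally."""
    return ["kubectl", "port-forward", driver_pod, f"{port}:{port}", "-n", namespace]


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    submit = spark_submit_args(
        api_server="https://my-eks-endpoint.example:443",
        image="my-registry/spark-py:latest",
        app_path="local:///opt/spark/work-dir/job.py",
        log_dir="s3a://my-log-bucket/spark-events",
    )
    print(shlex.join(submit))
    print(shlex.join(spark_ui_port_forward("my-job-driver")))
```

While the port-forward command is running, the Spark UI of a live driver should be reachable at localhost:4040, which is Spark's default UI port.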

  • @sachintripathi9219  3 years ago +2

    Thanks for this session, much needed.
    Just wondering if it would be possible to have a second session that goes in depth, suggests alternatives for each step depending on the scale, and explains a little more of what is happening behind the scenes :)

    • @AIEngineeringLife  3 years ago

      Sachin, that would be a session of two hours or more. AWS has also launched EMR with Spark on Kubernetes. I will see whether that makes it more seamless.

  • @ozycozy9706  1 year ago

    Excellent!

  • @akshayanand6803  3 years ago +3

    Awesome demo👌

    • @akshayanand6803  3 years ago

      Would you please create a video on deploying a machine learning model on a Spark cluster using PMML files?

    • @AIEngineeringLife  3 years ago

      @akshayanand6803 Can you elaborate? You want to deploy a PMML file on Spark? What kind of model is it: a Spark ML model stored as PMML, or a regular Python-based model?

    • @akshayanand6803  3 years ago

      @AIEngineeringLife I know it's late to respond, my humble apologies. This was for regular Python models saved as PMML files and then run on a test dataset in the Spark enterprise data lake environment. But I truly love the way you share knowledge 🙏🏻 My gratitude, I am learning through your videos.

  • @ahmedshehata9522  3 years ago

    Great explanation for a start, but I thought about using the Spark Operator for further automation, which is what I am doing now. I hope you can give some insights. The reason is to use Prometheus to monitor jobs, plus other benefits for further automation. Any thoughts? Thanks again!