Uber's Michelangelo: Strategic AI Overhaul and Impact // MLOps podcast

  • Added on Aug 1, 2024
  • Join us at our first in-person conference on June 25 all about AI Quality: www.aiqualityconference.com/
    Uber's Michelangelo: Strategic AI Overhaul and Impact // MLOps podcast #239 with Demetrios Brinkmann.
    Huge thank you to ‪@WeightsBiases‬ for sponsoring this episode. WandB Free Courses - wandb.me/courses_mlops
    // Abstract
    Uber's Michelangelo platform has evolved through three major phases, expanding from basic ML predictions to deep learning and generative AI. Michelangelo 1.0 faced several challenges, including a lack of deep learning support and inadequate project tiering. To address these, Michelangelo 2.0 and then 3.0 introduced improvements such as PyTorch support, enhanced model training, and integration of technologies like NVIDIA's Triton and Kubernetes. The platform now includes a GenAI gateway, compliance guardrails, and model performance monitoring to streamline and secure AI operations at Uber.
    // Bio
    At the moment Demetrios is immersing himself in Machine Learning by interviewing experts from around the world in the weekly MLOps.community meetups. Demetrios constantly learns and engages in new activities to get uncomfortable and learn from his mistakes. He tries to bring creativity into every aspect of his life, whether analyzing the best paths forward, overcoming obstacles, or building Lego houses with his daughter.
    // MLOps Jobs board
    mlops.pallet.xyz/jobs
    // MLOps Swag/Merch
    mlops-community.myshopify.com/
    // Related Links
    From Predictive to Generative - How Michelangelo Accelerates Uber’s AI Journey blog post: www.uber.com/en-JP/blog/from-...
    Uber's Michelangelo: www.uber.com/en-JP/blog/miche...
    The Future of Feature Stores and Platforms // Mike Del Balso & Josh Wills // MLOps Podcast #186: • The Future of Feature ...
    Machine Learning Education at Uber // Melissa Barr & Michael Mui // MLOps Podcast #156: • Machine Learning Educa...
    -------------- ✌️Connect With Us ✌️ ------------
    Join our slack community: go.mlops.community/slack
    Follow us on Twitter: @mlopscommunity
    Sign up for the next meetup: go.mlops.community/register
    Catch all episodes, blogs, newsletters, and more: mlops.community/
    Connect with Demetrios on LinkedIn: / dpbrinkm
    Timestamps:
    [00:00] Uber's Michelangelo platform evolution analyzed in podcast
    [03:51 - 04:50] Weights & Biases Ad
    [05:57] Uber creates Michelangelo to streamline machine learning
    [07:44] Michelangelo platform's tech and flexible system
    [11:49] Uber Michelangelo platform adapted for deep learning
    [16:48] Uber invests in ML training for employees
    [19:08] Explanation of blog content, ML quality metrics
    [22:38] Michelangelo 2.0 prioritizes serving latency and Kubernetes
    [26:30] GenAI gateway manages model routing and costs
    [31:35] ML platform evolution, legacy systems, and maintenance
    [33:22] Team debates maintaining outdated tool or moving on
    [34:41] Please like, share, leave feedback, and subscribe to our MLOps channels!
    [34:57] Wrap up
  • Science & Technology

Comments • 5

  • @CarlosMatiasdelaTorre 1 month ago

    Demetrios, my man, I loved the format. Now I'm off to read the post 😅

    • @MLOps 1 month ago

      letsss gooooooo! i am going to start trying to do more of these! thanks for the support!

  • @matthewrice7590 1 month ago +1

    Thanks for the overview! Michelangelo seems like quite the feat of engineering. Major kudos to the engineers who designed and built this out.
    Would be awesome to get more insight into how they calculated the trade-off between the costs of long-term development/maintenance/management of a custom system like this versus using something off the shelf and fully managed like SageMaker/Vertex AI/etc. Obviously you are going to be paying a premium for something like SageMaker, but I can't help but be skeptical that going with a custom approach like this would pay off in the long term, considering the significant engineering effort that must go into ongoing development and refinement, especially given the immense complexity of a system like this. Maybe I'm just not thinking big enough haha

    • @CarlosMatiasdelaTorre 1 month ago

      For somewhat big companies, building an internal dev platform makes a lot of sense to avoid vendor lock-in, ensure long-term support, abstract services, ensure compliance across geographies, improve security and cost management, etc.
      For smaller companies it may not be the same.

    • @matthewrice7590 1 month ago

      @CarlosMatiasdelaTorre good points!