MLOps.community
Visualize - Bringing Structure to Unstructured Data // Markus Stoll // MLOps Podcast #258
Visualize - Bringing Structure to Unstructured Data // MLOps Podcast #258 with Markus Stoll, CTO of Renumics.
A huge thank you to @SAS for their generous support!
// Abstract
This talk is about how data visualization and embeddings can support you in understanding your machine-learning data. We explore methods to structure and visualize unstructured data like text, images, and audio for applications ranging from classification and detection to Retrieval-Augmented Generation. By using tools and techniques like UMAP to reduce data dimensions and visualization tools like Renumics Spotlight, we aim to make data analysis for ML easier. Whether you're dealing with interpretable features, metadata, or embeddings, we'll show you how to use them all together to uncover hidden patterns in multimodal data, evaluate the model performance for data subgroups, and find failure modes of your ML models.
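One idea from the abstract, evaluating model performance per data subgroup, can be sketched in a few lines of plain Python. This is only an illustrative sketch and not code from the talk or from Renumics Spotlight: the record fields (`label`, `prediction`, `lighting`) and the helper `accuracy_by_subgroup` are hypothetical names chosen for the example.

```python
from collections import defaultdict


def accuracy_by_subgroup(records, group_key):
    """Return {subgroup: accuracy} for a list of prediction records.

    Each record is assumed (hypothetically) to be a dict holding a
    ground-truth "label", a model "prediction", and a metadata field
    named by `group_key`.
    """
    totals = defaultdict(int)
    correct = defaultdict(int)
    for rec in records:
        group = rec[group_key]
        totals[group] += 1
        if rec["prediction"] == rec["label"]:
            correct[group] += 1
    return {group: correct[group] / totals[group] for group in totals}


# Made-up example data: image predictions tagged with a lighting condition.
records = [
    {"label": "cat", "prediction": "cat", "lighting": "day"},
    {"label": "dog", "prediction": "dog", "lighting": "day"},
    {"label": "cat", "prediction": "dog", "lighting": "night"},
    {"label": "dog", "prediction": "dog", "lighting": "night"},
]

scores = accuracy_by_subgroup(records, "lighting")
# The subgroup with the lowest accuracy is a candidate failure mode.
worst = min(scores, key=scores.get)
print(scores, worst)
```

In practice the grouping key could just as well be a cluster found in a reduced embedding space instead of a metadata column; the per-group comparison stays the same.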
// Bio
Markus Stoll began his career in the industry at Siemens Healthineers, developing software for the Heavy Ion Therapy Center in Heidelberg. He learned about software quality while developing a treatment machine weighing over 600 tons. He earned a Ph.D., focusing on combining biomechanical models with statistical models, through which he learned how challenging it is to bridge the gap between research and practical application in the healthcare domain. Since co-founding Renumics, he has been active in the field of AI for Engineering, e.g., AI for Computer Aided Engineering (CAE), implementing projects, contributing to their open-source library for data exploration for ML datasets (Renumics Spotlight) and writing articles about data visualization.
// MLOps Jobs board
mlops.pallet.xyz/jobs
// MLOps Swag/Merch
mlops-community.myshopify.com/
// Related Links
Website: renumics.com/
MLSecOps Community: community.mlsecops.com/
Blogs: towardsdatascience.com/visualize-your-rag-data-evaluate-your-retrieval-augmented-generation-system-with-ragas-fc2486308557
medium.com/itnext/how-to-explore-and-visualize-ml-data-for-object-detection-in-images-88e074f46361
--------------- ✌️Connect With Us ✌️ -------------
Join our slack community: go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: mlops.community/
Connect with Demetrios on LinkedIn: www.linkedin.com/in/dpbrinkm/
Connect with Markus on LinkedIn: www.linkedin.com/in/markus-stoll-b39a42138/
Timestamps:
[00:00] Markus' preferred coffee
[00:15] Takeaways
[01:41] Please like, share, leave a review, and subscribe to our MLOps channels!
[01:50] Register for the Data Engineering for AI/ML Conference now!
[02:27] Current focus and updates
[04:43] 3D Embeddings Visualization Explained
[07:07] Question Embeddings vs Retrieval
[08:24] Using heat maps effectively
[10:30] User insights visualization RAG
[16:59] 3D Crash Simulation Analysis
[20:33] Simulation purpose clarification
[22:34] Evaluating test data use cases
[24:22] Real-world car testing
Views: 144

Video

AI Testing Highlights // Special MLOps Podcast Episode
117 views, 12 hours ago
MLOps for GenAI Applications // Special MLOps Podcast episode with Demetrios Brinkmann, Chief Happiness Engineer at MLOps Community. // Abstract Demetrios explores common themes in ML model testing with insights from Erica Greene (Yahoo News), Matar Haller (ActiveFence), Mohamed Elgendy (Kolena), and Catherine Nelson (Freelance Data Scientist). They discuss tiered test cases, functional testing...
MLSecOps is Fundamental to Robust AISPM // Sean Morgan // MLOps Podcast #257
173 views, 14 hours ago
MLSecOps is Fundamental to Robust AI Security Posture Management (AISPM) // MLOps Podcast #257 with Sean Morgan, Chief Architect at Protect AI. // Abstract MLSecOps, which is the practice of integrating security practices into the AIML lifecycle (think infusing MLOps with DevSecOps practices), is a critical part of any team’s AI Security Posture Management. In this talk, we’ll discuss how to th...
Continuous Model Training // Nicolas Mauti // MLOps Podcast #255 clip
79 views, 16 hours ago
BigQuery Feature Store // MLOps Podcast #255 with Nicolas Mauti, Lead MLOps at Malt. Nicolas kicks things off by discussing his unique morning habits and his preference for orange juice and mint syrup over coffee. We then dive into the technical core of the episode, where Nicolas explains the processes involved in continuously retraining models at Malt. He emphasizes the role of conditional che...
Ensuring Robust AI Architectures // Harcharan Kabbay // MLOps Podcast #256 clip
68 views, 19 hours ago
MLOps for GenAI Applications // MLOps Podcast #256 with Harcharan Kabbay, Lead Machine Learning Engineer at World Wide Technology. Harcharan shared invaluable insights on ensuring robust and reliable MLOps architecture. He highlighted common security oversights in deploying machine learning models and emphasized the importance of thorough evaluation processes through development stages to mitig...
MLOps for GenAI Applications // Harcharan Kabbay // MLOps Podcast #256
416 views, 21 hours ago
MLOps for GenAI Applications // MLOps Podcast #256 with Harcharan Kabbay, Lead Machine Learning Engineer at World Wide Technology. // Abstract The discussion begins with a brief overview of the Retrieval-Augmented Generation (RAG) framework, highlighting its significance in enhancing AI capabilities by combining retrieval mechanisms with generative models. The podcast further explores the integ...
Exploring Long Context Language Models // MLOps Reading Group August 2024
205 views, 1 day ago
Paper: Can Long-context Language Models Subsume Retrieval, RAG, SQL, and More? arxiv.org/abs/2406.13121 // Abstract We’re excited to share that we relaunched our #reading-group, diving deep into technical papers every second Thursday of the month, covering a range of MLOps topics. In our latest session, we explored the question: Can Long-Context Language Models (LCLMs) Replace Retrieval, RAG, a...
BigQuery Feature Store // Nicolas Mauti // MLOps Podcast #255
281 views, 1 day ago
BigQuery Feature Store // MLOps Podcast #255 with Nicolas Mauti, Lead MLOps at Malt. // Abstract Need a feature store for your AI/ML applications but overwhelmed by the multitude of options? Think again. In this talk, Nicolas shares how they solved this issue at Malt by leveraging the tools they already had in place. From ingestion to training, Nicolas provides insights on how to transform BigQ...
ROI and Tool Transition in MLOps // Andy McMahon // MLOps Podcast #254 clip
66 views, 14 days ago
Design and Development Principles for LLMOps // MLOps Podcast #254 with Andy McMahon, Director - Principal AI Engineer at Barclays Bank. A huge thank you to @SAS for their generous support! Andy provides a detailed look at the nuances of evaluating the return on investment (ROI) when considering the shift from legacy tools to more advanced solutions. Andy emphasizes the overall lifecycle costs,...
Exploring Markovian Models // Yuri Plotkin // MLOps Podcast #253 clip
67 views, 14 days ago
The Variational Book // MLOps Podcast #253 with Yuri Plotkin, an ML Scientist. A huge thank you to @SAS for their generous support! // Abstract Curiosity has been the underlying thread in Yuri's life and interests. With the explosion of Generative AI, Yuri was fascinated by the topic and decided he needed to learn more. Yuri pursued learning by reading, deriving, and understanding seminal paper...
Design and Development Principles for LLMOps // Andy McMahon // MLOps Podcast #254
537 views, 14 days ago
Design and Development Principles for LLMOps // MLOps Podcast #254 with Andy McMahon, Director - Principal AI Engineer at Barclays Bank. A huge thank you to @SAS for their generous support! // Abstract As we move from MLOps to LLMOps we need to double down on some fundamental software engineering practices, as well as augment and add to these with some new techniques. In this case, let's talk a...
Data Quality = Quality AI // Panel // AIQCON
198 views, 14 days ago
// Abstract Data is the foundation of AI. To ensure AI performs as expected, high-quality data is essential. In this panel discussion, Chad, Maria, Joe, and Pushkar hosted by Sam Partee will explore strategies for obtaining and maintaining high-quality data, as well as common pitfalls to avoid when using data for AI models. // Panelists - Samuel Partee: Principal Applied AI Engineer @ Redis - C...
The Secrets of Jailbreaking LLMs // Ron Heichman // MLOps Podcast #252 clip
124 views, 21 days ago
Harnessing AI APIs for Safer, Accurate, & Reliable Applications // MLOps Podcast #252 with Ron Heichman, Machine Learning Engineer at SentinelOne. Ron delves into the curious concept of “jailbreaking” LLMs to perform tasks they're typically programmed to avoid, drawing an interesting parallel to the text-based RPGs of the '70s and '80s. Whether it's exploring simple hacks or understanding the d...
Challenges Make MLOps Worthwhile // Nikhil Suresh // MLOps Podcast clip #250
86 views, 21 days ago
AI Operations Without Fundamental Engineering Discipline // MLOps Podcast #250 with Nikhil Suresh, Director @ Hermit Tech. Nikhil shares insights into the company's unique structure and culture, his personal preferences for Ceylon tea over coffee, and the universal truth that no project is ever easy, whether it's in IT, improv theater, or anything in between. We're also discussing the importance...
The Variational Book // Yuri Plotkin // MLOps Podcast #253
185 views, 21 days ago
The Variational Book // MLOps Podcast #253 with Yuri Plotkin, an ML Scientist. A huge thank you to @SAS for their generous support! // Abstract Curiosity has been the underlying thread in Yuri's life and interests. With the explosion of Generative AI, Yuri was fascinated by the topic and decided he needed to learn more. Yuri pursued learning by reading, deriving, and understanding seminal paper...
Vision and Strategies for Attracting & Driving AI Talents in High Growth // Panel // AIQCON
126 views, 21 days ago
Building Creative and Productive Teams // Eric Landry MLOps Podcast #249 clip
66 views, 28 days ago
From Auction Papers to Stealth Assessment // Aniket Singh // MLOps Podcast #248 clip
74 views, 28 days ago
Harnessing AI APIs for Safer, Accurate, & Reliable Applications // Ron Heichman //MLOps Podcast #252
392 views, 28 days ago
Balancing Speed and Safety // Panel // AIQCON
116 views, 1 month ago
Agentic Workflow // Chinar Movsisyan // MLOps Podcast #251 clip
143 views, 1 month ago
Reliable LLM Products, Fueled by Feedback // Chinar Movsisyan // MLOps Podcast #251
282 views, 1 month ago
A Blueprint for Scalable & Reliable Enterprise AI/ML Systems // Panel // AIQCON
209 views, 1 month ago
Mastering Software Engineering for Data Scientists // Catherine Nelson // MLOps podcast #245 clip
147 views, 1 month ago
AI Training and Validation // Shaun Wei // MLOps Podcast #244 clip
92 views, 1 month ago
AI Operations Without Fundamental Engineering Discipline // Nikhil Suresh // MLOps Podcast #250
482 views, 1 month ago
AI in Healthcare // Eric Landry // MLOps Podcast #249
291 views, 1 month ago
Generative AI Governance Emerging // Sophia Rowland // MLOps Podcast #247 clip
73 views, 1 month ago
Evaluating the Effectiveness of Large Language Models // Aniket Singh // MLOps Podcast #248
383 views, 1 month ago
Extending AI: From Industry to Innovation // Sophia Rowland & David Weik // MLOps Podcast #247
195 views, 1 month ago

Comments

  • @ThomasRoderick-v5v, 3 days ago

    4:00 - 4:36 is really describing output/UAT type testing -- something that is crucial!

    • @MLOps, 3 days ago

      Yes. Is this a common pattern you have seen before, just going by a different name?

  • @walterppk1989, 5 days ago

    It'd be interesting to dive into the specifics of how one might test if a model's accuracy goes up on a test dataset or if its error rates go down. The high level stuff matters, but the implementation is not arbitrary either

    • @MLOps, 4 days ago

      great point! we will have to make that video next!

  • @CYCheung-yz1jv, 6 days ago

    Very useful!

  • @Dhallmanish, 8 days ago

    Good to see you, brother. Always on to new things and inspiring us. Way to go.

  • @user-xc6gl4zj8g, 10 days ago

    How about using Vertex AI Feature Store? It's based on BigQuery.

  • @walterppk1989, 12 days ago

    16:30 I agree with Nicolas. There is a difference between having historical features and versioned features

  • @rezamahmoudi163, 14 days ago

    Could you please share the PowerPoint slides?

  • @666WolfWere, 22 days ago

  • @MrDiscofun, 22 days ago

    Can we get the slides somewhere?

  • @JossOrtan, 1 month ago

    Great video! The discussion on data privacy regulations and machine learning was insightful. Do you think new privacy laws will hinder the progress of machine learning?

  • @mintakan003, 1 month ago

    Truth spoken...

  • @thannon72, 1 month ago

    It was so obvious you were reading from a script. Why didn't you just get AI to do it? Terrible

  • @Donald-mo2oe, 1 month ago

    That was an extremely good episode. As somebody who's trying to pivot into this career space as a 45-year-old current student, I have come to realize that I do not know enough about what fields exist, much less which are viable career prospects compared to the hype. Add to the equation that my degree plan is not going to prepare me or make me hireable in itself, and it leaves me feeling like a man in a small boat riding huge swells in the dark. Thank you for the content. I do appreciate the effort and the grind. It is time I catch up on your catalog.

  • @trndsettrlabs, 1 month ago

    Good prediction, considering this is 100% possible now at least from the digital aspect with Vercel AI Generative UI combined with langchain.

  • @jonassteinberg3779, 1 month ago

    Not very relevant to the topic. Yes there were considerations, but mainly how to solve them with Mystic.

  • @rajeshwarig2738, 1 month ago

    such a wonderful and insightful session

  • @jonassteinberg3779, 1 month ago

    Pretty gold tbh. Love how he gets into some very specific ops recommendations.

  • @jonassteinberg3779, 1 month ago

    I appreciate how the speaker was grinning through the entire talk lol. Jokes aside this is a bleeding edge topic and one really not covered anywhere else.

  • @davidoh0905, 1 month ago

    how do I poo poo it lol what does it mean

  • @mohammedmujtabaahmed490, 1 month ago

    Nice video

  • @jonassteinberg3779, 1 month ago

    One of about two talks on the webs that actually addresses these topics. Wonderful. Wish it was an hour.

  • @Donald-mo2oe, 1 month ago

    My god she drinks what with what? That woman's hard as nails.

  • @MarkyGoldstein, 1 month ago

    Please study acoustics

  • @NiranjanAnandam, 2 months ago

    I don't see it's something new that is being built in Uber for sure

  • @walterppk1989, 2 months ago

    Really interesting topic, not the best episode. Unlike most of the time, I don't feel like I learned a lot.

    • @MLOps, 2 months ago

      thanks for the feedback. what more would you have liked to see?

  • @md.arifhossain463, 2 months ago

    I was analyzing your YouTube channel. Your video content is very nice. Your video does not have good rank tag and the video does not have good SEO. For this reason, your video is not being ranked and viewed. Below we found your channel issue: * Your video doesn't have a good Top Ranking Hashtags, Keywords with the no right Meta tag * Your video doesn't have good SEO & Optimize.

  • @felipe.veloso, 2 months ago

    :) this is great!

  • @TheAliss77777, 2 months ago

    Start with why! Timeless advice, indeed

  • @SnipeSniperNEW, 2 months ago

    just bought the book, really great summarization of LLMs challenges, thank you for doing this

  • @shailshah4194, 2 months ago

    I saw Linus Lee's talk at the AI Engineer Summit about embeddings, and the demo was very cool. I just came across this video as the AI Engineer World's Fair is going on, and this was awesome too.

  • @gibbstyler5905, 2 months ago

    Fascinating stuff

  • @laehmo, 2 months ago

    This is really interesting :) I am currently studying Data Science and wanted to ask with which methods we can grant/deny permission to certain (internal business) data based on the level of the employee, in an LLM-conversational application? Can we use guardrails for this as well? Thank you a lot in advance for your quick answer!

  • @aneeinaec, 2 months ago

    Is that Ryan Gosling ❤

  • @bajawell, 2 months ago

    def correct.. there's body language, tone, etc... llm is just too low bandwidth

  • @CarlosMatiasdelaTorre, 2 months ago

    Demetrios, my man, I loved the format. Now I'm off to read the post 😅

    • @MLOps, 2 months ago

      letsss gooooooo! i am going to start trying to do more of these! thanks for the support!

  • @zowsan8362, 2 months ago

    🎉

    • @MLOps, 2 months ago

      whoooo!

  • @bishalkarki7722, 2 months ago

    😮😮

  • @sworuplamichhane1214, 2 months ago

    Very very helpful thanks 🙏

    • @MLOps, 2 months ago

      Awesome! glad you liked it!

  • @sndrstpnv8419, 2 months ago

    Could you share a link to this dataset?

  • @dattran6096, 2 months ago

    Like for the intro song

  • @matthewrice7590, 3 months ago

    Thanks for the overview! Michelangelo seems like quite the feat of engineering. Major kudos to the engineers who designed and built this out. Would be awesome to get more insight into how they calculated the trade-off between costs associated with the long-term development/maintenance/management of a custom system like this versus using something off the shelf and fully managed like Sagemaker/VertexAI/etc. Obviously you are going to be paying a premium for something like Sagemaker, but I can't help but be skeptical that going with a custom approach like this would pay-off in the long term considering the significant engineering effort that must go into ongoing development and refinement, especially when considering the immense complexity of a system like this. Maybe I'm just not thinking big enough haha

    • @CarlosMatiasdelaTorre, 2 months ago

      For somewhat big companies building an internal dev platform makes a lot of sense to avoid vendor lock-in, ensure long term support, abstract services, ensure compliance across geographies, improve security and cost management, etc. For smaller companies it may not be the same.

    • @matthewrice7590, 2 months ago

      @CarlosMatiasdelaTorre good points!

  • @billykotsos4642, 3 months ago

    This sounds like a proper LLM optimization software. Langchain is just something that is stitched together in a very brittle way... Whereas DSP looks to ground itself on very concrete foundations!

  • @kylebarone1710, 3 months ago

    Any related GitHub links for this? And any thoughts on whether using the Python vLLM library for paged attention could get to 1M requests a day for 15/48$?

  • @PacoNathan, 3 months ago

    Loved this!!

  • @voncolborn9437, 3 months ago

    Where can I find Linus' previous talk from a year ago? Would you post a link, please? Stop looking. I found it: czcams.com/video/rd-J3hmycQs/video.htmlsi=g6uWd96bxXuqgWXF

  • @ramyaram, 3 months ago

    Very insightful!!

  • @pranav7471, 3 months ago

    I think for the text you meant to use a BERT model rather than a CNN based model

  • @MrAcenit, 3 months ago

    Great presentation!

  • @baplkkk, 3 months ago

    Brilliant talk

  • @irrelevantdata, 3 months ago

    Are "question", "answer", "long_document" and "summary" variables, and if so, where are they declared? 5:17 What operation does this symbol "->" perform?