MLOps World: Machine Learning in Production
Making Enterprise GenAI Safe and Effective - Tools and Approaches
Speakers:
Rahm Hafiz, CTO, AutoAlign AI
Dan Adamson, Interim Chief Executive Officer and Co-Founder, AutoAlign AI
AutoAlign CTO Rahm Hafiz will show how different approaches (fine-tuning, moderation guardrails, and sidecars) can be used to deploy AI safely. Rahm will set up a sidecar and show how it can serve as an automated guardrail system that dynamically interacts with LLMs to make them safe, effective, and compliant without losing efficacy and without having to retune every time your LLM changes.
Views: 60

Video

Running prompts at CI does not make your GenAI app enterprise ready
121 views, 2 months ago
Speaker: Jakob Frick, CTO, Radiant AI
The BEST component for your RAG system
328 views, 2 months ago
Speaker: Jeffrey Kim, AutoRAG Lead Dev, Markr Inc. In this session, I will talk about the importance of optimizing a RAG system and briefly show how to use AutoRAG to automatically optimize the RAG system for your data. It will help you boost RAG performance quickly and easily. There are many RAG pipelines and modules out there, but you don’t know what pipeline is great for “your...
Why AI apps don't work in prod: AI Reliability Survey
60 views, 2 months ago
Speaker: Shreya Rajpal, CEO, Guardrails AI. Despite the initial frenzy around the impact of AI on software projects, the actualized impact remains limited. This is in large part because AI has inherent variability, which leaves engineering orgs stumped by the dreaded question "how do I know it won't break in prod even though it works in dev". In this talk, Shreya will cover why reliability for A...
What It Actually Takes to Deploy GenAI Applications to Enterprises Custom Evaluation Models
95 views, 2 months ago
Speakers: Alexander Kvamme, CEO, Echo AI; Arjun Bansal, CEO & Co-founder, Log10. Alexander Kvamme and Arjun Bansal will share Echo AI's journey in deploying their conversational intelligence platform to billion-dollar retail brands. They will discuss the challenges faced due to LLM accuracy issues, which impacted their ability to deploy at scale. The speakers will talk about the iterative prompt ...
Lessons learned from scaling large language models in production
108 views, 2 months ago
Speaker: Matt Squire, CTO, Fuzzy Labs. Open source models have made running your own LLM accessible to many people. It's pretty straightforward to set up a model like Mistral, with a vector database, and build your own RAG application. But making it scale to high traffic demands is another story. LLM inference itself is slow, and GPUs are expensive, so we can't simply throw hardware at the problem....
From Idea to Production: AI Infra for Scaling LLM Apps
163 views, 2 months ago
Speaker: Guy Eshet, Product manager, Qwak AI applications have to adapt to new models, more stakeholders and complex workflows that are difficult to debug. Add prompt management, data pipelines, RAG, cost optimization, and GPU availability into the mix, and you're in for a ride. How do you smoothly bring LLM applications from Beta to Production? What AI infrastructure is required? Join Guy in t...
LLM Fine-Tuning for Modern AI Teams: How One E-Commerce Unicorn Cut Inference Cost by 90%
74 views, 2 months ago
Speaker: Emmanuel Turlay, CEO/Founder, Airtrain AI While commercial LLMs such as GPT-4 and Claude 3 Opus offer amazing generative quality, small open-source fine-tuned models such as Mistral 7B and Phi-2/3 can offer similar performance on specific tasks, for a fraction of the cost, and with much more control. However, this has been proven to be true only when the tuning dataset is of high quali...
Function Calling for LLMs: RAG without a Vector Database
188 views, 2 months ago
Speaker: Jim Dowling, CEO, Hopsworks. In this talk, we will look at extending RAG with Function Calling to access structured/tabular data. We will look at how to enrich your tables with metadata and the expressivity of the queries that you can reasonably expect to perform well. We will examine function calling in the context of queries to the Hopsworks feature store, which supports extensive meta...
Finding training inefficiencies with CentML DeepView
20 views, 2 months ago
Speaker: Yubo Gao, Research Software Development Engineer at CentML Inc. and PhD student at University of Toronto. Performance bottlenecks and resource underutilization are common occurrences for deep learning researchers and developers. They slow down the workflows of ML developers and waste computational resources. The current ecosystems of DL profilers do not provide a developer-frien...
Evaluating LLMs and RAG Pipelines at Scale
314 views, 2 months ago
Speaker: Eric O. Korman, Cofounder / Chief Science Officer, Striveworks. Large Language Models (LLMs) and their applications, such as Retrieval-Augmented Generation (RAG) pipelines, present unique evaluation challenges due to the often unstructured nature of their outputs. These challenges are compounded by the variety of moving parts and parameters involved, such as the choice of underlying LL...
Empowering Data Science Teams: Harnessing AI with Appen
28 views, 2 months ago
Speakers: Sasha McGrath, Account Executive, Appen; Geoff LaPorte, Adoption Program Manager, Applied AI, Appen. In an era driven by data and powered by artificial intelligence, the effectiveness of data science teams hinges upon access to high-quality data and robust collaboration tools. Our presentation unveils a comprehensive platform designed to revolutionize how data science projects are execu...
Better Chatbots with Advanced RAG Techniques
234 views, 2 months ago
Speaker: Zain Hasan, Developer Advocate, Weaviate Chatbots are becoming increasingly popular for interacting with users, providing information, entertainment, and assistance. However, building chatbots that can handle diverse and complex user queries is still a challenging task. One of the main difficulties is finding relevant and reliable information from large and noisy data sources. In this ...
Enhance Cost Efficiency in Domain Adaptation with PruneMe
56 views, 2 months ago
Speaker: Shamane Siri, Ph.D., Head of Applied NLP Research, Arcee.ai. Our PruneMe repository, inspired by "The Unreasonable Ineffectiveness of the Deeper Layers," demonstrates a layer pruning technique for Large Language Models (LLMs) that enhances cost efficiency in domain adaptation. By removing redundant layers, we facilitate continual pre-training on streamlined models. Subsequently, these ...
Data Versioning in Generative AI: A Pathway to Cost-effective ML
35 views, 2 months ago
Speaker: Dmitry Petrov, CEO, DVC For 5 years we have been building DVC and we know how data versioning helps teams. The evolving Generative AI workflows are different and require an evolution of versioning workflows to accomplish Generative AI goals. This new era thrives on vast amounts of unstructured data, which include everything from images, videos, and audio, to MRI scans, document scans, ...
Building ML and GenAI Systems with Metaflow
103 views, 2 months ago
Efficiently Fine-Tune And Serve Your Own LLMs
92 views, 2 months ago
The Who, What, and Why of Data Lake Table Formats
49 views, 2 months ago
Private, Local AI
156 views, 2 months ago
The Journey of Building a Leading Open Source LLM Security Toolkit
98 views, 2 months ago
The Secret Sauce for Deploying LLM Applications into Production
80 views, 2 months ago
Running Multiple Models on the Same GPU, on Spot Instances
134 views, 2 months ago
Towards Robust GenAI: Techniques for Evaluating Enterprise LLM Applications
79 views, 2 months ago
Introducing Arize-Phoenix and OpenInference
342 views, 2 months ago
Mitigating RAG Hallucinations with Aporia Guardrails
91 views, 2 months ago
LLMs From Dream to Deployed
28 views, 2 months ago
Evaluation Engineering: Iterative Strategies to Testing Prompts
162 views, 2 months ago
Customizable RAG Workflows with your Own Data
100 views, 2 months ago
Wanted: A Silver Bullet MLOps Solution for Enterprise
106 views, 2 months ago
Evaluation Techniques for Large Language Models
137 views, 2 months ago