How to Deploy NVIDIA NIM in 5 Minutes

  • Published 10 Sep 2024
  • NVIDIA NIM is a set of microservices for deploying AI models. Tap into the latest AI foundation models, such as Stable Diffusion, ESMFold, and Llama 3, with downloadable NIM microservices for your application deployment.
    Join Neal Vaidya, developer advocate at NVIDIA, for a demo on how to quickly deploy NVIDIA NIM microservices locally with Python or programmatically through Docker. This tutorial focuses on deploying Llama 3.
    0:22 - Overview of NIM microservices (nvda.ws/4bZLY9E)
    0:36 - Test the Llama 3 model on a web browser with a hosted API
    0:51 - Generate an API key and get sample code snippets
    0:59 - Test the Llama 3 model in a self-hosted environment
    1:08 - Get access to the API catalog to begin self-hosted deployment
    1:22 - Pre-install Docker engine and Docker CLI tool
    1:50 - Authenticate your container
    1:55 - Set an environment variable holding the NGC API key
    2:05 - Input a single Docker run command
    2:19 - Expose all GPUs to the running container
    2:28 - Expose the API environment variable
    2:35 - Mount the cache to download and store model weights
    2:48 - Specify that the NIM should run as the local user
    2:53 - Expose the main port to interact with the running NIM
    3:03 - Add the model name to the image path
    3:30 - Confirm the service is ready in another terminal using curl
    3:41 - Send a request to the container
    Developer resources:
    ▫️ Learn more about NIM: nvda.ws/3yqsuNw
    ▫️ Join the NVIDIA Developer Program: nvda.ws/3OhiXfl
    ▫️ Access downloadable NIM microservices on the API catalog: nvda.ws/4bZLY9E
    ▫️ Read the Mastering LLM Techniques series to learn about inference optimization, LLM training, and more: resources.nvid...
    #inferencemicroservices #inferenceoptimization #api #selfhosting #modeldeployment #aimodel #LLM #generativeai #aimicroservices #nvidianim #generativeaideployment #aiinference #productiongenai #enterprisegenerativeai #acceleratedinference #nvidiaai #apicatalog
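
The self-hosted steps above (1:22 through 3:03: authenticate, set the key, and issue a single `docker run` command) can be sketched as a shell session. The image tag `nvcr.io/nim/meta/llama3-8b-instruct` and the cache path are assumptions based on NVIDIA's published NIM examples; check the API catalog page for your model to get the exact image name.

```shell
# Authenticate with NVIDIA's container registry (nvcr.io) using your NGC API key.
export NGC_API_KEY="<paste your key from the API catalog>"
echo "$NGC_API_KEY" | docker login nvcr.io --username '$oauthtoken' --password-stdin

# Create a local cache directory so model weights are downloaded and stored only once.
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p "$LOCAL_NIM_CACHE"

# Single docker run command:
#   --gpus all      expose all GPUs to the running container
#   -e NGC_API_KEY  pass the API key environment variable into the container
#   -v ...          mount the cache used to download and store model weights
#   -u "$(id -u)"   run the NIM as the local user
#   -p 8000:8000    expose the main port used to interact with the running NIM
docker run -it --rm \
  --gpus all \
  -e NGC_API_KEY \
  -v "$LOCAL_NIM_CACHE:/opt/nim/.cache" \
  -u "$(id -u)" \
  -p 8000:8000 \
  nvcr.io/nim/meta/llama3-8b-instruct:latest
```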
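
Once the container is up, the readiness check and first request (3:30 and 3:41 above) look roughly like this from another terminal. The `/v1/health/ready` path and the OpenAI-compatible `/v1/chat/completions` endpoint follow NVIDIA's NIM examples, and the model name is an assumption matching the image sketched above.

```shell
# Confirm the service is ready (returns HTTP 200 once the model is loaded).
curl http://localhost:8000/v1/health/ready

# Send the running NIM a chat completion request (OpenAI-compatible API).
curl -X POST http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
        "model": "meta/llama3-8b-instruct",
        "messages": [{"role": "user", "content": "Write a limerick about GPUs."}],
        "max_tokens": 64
      }'
```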

Comments • 9

  • @TristanVash38 • 1 month ago +1

    Awesome! Thanks, NVIDIA Team!

  • @infraia • 14 days ago

    Things are moving fast! Exciting times!

  • @cho7official55 • 1 month ago +1

    Very nice tutorial, thanks a lot

  • @JayMatth • 1 month ago

    Very nice! I will surely have a look at this :)

  • @MeownaMeow • 1 month ago

    Nice, thanks 😁

  • @saitaro • 1 month ago +1

    Where is the notebook in the description?

    • @StiekemeHenk • 1 month ago

      Rip, I think it's this article? Build a RAG using a locally hosted NIM

  • @wyattx008
    @wyattx008 Před měsícem +1

    So... can you make me some money? 😇