
Ollama can run LLMs in parallel!

  • Published 18 Aug 2024
  • In this video, we're going to learn how to run LLMs in parallel on our local machine using Ollama version 0.1.33.
    #ollama #llms #llama3 #phi3
    Code - github.com/mne...
    Ollama 0.1.33 - github.com/oll...
    Blog post - www.markhneedh...

Comments • 26

  • @36mcast
    @36mcast 3 months ago +3

    Fantastic video and thanks for sharing!
    A few notes: I already had Ollama installed and had to stop it before running the commands.
    1. Stop any running Ollama service first, e.g. `systemctl stop ollama.service` on Ubuntu (on macOS, quit the Ollama app), before running `OLLAMA_NUM_PARALLEL=4 OLLAMA_MAX_LOADED_MODELS=4 ollama serve`
    2. Line 12 in your code will not work on some streamlit versions, as the API key is set to `api_key="ignore-me"` but it should be `api_key="ollama"` by default

    • @learndatawithmark
      @learndatawithmark 3 months ago +1

      Hey, thanks for your kind words. Regarding:
      1. Yes, you are right. I had manually killed it on my machine, but your way is better. And if you want those environment variables to persist you'd want to set them in your .bashrc file or similar.
      2. Do you mean on some openai library versions? I didn't realise it was used unless you were calling OpenAI itself, but I will use 'ollama' from now on!

    • @36mcast
      @36mcast 3 months ago

      @learndatawithmark For 2: this happens with any open-source model; for some reason it did not work for me. It might be related to the package/Ollama versions. GenAI libraries are updated about twice a week, to the point where what you see on the internet today does not work next week.
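
A minimal sketch of the client-side setup discussed in point 2 above, assuming Ollama's OpenAI-compatible endpoint on localhost:11434 (the model names and prompt are placeholders, not the exact code from the video):

```python
from openai import OpenAI

# Ollama does not check the API key, but "ollama" is the conventional placeholder value.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")

# With OLLAMA_MAX_LOADED_MODELS=4, both models can stay loaded at the same time.
for model in ["llama3", "phi3"]:
    response = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": "Why is the sky blue?"}],
    )
    print(model, "->", response.choices[0].message.content)
```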

  • @123arskas
    @123arskas 3 months ago

    Thanks for the code and the information

  • @thomasrodermond6057
    @thomasrodermond6057 3 months ago

    Very good work. Thank you!

  • @goktugkoksal8643
    @goktugkoksal8643 1 month ago +1

    Super

  • @SonGoku-pc7jl
    @SonGoku-pc7jl 1 month ago

    thanks!

  • @ajmalm1
    @ajmalm1 2 months ago +1

    Can you explain how to use this parallel capability with the ollama Python library?

  • @tlubben972
    @tlubben972 2 months ago

    I agree, it would be great if you could provide the code for doing this in Python 😊

    • @learndatawithmark
      @learndatawithmark 2 months ago

      Do you mean in Python in general or specifically how I did it to build the UI I used here?
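
A rough sketch of one way to do this with the ollama Python package, assuming a server started with `OLLAMA_NUM_PARALLEL=4` (the model name and prompts are placeholders):

```python
from concurrent.futures import ThreadPoolExecutor
import ollama

def ask(prompt: str) -> str:
    # Each call is an independent HTTP request, so a server started with
    # OLLAMA_NUM_PARALLEL > 1 can process several of them at the same time.
    response = ollama.chat(
        model="phi3",
        messages=[{"role": "user", "content": prompt}],
    )
    return response["message"]["content"]

prompts = ["Why is the sky blue?", "Explain recursion in one sentence."]
with ThreadPoolExecutor(max_workers=4) as pool:
    for answer in pool.map(ask, prompts):
        print(answer)
```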

  • @123arskas
    @123arskas 3 months ago

    Can you show the Parallelism of Ollama through LangChain too? Thank you

    • @learndatawithmark
      @learndatawithmark 3 months ago

      What would be a good LangChain example - showing how to call Ollama multiple times via LangChain? Is that what you had in mind?

    • @123arskas
      @123arskas 3 months ago

      @learndatawithmark
      Summarization of multiple YouTube transcripts, one by one. Let's say 100 of them.
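
A hedged sketch of how that batch summarisation could look, assuming the langchain-ollama integration package and placeholder transcript strings; `.batch()` sends the requests concurrently, so a server started with `OLLAMA_NUM_PARALLEL` can work on several summaries at once:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import ChatOllama

llm = ChatOllama(model="llama3")
prompt = ChatPromptTemplate.from_template("Summarise this transcript:\n\n{transcript}")
chain = prompt | llm

# Placeholder transcripts; in practice these would be the 100 YouTube transcripts.
transcripts = ["transcript one ...", "transcript two ...", "transcript three ..."]

# batch() fans the inputs out concurrently (capped here at 4 at a time),
# so Ollama receives several requests in parallel.
summaries = chain.batch(
    [{"transcript": t} for t in transcripts],
    config={"max_concurrency": 4},
)
for summary in summaries:
    print(summary.content)
```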

  • @amirhosseinbayani5297
    @amirhosseinbayani5297 10 days ago

    Thanks for your great video. Could you please help me set up the same configuration in VS Code? I used `os.environ["OLLAMA_NUM_PARALLEL"] = "32"` and `os.environ["OLLAMA_MAX_LOADED_MODELS"] = "2"`, but I could not load two models and parallelization did not work either.

    • @learndatawithmark
      @learndatawithmark 10 days ago

      I think if you download the latest version of Ollama that's all gonna be automatically configured?
      By VS Code, do you mean you're using the terminal inside VS Code? I'm a bit confused why you're trying to set the parameters from your Python code - you need to have them set before you launch the Ollama server.

    • @amirhosseinbayani5297
      @amirhosseinbayani5297 7 days ago

      @learndatawithmark Yes, I used the terminal inside VS Code. So you suggest it is better to set the parameters on the Ollama server? Could you please help me with setting them on Windows?

  • @modx5534
    @modx5534 2 months ago

    How does this work with the Docker version of Ollama? Can someone please help me here?

    • @learndatawithmark
      @learndatawithmark 2 months ago +1

      I haven't tried with Docker yet, but I would think we can pass them in as env variables. Let me look more into it and I'll let you know.

    • @modx5534
      @modx5534 2 months ago

      @learndatawithmark Thank you so much :D

  • @anilrajshinde7062
    @anilrajshinde7062 2 months ago

    All your videos are great. Can you prepare a few videos on LLM OS where we can use Ollama?

    • @learndatawithmark
      @learndatawithmark 2 months ago

      I haven't heard of LLM OS before - can you explain it a bit more?

    • @anilrajshinde7062
      @anilrajshinde7062 2 months ago

      @learndatawithmark czcams.com/video/6g2KLvwHZlU/video.html

  • @karthikb.s.k.4486
    @karthikb.s.k.4486 3 months ago

    What IDE are you using? Please let me know.

    • @learndatawithmark
      @learndatawithmark 3 months ago

      The code is in Vim

    • @learndatawithmark
      @learndatawithmark 3 months ago

      But I'm only using Vim for the video as it lets me make the text really big. Usually I use VS Code