Writing Better Code with Ollama
- added Jan 25, 2024
- Copilot changed everything for developers around the world. Then they started charging for it. It won't work offline, and there are the security and privacy concerns. Well, you can have the functionality of Copilot without all the headaches.
Be sure to sign up to my monthly newsletter at technovangelist.com/newsletter
And if you're interested in supporting me, sign up for my Patreon at / technovangelist
- Science & Technology
Great content! So much covered here, and it didn't even feel rushed given the short amount of time to cover all this.
Hi Matt. I love Ollama and you just make me love it even more. I look forward to your videos. They are always concise, informative and intelligent. Thank you for your work.
Thanks so much for saying that. Let me know if there are any topics you would like to see.
I was already appreciating you when you explained how to code with llama, and you had already said you were into building cool things, but when you made the joke about being stuck inside coding rather than having to put up with the nature of Puget Sound, I subscribed.
Great video. I’ve found those two extensions to be the best as well. The small, fast model for the autocomplete; the bigger, better model for Continue. Deepseek for both, but I haven't tried Codellama. Complete game changer for offline coding!
Matt, you are spoiling us with the amount of uploads. Keep it up, but remember to rest, it's the weekend.
I wish I could go faster.
I’m not planning to slow down for a few months
Really good stuff Matt. I'm excited about the Ollama python stuff. Good to know we've got some coding support as you highlighted. Cheers.
Thanks so much for the comment. Let me know if you have any ideas for content to cover on here.
There is a tremendous amount of work the Ollama team is doing
Thanks a lot 😊
I'm subscribed. Really awesome, high-quality video with a good demo
Terrific. Thanks for contributing so much to the community!
Thanks. Making these is a lot of fun
I tested the pre-release version of the Continue extension for VS Code with Ollama and set Deepseek as the model.
Amazing! I can’t believe I can use such powerful AI autocomplete in my VSCode for free…
For Free! And it works so well
Hi Matt, Ollama is running slower than LM Studio with full GPU offload; is there a way to configure this with Ollama as well? Thanks for the great content
Love it! :D It's like I am on the island with you, man! I live in Sammamish
Nice. I used to work over there when I was at Microsoft. I was a sales engineer and my sales guy was based there. That’s when I was living in Bellevue.
Great stuff, Matt. I'm going to try this on my machine. So far I've been using ChatGPT 4 as my coding companion because my results with CodeLlama (running in Ollama) haven't produced code as good as that coming out of ChatGPT, and it's slower on my 16GB M1 MBP. However, I'd like to play around with different models and see how they do. Cheers.
Using codellama at the CLI isn't always convenient. But getting Llama Coder to suggest for me is pretty great.
Matt: great video and very timely. I'm tasked with writing a web dashboard to talk with Ollama and Ollam3 LLM. I have it downloaded on Windows and talking to me. I also downloaded it under Ubuntu; also working there. Using VS Code, I installed the Llama Code addon. I see it in the lower right. However, I write import Ollama from "ollama" and get the message "Cannot find module 'ollama' or its corresponding type declarations." Getting the environment for development and tools set up and working is probably a bigger obstacle than learning how to use the API. Any advice on the missing module? Thanks
Sounds unrelated to the plugin. It's just that you haven't added the ollama package to the environment
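For anyone hitting the same error: "Cannot find module 'ollama'" goes away once the npm package is installed in the project (npm install ollama from the project root). A dependency-free alternative is to call the HTTP API directly with Node's built-in fetch; the sketch below assumes the default port and uses an illustrative model name.

```javascript
// Build a non-streaming request for Ollama's /api/chat endpoint.
// The model name is illustrative; use whatever model you have pulled.
function buildChatRequest(model, prompt) {
  return {
    url: "http://127.0.0.1:11434/api/chat",
    body: JSON.stringify({
      model,
      messages: [{ role: "user", content: prompt }],
      stream: false, // one JSON response instead of a token stream
    }),
  };
}

// Send the request and return the assistant's reply text.
async function ask(model, prompt) {
  const { url, body } = buildChatRequest(model, prompt);
  const res = await fetch(url, { method: "POST", body });
  const data = await res.json();
  return data.message.content;
}

// Only meaningful when the Ollama service is running locally.
ask("codellama", "Say hi in one word")
  .then(console.log)
  .catch((err) => console.error("Could not reach Ollama:", err.message));
```

The npm ollama package wraps this same API in a nicer interface, so installing it is usually the better long-term choice.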
Hi. I actually play around with Ollama in VS Code, so I have a question. What is the Llama Coder extension for? I installed it, but couldn't figure out what it does or how to use it. Maybe I configured something wrong? But from the documentation it's also not clear how to properly use it. So now I don't know if I did something wrong 😢
intriguing and useful! Subscribed.
Awesome. If you have anything you would like to see in the future, let me know.
Great video Matt! After an ollama upgrade, I had Continue integration without issue. What config is required for code suggestion/completion? And would the process be different for python code completion (as opposed to ts/js you demonstrated)?
I don’t think there is any difference with Python or Rust or C# or anything else. But Continue isn’t for tab completion, only the chat interface… I think.
I'd love to hear any recommendations for AI Coders that can make UI mockups and iterate on UI mockups.
That would be interesting
Greetings from Victoria!
Hello there! Victoria is beautiful. We were there a year ago or so to attend a wedding at Hatley Castle in Colwood. Stayed at the Empress and it was amazing... always wanted to stay there after going to the Indian buffet that has been closed for years.
2:50: Where can I find it for VSCodium?
Thank you, Matt! I've been looking for an extension just like this. I looked at Cody, but it uses LM Studio to interface with the LLM, and I haven't messed with LM Studio yet (Linux guy, and that product is a version behind and in beta for Linux.)
Actually it works with Ollama too, since about two weeks after we started building Ollama.
@@technovangelist Thanks! I didn't know.
Perhaps a video on hooking up cody with ollama would be popular. I'd sure be interested.
Yup. That’s definitely one to do. Thanks.
Can you make a video on eGPU settings and getting Ollama to use one as the preferred setup?
It would be interesting to see what AI and machine learning could come up with as far as Neuro-training and neuro-therapy protocols.
Can you tell me more about that? What does that mean?
@@technovangelist It’s a behaviorist approach to mental health. Basically, a psychoanalyst does an initial assessment, then a technician uses neurofeedback equipment to generate a baseline of the client’s brain function. The psychoanalyst then uses the baseline combined with the initial consult to identify issues and develop a course of treatment, which involves using the neurofeedback to interact with a program that helps restore the problem areas of the brain to some degree of normality. All this is based on the theory that mental illness is a function of neurological deregulation, which is just a fancy way of saying that, for any number of reasons, the sundry Brodmann areas of the client’s brain are out of sync and not communicating properly, and need help getting back into sync.
Interesting. That's a world I know nothing about. Thanks so much for filling me in.
@@technovangelist Yeah, after I asked the question I realized it was probably a little too niche. I sometimes forget not everyone has read the same books as me.
Brilliant!
Thanks for the comment. Glad you enjoyed the video.
"Continue" can't use Ollama hosted on another laptop of mine, while Llama Coder can
I suppose it can, but I'll need to dig deep into the settings
Cool! thank you
Thanks for giving it a watch. And for leaving a comment
how do I do it if Ollama is on my LAN?
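One approach (a sketch, not verified against every client): start the server bound to all interfaces with OLLAMA_HOST=0.0.0.0 ollama serve on the machine that has the GPU, then point clients at that machine's LAN address instead of localhost. The IP address below is a placeholder for your own server.

```javascript
// Base URL of an Ollama instance elsewhere on the LAN.
// 192.168.1.50 is a placeholder; substitute your server's address.
const OLLAMA_BASE = process.env.OLLAMA_BASE ?? "http://192.168.1.50:11434";

// Build the URL for the model-listing endpoint.
function tagsUrl(base) {
  return `${base}/api/tags`;
}

// List the models available on the remote instance.
async function listModels(base = OLLAMA_BASE) {
  const res = await fetch(tagsUrl(base));
  const data = await res.json();
  return data.models.map((m) => m.name);
}

listModels()
  .then((names) => console.log("Remote models:", names))
  .catch((err) => console.error("Could not reach", OLLAMA_BASE, err.message));
```

Editor plugins that let you set a custom endpoint or apiBase can point at the same URL; check each extension's settings for where that goes.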
hey, cool video! could you maybe do a video about mixtral8x7b?
I'd love to see real metrics: speed improvement, code quality, etc. Also, am I using up 4 GB for each application/plugin that uses the same model?
Haven't seen any metrics that accurately reflect stuff like that. Can you point to anything? I don't know about other tools, but if using Ollama, then all the tools would share the memory.
@@technovangelist I meant disk storage for each application's version of the model data. E.g. I just downloaded the codellama params from Meta. Will a VS Code plugin use those parameters or download another set? If I have two plugins, will each have its own copy of the model parameters?
Regarding sharing memory, I don't understand how it would be possible for two applications to share memory unless there were something like a model-server protocol, where a server loaded the models and the applications communicated with that server.
Both apps are making a connection to Ollama. Ollama is actually running the model, so disk and memory are shared
@@technovangelist Thanks. I've finally got a couple models downloaded. Stuck getting torchrun example in README to work. Will look for a setup tutorial. Thanks again.
torchrun?? What's that? Whose README are you talking about? Sounds like a Python thing. Ollama doesn't use that, apart from in their Python library.
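Regarding the disk/memory sharing discussed above: every plugin is just an HTTP client of the one local server, so there is a single copy of the weights on disk and, once loaded, a single copy in memory. You can see this directly by asking the server which models it currently has loaded; this sketch assumes a recent Ollama that exposes the /api/ps endpoint.

```javascript
// Endpoint that reports which models the server has loaded in memory.
const PS_URL = "http://127.0.0.1:11434/api/ps";

// Extract just the model names from the /api/ps response shape.
function modelNames(data) {
  return data.models.map((m) => m.name);
}

// Every editor plugin pointed at this server shares these loaded copies.
async function loadedModels(url = PS_URL) {
  const res = await fetch(url);
  return modelNames(await res.json());
}

loadedModels()
  .then((names) => console.log("Loaded models:", names))
  .catch((err) => console.error("Could not reach Ollama:", err.message));
```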
Cody is not local; if I remember correctly, it has a free plan with a very limited number of requests per month to the endpoint.
Yeah. Quinn mentioned that they still tokenize and cache on their services even if the model is local.
Before running the JavaScript code you need to run $ ollama run though, right?
No. The service is already running. ollama run is just for starting the command-line UI for interactive use.
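In other words, the background service is already listening on port 11434 even when no ollama run session is open, so JavaScript can talk to it directly. A minimal sketch using Node's built-in fetch against the /api/generate endpoint (the model name is illustrative):

```javascript
const BASE = "http://127.0.0.1:11434";

// Non-streaming request body for /api/generate.
function generateBody(model, prompt) {
  return JSON.stringify({ model, prompt, stream: false });
}

// Ask the already-running service for a completion.
async function generate(model, prompt) {
  const res = await fetch(`${BASE}/api/generate`, {
    method: "POST",
    body: generateBody(model, prompt),
  });
  const data = await res.json();
  return data.response;
}

generate("codellama", "// a one-line JS comment about arrays")
  .then(console.log)
  .catch((err) => console.error("Service not reachable:", err.message));
```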
@technovangelist When i just run the JS code i get this error:
TypeError: fetch failed
at Object.fetch (node:internal/deps/undici/undici:11730:11)
at process.processTicksAndRejections (node:internal/process/task_queues:95:5) {
cause: Error: connect ECONNREFUSED 127.0.0.1:11434
can you tell me more about what you are running?
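For anyone else hitting this: connect ECONNREFUSED 127.0.0.1:11434 means nothing is listening on Ollama's default port, i.e. the service isn't running yet. A quick connectivity check, assuming the default port:

```javascript
// Ping Ollama's version endpoint to confirm the service is up.
const VERSION_URL = "http://127.0.0.1:11434/api/version";

async function checkOllama(url = VERSION_URL) {
  try {
    const res = await fetch(url);
    const { version } = await res.json();
    return `Ollama is up (version ${version})`;
  } catch {
    return "Ollama is not reachable; start the app or run `ollama serve`";
  }
}

checkOllama().then(console.log);
```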
Great content! Which font is that in VS Code?
Hmmm, I think I set it to JetBrains ages ago
That's fantastic! I'm curious about availability: which languages can it autocomplete for?
I’m not sure what the full list is. It’s most of the popular langs
@@technovangelist Golang?
Golang is definitely among the most popular. I would expect to see Go, Java, JS, TS, Python, Rust. I plan to do a video on this.
looking forward to it :)
Cool. I'll have to look for something similar for pycharm.
I feel like Continue worked with JetBrains. Is that who makes PyCharm?
@@technovangelist Yep. Thanks - Continue looks good, though it gets bad reviews in PyCharm, so I'll have to see. CodeGPT seems to be another option.
I tried Llama Coder; it didn't work.
Continue worked. Others are flaky at best. Thanks for the videos, I will look at more extensions and models for my needs.
If you have any ideas, that's great.
Q: Llama Coder always said the model is not available, but it is there, and Continue can use it and respond back to me
Sorry you had problems. Have you asked in the Ollama Discord? discord.gg/ollama.
What are the hardware requirements?
Any machine that runs Ollama should be fine. Or are you asking what’s needed to run Ollama? For Mac it’s best on Apple Silicon. On Windows or Linux it’s best for now with an Nvidia GPU, and soon with AMD GPUs.
@@technovangelist , yes, I figured it would need a gpu; otherwise, it would be ridiculously slow.
Llama Coder vs Aider. Which is better?
Better? Hard to say. For me, personally, I think there is no question that Llama Coder is better. Aider seemed hard to get started with and a bit kludgy, in my opinion. Once they are installed, and just looking at what they do rather than the UI, they are identical. All of the tools in this space are basically identical; the models do the hard work here.
How can one contact you for consulting engagements?
There is probably an email on this account somewhere. Or you can DM me on the Discord: discord.gg/ollama. I don’t think I want to do that, but I'm open to a conversation.
My system can't handle it and it would just crash
These things require a minimum of 64 GB of RAM to work anywhere near GPT-3.5. Not worth it below that; they also require a good GPU with around 16 to 24 GB of VRAM, which is damn expensive too. Better to use GPT-3.5 or pay for GPT-4. Bard sucks damn hard at coding and nowhere near gives responses as nice as GPT.
Most of the models require 16 GB to work well, though some of the more exciting models are great in as low as 4-8 GB. Deepseek Coder 1.6 is pretty amazing and will definitely run on low-end hardware.
Mac or Windows? You forgot the Linux shortcut, sir.
I'm on Mac. Ollama works on Mac and Linux, and for now on Windows with WSL2
I have already paid for Copilot for one year…
I remember that problem happened a lot with New Relic. Folks would see Datadog and want to switch over because it was multiple orders of magnitude cheaper than anything else on the market, but they had already signed multi-year contracts with New Relic and were stuck.
WTF, why the white theme?
Because it’s better. And easier on the eyes.
Windows users cry in the corner.
Why? Ollama has worked on Windows for months.
@@technovangelist When I visited the Ollama website it said "available for macOS & Linux, Windows coming soon"
Just use it via WSL 2
@@technovangelist It says for download that windows is "coming soon".
Running this under WSL2 is dog-slow, at about 0.5 tokens per second on a $3000 gaming laptop.
If you don’t have a GPU it will be slow. A native Windows app won’t be able to go any faster.