Nvidia Cuda, cuDNN, Conda, PyTorch and TensorFlow Installation with Ubuntu 22.04

650 AI Lab

44,253 views

Added on 24 July 2024
This video is all you need to get your Ubuntu 22.04 Deep Learning machine ready with the following:
1. Ubuntu Kernel 5.18 Update
2. Latest Nvidia Display Driver 515.57
3. Cuda Toolkit 11.7
4. cuDNN 8.0 Installation
5. Conda Cuda Toolkit 11.7
6. Python 3.9
7. Torch with GPU Support
8. TensorFlow with GPU support
GitHub Resources:
github.com/prodramp/DeepWorks...
▬▬▬▬▬▬ ⏰ TUTORIAL TIME STAMPS ⏰ ▬▬▬▬▬▬
- (00:00) Quick Intro
- (01:32) Ubuntu Kernel 5.18 Update
- (02:30) Nvidia Driver update 515.57
- (03:05) Driver install in Recovery Mode
- (04:40) Cuda Toolkit 11.7 Installation
- (05:24) Tools nvcc, gcc, g++, cmake check
- (06:06) cuDNN 8.x installation
- (09:32) Conda Cuda Toolkit 11.7 Installation
- (10:22) Python 3.9 and Torch test with GPU
- (10:45) TensorFlow Installation with GPU
- (11:15) Final installation validation
Connect
------------------
- Prodramp LLC (@prodramp)
- Website - prodramp.com
- LinkedIn - / prodramp
- GitHub- github.com/prodramp/
- AngelList - angel.co/company/prodramp
- Facebook - / prodramp
Content Creator: Avkash Chauhan (@avkashchauhan)
- / avkashchauhan
- / avkashchauhan
Tags:
#nvidia #ai #deeplearning #cnn #ml #lime #aicloud #h2oai #driverlessai #machinelearning #cloud #mlops #model #collaboration #deeplearning #modelserving #modeldeployment #pytorch #datarobot #datahub #streamlit #modeltesting #codeartifact #dataartifact #modelartifact #onnx #aws #kaggle #mapbox #lightgbm #xgboost #dataengineering #pandas #keras #tensorflow #tensorboard #cnn #prodramp #avkashchauhan #LIME #mli #xai #cuda #cuda-nn
Science & Technology

Comments • 71

@linuxbrad 1 year ago ⁺²
Thank you! I wouldn't have even known what questions to ask, but you have enumerated the process quite clearly. Keep up the good work!
@650AILab 1 year ago
Glad it was helpful! Thank you so much for your feedback.
@SpaceExplorer 1 year ago ⁺⁵
Thank you. I have hated how difficult this process has been. I hope this video works!
@650AILab 1 year ago
Appreciate your comment. Thank you so much. It does work; several users have followed it successfully.
@periyasamyshyamsundar8685 2 years ago ⁺¹
Hi Prodramp, thanks for the wonderful tutorial.
@650AILab 2 years ago
Appreciate your comment, and glad to be of assistance.
@oscarllerena2980 1 year ago ⁺⁴
I think it is important to note that when this video was produced, 5.18 was the newest kernel, which is why the author updates to it. Otherwise some newcomers might think they have to downgrade their kernel to 5.18, which is not needed.
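A quick way to act on this note is to compare the running kernel against 5.18 before touching anything. The sketch below assumes GNU `sort -V` is available (it is on stock Ubuntu); the helper name `kernel_at_least` is made up for illustration.

```shell
# Succeeds when kernel version $1 is at least $2 (GNU `sort -V` ordering).
kernel_at_least() {
  lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
  [ "$lowest" = "$2" ]
}

current=$(uname -r)   # for example 5.19.0-42-generic
if kernel_at_least "$current" "5.18"; then
  echo "Kernel $current is already 5.18 or newer; no downgrade is needed."
else
  echo "Kernel $current is older than 5.18."
fi
```

According to the channel's replies further down, the same steps also worked on 5.15, so an older kernel is not necessarily a blocker.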
@alamnoor8668 2 years ago ⁺¹
Thanks for the wonderful tutorial.
@user-yd5ze2cg8v 7 months ago
Great video. Could you please let us know how to set up such an environment when running Ubuntu on a 2013 MacBook Pro with Intel?
@tamizhelakkiya 6 months ago
Hi, this is very useful. Is it mandatory to install Anaconda in the base environment and the CUDA toolkit in the new environment (dl39 in your video)?
@doublesami 2 months ago
I have a few questions because I want to install CUDA toolkit 11.7 and PyTorch 1.3.x with CUDA 11.7:
1: Can I install CUDA toolkit 11.7 with the latest NVIDIA driver version, 535, on Ubuntu 20?
2: You installed the CUDA toolkit twice, once by downloading it from the NVIDIA website and once by running the conda command. Is it compulsory to install the conda-based toolkit as well?
@DarKayserLeo 1 year ago
Is it always recommended to install the latest NVIDIA drivers? In my case I want to install cudatoolkit 11.3. Is there any incompatibility?
@gpligor 1 year ago
Sorry, but at the end of the video, near ~12:00, the output seems to show that no GPUs are found! Why is that?
@manfredkremer7105 10 months ago
I have problems with this procedure. There is a mistake at the very beginning, when updating the kernel: instead of .deb it must be *.deb. I finally figured this out. When I try to install the NVIDIA driver in recovery mode, the installation is terminated because it needs cc, but on my system (I specially prepared a fresh system for this procedure) cc is not found. This makes it difficult to follow your instructions.
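The missing `cc` can be caught before rebooting into recovery mode. This is a minimal sketch, assuming the driver is installed with NVIDIA's .run installer (which compiles a kernel module) and that Ubuntu's `build-essential` package supplies the compiler; the helper name `missing_tools` is hypothetical.

```shell
# Print every tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

missing=$(missing_tools cc gcc g++ make)
if [ -n "$missing" ]; then
  echo "Missing build tools:" $missing
  echo "On Ubuntu, try: sudo apt install build-essential"
else
  echo "Compiler toolchain found."
fi
```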
@sweeterror404 1 year ago ⁺¹
I have the 5.19.0-42-generic kernel?
@skumarreddymallidi915 8 months ago
I am trying to set up a small station with 2 RTX 3060 GPUs, but I am not able to. Can you please guide me?
@themysteriousindian8694 2 years ago ⁺⁷
Manual installation of CUDA is a bit hard to maintain. I recommend using NVIDIA's CUDA containers with Docker. Once that is configured there is no issue, and since gcc issues also happen with other packages, Docker can tackle that problem too.
@homerlol9058 1 year ago
How can I do this?
Do you have any tutorial I could follow?
@themysteriousindian8694 1 year ago
@homerlol9058 You need to use Docker to achieve this.
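For readers with the same question, the container route might look roughly like this. It is only a sketch: it assumes Docker and the NVIDIA Container Toolkit are already installed on the host (the host still needs the NVIDIA driver, but not the CUDA toolkit), and the default image tag is an assumption that should be checked against the tags currently published for `nvidia/cuda`. The helper `cuda_container_cmd` is made up for illustration and only prints the command.

```shell
# Build the docker command for a CUDA container; printing it keeps this a dry run.
# The default image tag is an assumption, not something taken from the video.
cuda_container_cmd() {
  image=${1:-nvidia/cuda:11.7.1-base-ubuntu22.04}
  echo "docker run --rm --gpus all $image nvidia-smi"
}

cuda_container_cmd
# To actually run it: eval "$(cuda_container_cmd)"
```

If the container prints the usual nvidia-smi table, the GPU is visible inside Docker and the toolkit version is pinned by the image rather than by the host.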
@teddethmack 2 years ago ⁺¹
Thank you for this - very useful! Just wondering whether you had a solution for jax not finding the GPU?
@650AILab 2 years ago ⁺¹
Appreciate your comment, thank you so much.
Yes, please check this out: czcams.com/video/auksaSl8jlM/video.html
@Himakarbavikaty 1 year ago ⁺¹
Hi Prodramp,
Thanks for your tutorial.
I did as you taught, but TensorFlow and PyTorch are not recognizing the GPU. I am able to see the GPU with nvidia-smi. Can you please advise?
@650AILab 1 year ago
You have to run a multistep inspection. First stick with PyTorch only and check why the GPU is not detected, then do the same for TF. It's hard to give you the steps here, sorry.
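One possible shape for such a multistep inspection is sketched below; it is not the author's procedure. It checks the driver first, then PyTorch, then TensorFlow, and assumes it is run inside the activated conda environment so that `python3` is the environment's interpreter. The helper `run_check` is hypothetical.

```shell
# Run a command quietly and report whether it succeeded.
run_check() {
  label=$1
  shift
  if "$@" >/dev/null 2>&1; then
    echo "OK: $label"
  else
    echo "FAIL: $label"
  fi
}

run_check "driver responds (nvidia-smi)" nvidia-smi
run_check "PyTorch sees a GPU" python3 -c \
  "import sys, torch; sys.exit(0 if torch.cuda.is_available() else 1)"
run_check "TensorFlow sees a GPU" python3 -c \
  "import sys, tensorflow as tf; sys.exit(0 if tf.config.list_physical_devices('GPU') else 1)"
```

The first FAIL line shows which layer to look at: a failing driver check points at the driver install, while a passing driver check with a failing framework check points at the CUDA or cuDNN libraries visible to that framework.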
@Nomolosos89 1 year ago ⁺¹
Hi, thank you for the tutorial. I have a question: during the driver install, I got a request for “install sign kernel” and things didn’t work out. I tried to install it but got an error because Secure Boot is enabled. Should I disable it? And how should I do that?
@650AILab 1 year ago
You can go back and start the kernel at root level, then install the driver in root mode to avoid the error. If you trust the driver, it should be okay. It comes down to user preference and whether you need the driver.
Thanks for the comment, appreciate it.
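Before deciding, it helps to confirm what the firmware reports. A minimal sketch, assuming the `mokutil` tool is installed (it usually is on Ubuntu); the helper `secure_boot_enabled` is a made-up name.

```shell
# Read `mokutil --sb-state` output on stdin; succeed if Secure Boot is on.
secure_boot_enabled() {
  grep -qi 'SecureBoot enabled'
}

if command -v mokutil >/dev/null 2>&1; then
  if mokutil --sb-state 2>/dev/null | secure_boot_enabled; then
    echo "Secure Boot is enabled: unsigned kernel modules will be rejected."
  else
    echo "Secure Boot is not reported as enabled."
  fi
else
  echo "mokutil not installed; cannot query Secure Boot state."
fi
```

With Secure Boot enabled, the usual choices are to sign the NVIDIA module and enroll the key, or to turn Secure Boot off in the firmware setup screen.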
@shitongmao5265 1 year ago ⁺¹
Thank you, Prodramp, for this helpful tutorial. However, from 3:05 to 4:39 I have no idea what you are talking about. I am a machine learning Ph.D. and just bought a PC for my own projects. I just installed Ubuntu 22.04 and am trying to set up the environment. Sorry, I have never learned about 'start mode', 'recovery mode', or 'user prompt'. Would you please explain the procedure in more detail? I'd really appreciate it!
@650AILab 1 year ago ⁺²
Let me explain what is going on. When a display driver is already installed and loaded, you cannot simply overwrite it; installing over it will give you an error unless the installer has built-in protection that continues the installation after a restart. What I did was start the Ubuntu machine in recovery mode. In this mode only the Linux kernel is loaded, with a few important drivers (disk, network, etc.). Installing the display driver is very easy at this point because the display driver is not loaded, so there is no error from overwriting it. Every Linux installation supports both recovery mode and normal mode, and recovery mode is used to install drivers or fix errors that cannot be fixed in regular mode. You will need to learn these methods to be an effective Ubuntu user. Hope this clarifies your question(s).
@AnvABmai 1 year ago ⁺¹
The CUDA version in the nvidia-smi output does not show the actually installed CUDA toolkit version; it shows the latest CUDA version supported by the current driver. To check the actually installed CUDA version, use the nvcc --version command.
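The two numbers can be printed side by side to make the difference visible. This sketch assumes the usual output formats ("release X.Y" in `nvcc --version`, "CUDA Version: X.Y" in the nvidia-smi header); both helper names are hypothetical.

```shell
# Extract X.Y from the "release X.Y" part of `nvcc --version` output on stdin.
toolkit_version() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

# Extract X.Y from the "CUDA Version: X.Y" field of the nvidia-smi header on stdin.
driver_cuda_version() {
  sed -n 's/.*CUDA Version: \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p'
}

if command -v nvcc >/dev/null 2>&1; then
  echo "Installed toolkit (nvcc): $(nvcc --version | toolkit_version)"
else
  echo "nvcc not found on PATH"
fi
if command -v nvidia-smi >/dev/null 2>&1; then
  echo "Highest CUDA the driver supports: $(nvidia-smi | driver_cuda_version)"
else
  echo "nvidia-smi not found on PATH"
fi
```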
@650AILab 1 year ago ⁺¹
Appreciate your feedback. Thanks.
@AnvABmai 1 year ago
@650AILab You are welcome.
@shubhamkulkarni2352 1 year ago ⁺¹
I followed your steps and installed the driver on reboot successfully, but I am still getting this error:
NVIDIA-SMI has failed because it couldn't communicate with the NVIDIA driver. Make sure that the latest NVIDIA driver is installed and running.
Please help.
@650AILab 1 year ago
Most driver problems are recorded in the installation logs, so if you read the log you will find the exact reason for your trouble. If you share the error, I will be happy to give my feedback on how to solve it.
Thanks for your comment and feedback, sincerely appreciate it.
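A first look at that error could start with whether the kernel module is loaded at all. The sketch below assumes the driver was installed with NVIDIA's .run installer, which by default writes its log to /var/log/nvidia-installer.log; a driver installed through apt will not have that file. The helper `nvidia_module_loaded` is a made-up name.

```shell
# Succeed if a module named exactly "nvidia" appears in a /proc/modules-style file.
nvidia_module_loaded() {
  grep -q '^nvidia ' "${1:-/proc/modules}" 2>/dev/null
}

if nvidia_module_loaded; then
  echo "nvidia kernel module is loaded."
else
  echo "nvidia kernel module is NOT loaded; places to look next:"
  echo "  sudo tail -n 50 /var/log/nvidia-installer.log"
  echo "  sudo dmesg | grep -i nvidia"
fi
```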
@danielwulf4241 1 year ago ⁺¹
For MX Linux users installing CUDA as a deb package:
sudo add-apt-repository contrib
doesn't work out of the box; use instead:
sudo apt-get install software-properties-common
@650AILab 1 year ago
Appreciate your comment, thank you so much for sharing this information. It will definitely be useful for someone.
@kasichennupati8257 1 year ago ⁺¹
Hey,
it's a great video. I was able to follow the whole video, and it is explained very well.
A small correction to the Ubuntu kernel 5.18 update section:
the command to install all the .deb packages is
sudo dpkg -i *.deb
Also, after installing CUDA
you need to add its path to .bashrc:
cd /home/$USER/
nano .bashrc
Add the lines below:
export PATH="/usr/local/cuda-11.7/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.7/lib64:$LD_LIBRARY_PATH"
Now nvcc --version will show up.
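The same .bashrc change can be scripted so that running it twice does not add duplicate lines. This is a sketch only: the function name `add_cuda_paths` is hypothetical, and it assumes the toolkit lives under /usr/local/cuda-<version>, as in the comment above.

```shell
# Append CUDA PATH exports to an rc file once; $1 = rc file, $2 = CUDA version.
add_cuda_paths() {
  rc_file=$1
  cuda_home="/usr/local/cuda-$2"
  # Skip if this CUDA bin directory is already mentioned in the file.
  if grep -qF "$cuda_home/bin" "$rc_file" 2>/dev/null; then
    return 0
  fi
  {
    echo "export PATH=\"$cuda_home/bin:\$PATH\""
    echo "export LD_LIBRARY_PATH=\"$cuda_home/lib64:\$LD_LIBRARY_PATH\""
  } >> "$rc_file"
}

# Usage (then open a new terminal or run `source ~/.bashrc`):
#   add_cuda_paths "$HOME/.bashrc" 11.7
```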
@650AILab 1 year ago
I am glad you enjoyed it and found it useful. Thanks for the comment.
@dmaxdsbabo 20 days ago
I thought the CUDA toolkit downloads the driver automatically?
@MK-iz1gc 2 years ago ⁺³
When I check whether the GPU is accessible, I get 10 answers saying it is not, and a last one saying the GPU is fine. But it works, and my network is training on the GPU.
@650AILab 2 years ago
It all depends on the Python library you are using and whether that library has access to the GPU. This video covers Torch and TensorFlow support with GPU, and my latest video shows jax/jaxlib support with GPU.
@sir_no_name1478 1 year ago ⁺¹
Hey there, I want to know if I need to update the kernel in order to get everything working, because the LTS supports only up to 5.17 and I worry that I will break something. Also, if I do need to update the kernel, does it make sense to update to 5.19, since that is what Ubuntu 22.10 now uses?
@650AILab 1 year ago ⁺¹
Thanks for the comment, and for asking the question.
I would not jump to 22.10, or upgrade the kernel to 5.19, unless there is a definite need.
As of now 5.18 is a very stable kernel with 22.04, which is LTS and is what I am running on my machine, so I do not need to upgrade either the kernel or the Ubuntu release.
@krzysztofdymanowski8759 1 year ago ⁺¹
Hi, two questions:
1. Is the 5.18 kernel necessary?
2. When I try to install the 5.18 kernel it breaks my machine, probably because I have very new hardware in my rig. Can I install an earlier kernel (say 5.15), then just keep going with the installation, and everything will work fine?
@650AILab 1 year ago ⁺¹
Yes, the 5.15 kernel will work exactly the same. I had everything working with 5.15 first, later upgraded to 5.18 and applied all my steps, and there were no issues with either kernel. All the very best.
@play_longterm 4 months ago
@650AILab Thank you very much for the helpful video and your work! I have one question, though. I am afraid that a system update will break everything due to conflicts. You mentioned that the workstation worked for you with kernel version 5.15 and that you then upgraded to 5.18. After that, did you purge the CUDA toolkit and cuDNN and reinstall them?
@oscarllerena2980 1 year ago
Hi, thanks for the tutorial. I have some questions:
1. How did you decide to do the whole process on kernel 5.18? Will it be the same for 5.19?
2. I have an Nvidia GTX 3050, but when looking for the driver I see two options, one with Ti and one without. Does "Ti" stand for Titan?
@lollol-bh5uw 1 year ago
Use the Ti version; the one without Ti most likely won't work.
@oscarllerena2980 1 year ago
@lollol-bh5uw Thanks. Do you know why that is?
@IanWoolf 1 year ago ⁺¹
Every time I reboot after the Nvidia driver installation I get an "oh no something went wrong" screen. I tried to follow your directions, but dpkg won't install linux-modules because the kernel isn't installed, and the kernel won't install because linux-modules isn't installed.
@650AILab 1 year ago
Thanks for your comment, appreciate it.
Please start Ubuntu in recovery mode with networking first and then install the packages directly from the command line.
@IanWoolf 1 year ago
@650AILab Thanks for your quick reply! Unfortunately I can’t boot into recovery; it also goes to “oh no something went wrong”. I’m beginning to think reinstalling is my only way out.
@nettyyyys 1 year ago ⁺¹
Hello, at the beginning everything installed but nothing was shown. After restarting the terminal, nvidia-smi showed CUDA but nvcc did not. I solved that as follows:
Check whether nvcc is in your PATH with "whereis nvcc". If it returns only "nvcc:", you need to add the two lines below to ".bashrc".
The ".bashrc" file path is usually like "/home/username/.bashrc". Add the two lines below (replace the CUDA version with your installed version):
export PATH="/usr/local/cuda-11.4/bin:$PATH"
export LD_LIBRARY_PATH="/usr/local/cuda-11.4/lib64:$LD_LIBRARY_PATH"
Then save and close the file.
Check "nvcc --version".
Hope that helps someone. I used it because nvidia-smi showed CUDA but nvcc --version did not.
@650AILab 1 year ago
Perfect, thanks for sharing your tip. Appreciate your comment.
When you use the "whereis" command, it looks for the binary in the path(s), and if the binary is not in the path, it will not be found. So if you know you have the binary, it's best to add its directory to the path to make it accessible to the OS and all other tools.
@oscarllerena2980 1 year ago
Sorry for asking... how do you get that information output in the terminal on the left, with the Ubuntu logo and all the useful information?
@oscarllerena2980 1 year ago
screenfetch
@JWAM 2 years ago ⁺¹
Which version of PyTorch did you install / build, and how? That is the whole issue I'm having with 22.04. The official PyTorch releases support CUDA 11.3 and 11.6 (my Ubuntu has 11.5 and can be updated to 11.7... What are the odds..?).
@650AILab 2 years ago
I do not use Python directly; instead I primarily use Conda to create the Python environment. With Conda you can run "conda install -c pytorch pytorch", and this will install PyTorch (pytorch/1.12.0/py3.9_cuda11.3_cudnn8.3.2_0/pytorch) into your Conda-based Python 3.9 environment on Ubuntu 22.04.
@JWAM 2 years ago ⁺¹
@650AILab Yes, I was (am) using conda environments as well, but still had issues, so I thought that the CUDA version (cuda-toolkit) still needed to match the CUDA version of the system install (11.5 in my case). So I tried a bunch of things, all failing. But while going down to PyTorch 1.11, I realized that something wasn't quite right with my NVIDIA packages. I reinstalled those, and at least PyTorch 1.11 with a non-11.5 CUDA (11.3, I think) started working. Maybe PyTorch 1.12 will too, with whatever CUDA toolkit version it is packaged with (assuming that the display drivers / card support that particular version).
@650AILab 2 years ago
@JWAM Please try updating Cuda/Conda/Python from scratch, as it worked for me. Hoping for positive results.
@sirkrisvandela9359 1 year ago
@JWAM Did you find a solution?
@sirkrisvandela9359 1 year ago
Found the solution: what did the trick for me was to first call conda install -c nvidia/label/cuda-11.7.0 cuda-toolkit and only then install pytorch (without cudatoolkit).
@sirkrisvandela9359 1 year ago ⁺¹
What did the trick for me was to first call *conda install -c nvidia/label/cuda-11.7.0 cuda-toolkit* and only then install pytorch.
@650AILab 1 year ago
Glad you got it working.
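Put together, the ordering reported in this thread could be sketched as the command sequence below. The environment name `dl39` and Python 3.9 come from the video, and the two `conda install` lines are the ones quoted in the comments; the helper `conda_setup_commands` is made up and only prints the commands, so nothing is installed until they are reviewed and run. Channel labels and resulting package versions may have changed since the thread was written.

```shell
# Print the conda commands in the order that worked for the commenter.
conda_setup_commands() {
  env_name=${1:-dl39}
  echo "conda create -y -n $env_name python=3.9"
  echo "conda install -y -n $env_name -c nvidia/label/cuda-11.7.0 cuda-toolkit"
  echo "conda install -y -n $env_name -c pytorch pytorch"
}

conda_setup_commands dl39
# After reviewing the output, run it with: conda_setup_commands dl39 | sh
```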
@yousefunfiltered 1 year ago ⁺¹
The prompt you got from Python after downloading TensorFlow clearly shows that it isn't using the GPU.
@yousefunfiltered 1 year ago
Same with Torch.
@650AILab 1 year ago
I am not sure I understood you correctly... Thanks for the comments, appreciate it.
@der_fabs 2 years ago
Hey there, thanks so much for your tutorial! When I try to install the CUDA package, I get this error: The public CUDA GPG key does not appear to be installed.
To install the key, run this command:
sudo cp /var/cuda-repo-ubuntu2204-11-7-local/cuda-46B62B5F-keyring.gpg /usr/share/keyrings/
When I then try to run the command, nothing happens. Any idea what it could be? Help would be much appreciated!
@650AILab 2 years ago
When you run the command there will be no output; if you then run the next command, it will work as expected. I am sure you are doing it correctly.
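Because `cp` prints nothing on success, one way to gain confidence is to check that the key file actually landed in the keyrings directory. A minimal sketch; the helper `cuda_keyring_present` is a hypothetical name, and the file pattern is taken from the path in the error message above.

```shell
# Succeed if a CUDA keyring file exists in the given directory.
cuda_keyring_present() {
  dir=${1:-/usr/share/keyrings}
  for key in "$dir"/cuda-*-keyring.gpg; do
    [ -e "$key" ] && return 0
  done
  return 1
}

if cuda_keyring_present; then
  echo "CUDA keyring is installed; continue with: sudo apt-get update"
else
  echo "CUDA keyring not found in /usr/share/keyrings"
fi
```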
