Nvidia Tesla P100 vGPU Cloud Gaming Performance!
- Published Apr 3, 2024
- Thanks to Maximum Settings Cloud Gaming for sponsoring today's video. Get started with your very own Cloud Gaming VM or Bare Metal Machine at bit.ly/3wxfUuB!
Grab yourself a set of BEER Floppy Disk Coasters at craftcomputing.store
I've tested what feels like dozens of Enterprise GPUs inside my Cloud Gaming Server, but Nvidia's Pascal series has been prohibitively expensive until very recently. The Tesla P100 has fallen from $1000 to just $165 in the last few months, which means it's time to see what it can do.
But first... What am I drinking???
Block 15 (Corvallis, OR) Charmed Life Irish Red Ale (5.0%)
Proxmox vGPU Installation Script: github.com/wvthoog/proxmox-vg...
PolloLoco's vGPU Installation Guide: gitlab.com/polloloco/vgpu-pro...
Manual vGPU Install Tutorial: • Proxmox GPU Virtualiza...
Links to items below may be affiliate links for which I may be compensated
Grab an Nvidia P100 16GB on eBay: ebay.us/OYEgUn
- Cloud Gaming Server -
• My Most EPYC Server Bu...
AMD Epyc 7742: amzn.to/3hYoKIe
AsRock RACK ROMED8-2T Motherboard (7002/7003 Support): amzn.to/3xdxoal
256GB (8x32GB) DDR4 ECC-REG 2666: amzn.to/2TxxVpA
be quiet! Dark Power 12 1500W 80+ Titanium: amzn.to/4aJQVnM
Asus Hyper m.2 x16 V2: amzn.to/3xdxtLb
Noctua NH-U9 TR4-SP3: amzn.to/3eRAhHt
InWin R400N 4U Server Chassis: amzn.to/3BFYUjQ
Follow me on Mastodon @Craftcomputing@hostux.social
Support me on Patreon and get access to my exclusive Discord server. Chat with myself and the other hosts on Talking Heads all week long.
/ craftcomputing - Science & Technology
"not everyone is crazy enough to have a server rack in their garage" yeah i got mine in my bedroom LMAO
yeah got mine in the attic
bad_dragon...I'm freaking dead bro 😂
not everyone is deaf and can do that
lol same 😂
Mine is in my room too!
You've been able to get a P100 for only $100 since at least last October if you "Make an Offer".
Not all sellers will accept it, but a few will. I bought 4 of them at $100 last year.
That's very good to know, as I might be picking a couple more of these up shortly...
@CraftComputing Well, maybe after this video that won't be a thing anymore 😢
@CraftComputing Every time a medium-large YouTuber makes a video, prices spike. I doubt they'll remain that accessible for too long now you've published this vid. :P
I got the 12GB for 90, d'oh
Make sure you get the PCIe version and not the SXM2, unless you have an adapter or a server with SXM2 sockets. The SXM2 versions are cheap because of this.
Would be great to see a P4 v P40 v P100 head to head. Having a blend of Cloud Gaming and Ollama performance would be interesting for those looking for a homelab/homegamer/AI tinkerer all-rounder too 👍
P40 just arrived :-) In a couple weeks, I'm going to be testing out every GPU I have on hand for performance and power draw. Stay tuned...
@CraftComputing I've heard that the P100 doesn't have H.265 support, and only includes an H.264 encoder. If that is the case, then theoretically the P40 should look a lot better with Sunshine and Moonlight. Can you test this out and possibly confirm in your next video? This info will make or break which card I end up getting.
@CraftComputing We look forward to it - thank you for the fun content, it's always an interesting watch.
Yeah, comparison in dimensions of gaming, general workstation tasks and LLMs would be really awesome
What is cloud gaming
One thing to mention: the latest supported vGPU driver for the P100 (and the other Pascal GPUs, P4/P40) is version 16.4. They dropped support in the latest 17.0/17.1.
7:57 Oops, you compared Time Spy on the P100 VM to Time Spy *Extreme* on the Radeon 7600. The 7600 gets ~11,000 in Time Spy, or about twice the single-VM score shown.
The cloud gaming aspect initially got my attention, but I think a lot of us are going to be more curious how they perform running Ollama at home. Looking forward to more of this series.
Not all that great, considering they lack the Tensor cores that appeared with Volta and Turing, which are kind of the reason there's not a lot of support for Pascal and older GPUs.
Any videos with proxmox and gpus i love to watch! Keep them coming!
We are still patiently awaiting the Hyper-V homelab video you mentioned on talking heads! love the content
This is really really exciting. Thank you for never giving up on this project. This is the exact card I am considering for ML/AI to run in my R720xd.
I have a Tesla P4 in my 720xd
Best series! Love to watch such kind of tinkering!
Great video, always love to see the enterprise hardware for home server use. What are you using for licensing servers for these cards? Are you just using the 90 day trial from nvidia, or are you using some type of fastapi-dls server?
I love your vGPU content Jeff!
You actually inspired me to build a small homelab with a P100, though the CPUs on my server are somewhat older than I would like (a pair of E5-2690 v4 CPUs)
Love the vGPU cloud gaming content 👍
All P100 eBay listings went up $50 since you posted this video hahaha
These are pretty dang impressive. This might be the coolest/most approachable card you’ve shown so far!
We (my company) owns 5 Dell PowerEdge 740s with two P100-16G-HBM each, but we used VMware vSphere as our hypervisor.
Going on 4+ years, they continue to be excellent and reliable cards - still in active service today for VDI.
With Dell Premier Enterprise pricing, we got them at considerably less than MSRP. It's the ongoing support and maintenance paid periodically to Dell, VMware and NVidia that's the killer.
Pro tip: it's important that you line up the driver versions from the hypervisor down to your guests. That is, the driver version on your guest must be supported by the driver running in the hypervisor. 😅
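That host-to-guest lineup can be expressed as a quick check - a hypothetical sketch comparing the release branch (major version) reported by nvidia-smi on host and guest; the version strings below are made up for illustration:

```python
def same_branch(host_ver: str, guest_ver: str) -> bool:
    """A vGPU guest driver must come from the same release branch
    (the major version, e.g. 535.x) as the host/hypervisor driver."""
    return host_ver.split(".")[0] == guest_ver.split(".")[0]

# Hypothetical versions as reported by nvidia-smi on the host and in the VM:
print(same_branch("535.161.05", "535.161.08"))  # True  - both 535 branch
print(same_branch("535.161.05", "550.54.10"))   # False - guest driver too new
```

In practice Nvidia publishes exact host/guest driver pairings for each vGPU release, so a matching branch is necessary but not always sufficient.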
Great 💪🏽 thinking about an upgrade from the P4 to 10/100
I absolutely need those BEER coasters. Also, I was very lucky to go to Block 15 during the last solar eclipse and had Hypnosis a cognac barrel aged barleywine. Their Nebula and Super Nebula stouts are way more common and still delicious though.
I've seen the sponsor spot for the cloud gaming machine from Maximum Settings on this channel before, but it's kinda ironic that you are setting up your own gaming server afterwards.
I've been using my Tesla P4 with an Erying 12800H ITX board as a home server for almost a year now, and I absolutely love it. I have a Win10 vGPU VM running on it, primarily used by my girlfriend, but it's also great when friends come over for a quick LAN session. I was really disappointed when I found out a few weeks ago that NVIDIA dropped Pascal support from the GRID driver.
Just one of those could power an Unreal Tournament 2004 LAN party... Of note, there is also a 12GB P100 -- they don't perform terribly, either.
Wow, this is great info. I just finished my Epyc Genoa build and was looking for a proper way to get graphically performant VMs, amazing 👏. Does this also work for Linux VMs?
Jeff, how did you cool the P100? I've seen your video on cooling the Tesla GPUs, but which option did you use in this video?
Getting one this week. I hope
Look forward to the next video that compares full single 11:56 performance
The PCIe power connector can usually deliver 200+ watts thanks to over-spec'd cables, but the standard only requires 150.
As a fellow Oregonian, how do you mitigate the humidity in your garage? Or is it not bad enough to affect the rack?
Awesome, just got a Tesla M60 today!
I can't wait for SR-IOV to be available on Intel Arc! This would open up a more modern and potentially even cost-effective approach to cloud gaming and VDI solutions. Unfortunately, Arc SR-IOV support is currently only available in the out-of-tree driver.
Just looked up the spec sheet for that P100 and saw that the 16GB memory is what they called “Chip-on-Wafer-on-Substrate.” Very cool.
Good to know all these Liquidations we've been doing for the P100s at the datacenter are going to good use! I've boxed up thousands of these bad boys and off they go
That ultra fast memory makes it interesting for LLMs, could you try that? It's only 16GiB, but really fast at that and cheap so might be a solution for some!
While you are working on those benchmarks, I'd love to see something done with LLMs - say a quick test or two using Mistral and TinyLlama on each card?
Have you done anything with the Tesla P40? I was wondering how different performance is between the P40 and the P100.
I want to build a Clown Gaming Server, I should be able to get a heck of a lot of Clowns in a Small Form Factor case.
I'm impressed by what is possible today. I was an administrator at a community school in Germany 25 years ago. Time runs so fast.
Informative Video. Subscribed!
I wonder if it would be worth it to use this to upgrade an aging PC... most definitely gonna be hard powering it... how did you solve power delivery, in fact?
Thanks for the video! Can this GPU work with ESXi?
I’m using an Intel ARC 750 for mine. Works really well.
So... are there any specific games that benefit from increased memory bandwidth?
Two questions:
1) Are you able to play Halo Infinite with this setup?
2) What client are you using to connect to your system remotely? I am asking because I tried Parsec, and even with an actual, real monitor connected to my 3090, it still stuttered a lot.
Thank you.
Love to see this budget rack content!
Can this GPU be used in a standard gaming PC, or only a server rack? It seems for the price it could be a powerful addition to any gaming rig.
For gaming I recommend the Tesla P40: it has GDDR5X RAM but a better GPU frequency, and with 24 GB it's good for AI too.
Time Spy (Standard) vs Time Spy (Extreme) results? I suspect you are closer to a standard run on a Ryzen 3600 + GTX 1060 at 4,693 (Graphics Score 4,464 / CPU Score 6,625), but at that point I am splitting hairs. Your result is 2x playable gaming experiences on a single $150-180 enterprise GPU WITH a nice automated script for setup. This is a nice alternative to the P4, especially if the user has only one x16 slot.
Can you do a video on modifying and transplanting coolers from (preferably broken/e-waste) Nvidia GTX cards onto the Tesla P40/P100, so we don't need to use the janky, noisy 3D-printed coolers? For those of us with servers we need to spend a lot of time close to.
Really hope unraid gets proper vGPU support. I have a p100 but since there is no easy way to get vGPU working, I can only use it for stuff like transcode
Question! Are there any modern non-Nvidia options for doing 'vGPU' (shared PCIe graphics card)? AMD? Does Intel allow SR-IOV on their stuff for sharing them? Just asking!
The P100s and P40s are very commonplace from China, and inexpensive. They are both decent for running large language models as well. But it's still Geforce 10 era performance, so don't expect wonders.
I have a P100 in my old HPE server, but I've stopped using it because the P100 doesn't have p-states, meaning it can't do any power-saving mode. Now in practice when no load is applied this means the GPU idles at 30-40W which still isn't awful, but when you compare it with other GPUs even from the same generation (such as the P40) which can idle at 5W-8W it's quite the difference (I live in Germany and electricity costs actual money here).
That's on top of my servers' already high idle power. My EPYC GPU Server ***idles*** at 200W without any GPUs installed, so that's a thing..
Can a standard CPU be usable for "cloud" gaming? I have my old 2600x and was wondering if I can get a m40 or similar to pair with it.
How would the P100 perform doing AI video upscaling (such as Video 2X)? I've got several DVD rips (even a couple VHS 'rips') I'd like to try to AI upscale for my Plex server; so if I can throw one of these in my Proxmox box (an HPe DL360p Gen8 SFF) and setup a batch to churn through them without hogging up my gaming machine, that would be nice.
Does this procedure continue to work with the PVE 8.2 (Kernel 6.8) update? I have some doubt... it seems people are getting errors compiling some drivers (vgpu_unlock)?
I never have understood the preference for the P100 over the P40. My only assumption is the higher memory bandwidth of the P100 is beneficial for AI workloads.
Hi there, I have been thinking about setting up the system like this myself. Do you by any chance know if there is some weird Nvidia licensing thing, or could these VMs run Linux?
I was lucky about 2 years ago: snagged a Tesla M40 24GB and a Tesla P100 for around 90€ for both; the seller had them as unknown condition. Got both of them running just fine - the M40 with a modified registry to see it as a high-performance GPU, the P100 with a bit more driver fuckery to get it to work.
Have been thinking of flashing the P100 to a Quadro GP100 to see if I can use it like that with just a reg edit as well, but no luck so far.
Has anyone else encountered the issue where, when Proxmox is installed on a RAID, you can't enable IOMMU within GRUB?
Any chance we can see an updated vGPU tutorial? I'm using the proxmox-vgpu-v3 script, but cannot seem to get the P4 (or any GPU) working with Plex HW transcoding in a Proxmox Ubuntu VM.
I can see them in nvidia-smi in the VM, but HW transcoding keeps failing.
And the old guide just spews out errors, etc., as most guides on YT are 3-4 years old already. I just want to set up a fileserver > Plex, and then optionally a Windows gaming VM or Home Assistant.
Just wondering if you could "expand" on the use case testing when you build out the Master List? For example, I have a GPU stuffed in a system that I'm going to be using as an encoder rig for H.264/H.265 encoding (because NV charges too much for AV1 right now), and I'm wondering how that would affect the GPU performance? Or if I were executing LLM testing via Ollama while someone was running a game...
When it comes to testing Ollama, you need enough VRAM to hold the whole model at once; otherwise some model layers are delegated to the CPU, which hinders performance considerably.
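As a rough back-of-the-envelope check (a sketch, not Ollama's actual accounting - the overhead figure is a guess), you can estimate whether a quantized model fits entirely in VRAM:

```python
def fits_in_vram(params_billion: float, bytes_per_param: float,
                 vram_gb: float, overhead_gb: float = 1.5) -> bool:
    """Estimate whether all model layers fit on the GPU.
    bytes_per_param: ~0.5 for 4-bit quantization, 2.0 for FP16."""
    needed_gb = params_billion * bytes_per_param + overhead_gb
    return needed_gb <= vram_gb

# 7B model at 4-bit on a 16 GB P100: fits, all layers stay on the GPU
print(fits_in_vram(7, 0.5, 16))   # True
# 34B model at 4-bit: spills layers to CPU, performance tanks
print(fits_in_vram(34, 0.5, 16))  # False
```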
Yeah, I fired it up against a 32 core Epyc server.... it was not pretty.... Would be interesting to see how the GPU balancing handles the RAM juggling for that kind of load when split with other non-LLM functions....
3:10 Both GP100 and GP102 have 3840 cores. However, GP100 has only ever been sold with 3584 cores active, while you can buy versions of GP102 with the full die, like the P10, P40, P6000 and Titan Xp (but not the Titan X (Pascal), which has 3584, just like the 1080 Ti).
GP100 is a very different microarchitecture despite the same "Pascal" name. It has a 1:2 FP64:FP32 ratio, in contrast to all other Pascal GPUs which have 1:32.
FP64 is only relevant for certain scientific workloads. Today there are only very few GPUs that can still do FP64 at a 1:2 ratio, and most are super expensive: P100, V100, A100, H100, B100, MI50, MI60, Radeon VII (Pro), MI100, MI210.
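That 1:2 ratio can be sanity-checked with peak-throughput arithmetic - a sketch using publicly listed core counts and boost clocks (figures approximate):

```python
def peak_fp64_tflops(cuda_cores: int, boost_mhz: int, fp64_ratio: float) -> float:
    """Peak FP64 = cores * 2 FLOPs/cycle (FMA) * clock * FP64:FP32 ratio."""
    return cuda_cores * 2 * boost_mhz * 1e6 * fp64_ratio / 1e12

print(round(peak_fp64_tflops(3584, 1303, 1/2), 1))    # Tesla P100 PCIe (1:2): ~4.7
print(round(peak_fp64_tflops(16384, 2520, 1/64), 1))  # RTX 4090 (1:64): ~1.3
```

So on paper, the seven-year-old P100 still has several times the FP64 throughput of an RTX 4090.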
I have 2 of them. I was thinking of getting 2 more, a second CPU and a new motherboard. If you want to run a localhost LLM, 2 of them work really freaking well. When I was running a local LLM the response time was insane - I'm talking within 1 second before it generated a response. Yeah, I have 256GB of RAM, so that helps.
I picked up a cheap Tesla P4 a while back to play with headless gaming. That never worked out too well as streaming seemed to cut too deeply into the tiny 75W max TDP. Instead I have it in an old Haswell Xeon with Windows 11 and with the clocks unlocked it's fantastic. I'm getting well over 60FPS in everything I play, often 90+ at high/med settings. I'd have gotten a P100, but after trying to cool an M40 and failing a couple years ago I decided to keep it simple. Take the shroud off the P4, strap on a fan and you're done.
I would love to see how it compares with a p40
I have a similar setup but with a P4, I can't get the resolution to go above 1360x768. I have tried reinstalling the video driver, and I have reinstalled the virtual display driver and its still giving me trouble. I was thinking about doing a fresh install of Windows but if anyone wants to give their thoughts I would be happy to listen.
Anyone know how this card would do with transcoding for Plex? 4K down to 1080p
How did you get around the degradation that occurs after 20 minutes of use without a purchased license?
Have you done VR gaming with your cloud gaming setup?
But can we use those to run modern PyTorch for NN training?
This is cool and I have a question. How does this differ in practice from using two dedicated GPUs?
For example, I have a Proxmox machine with two A380 cards, each dedicated to an auto-starting Windows 10 VM. Yes, two Windows PCs in one SFF case.
Each VM has to have its own hardware dedicated to it. Yes, another VM could use the same hardware, but the first one would have to close first.
In the P100 setup do you have to dedicate a particular slice of the P100 to the particular VM like I'm doing with my two PCIe GPUs? Following on from that would it be possible to have a number of non-running VMs that could be started in any order?
You are talking about vGPU. These video accelerators can be split up evenly, meaning you can allocate resources to VMs in multiples of two, and each VM will think it has its own graphics card.
Meaning with 16GB total memory, split once it would be two 8GB "video cards", or four 4GB cards, etc.
The VRAM will be reserved for each VM, but the host will allocate compute resources to each VM on demand. Meaning, if you only have one VM running, it will see slightly lower than 100% processing utilization. If you have a second VM running, it does not split an even 50/50 unless both VMs are asking for 100% usage.
Meaning one could just be playing a YouTube video, for example, and won't need more than 10%, while the other could be playing a game using 90%.
I encourage you to watch Jeff's other videos.
@lilsammywasapunkrock Yes, I comprehend what you've said, but I'm still confused by the splitting up. For instance, I had a Proxmox box with an HD 5950; this card was actually two GPUs. I split that up with one VM dedicated to the first GPU and the other to the second GPU. Never the twain shall meet.
If for instance I had another VM that could use GPU1 I could only run that when GPU1 was not being used by the first mentioned VM.
So with this P100, is one slice of it dedicated to the particular VM I set it up with? Or will it pick what it needs dynamically, like it does with system RAM? When setting up a VM I don't pass through a particular RAM chip.
@wayland7150 The VRAM is not dynamically allocated; it is predetermined. But the actual CUDA cores running the workload will work in parallel across the VMs, as long as the VRAM is sufficient.
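A minimal sketch of the arithmetic behind that static partitioning (the "-4Q" profile name follows Nvidia's naming convention, but the exact profiles offered depend on the card and driver):

```python
def max_vms(total_vram_gb: int, profile_gb: int) -> int:
    """Framebuffer is statically partitioned: each VM gets a fixed-size slice
    when it starts, while GPU compute is time-sliced on demand among VMs."""
    return total_vram_gb // profile_gb

# Tesla P100 16GB with a 4 GB profile (e.g. "GRID P100-4Q"):
print(max_vms(16, 4))  # 4 VMs, each seeing its own 4 GB "card"
print(max_vms(16, 8))  # 2 VMs at 8 GB each
```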
It would be great if you could do a chart with bare metal score in the game then 1 virtual with the game then 2 virtual one game one heaven benchmark. That would let us know what we're in for in all scenarios.
Remembering I have one of these in my dormant Proxmox box. Time to fire it up.
Ignore me, I forgot GP100 was not just a larger Pascal die with everything enlarged. Though it is 610 mm² like a normal Titan/80 Ti die, it only has the same number of shaders as the 471 mm² Titan Xp.
That's the card that would have been the Titan Xp, and possibly the 1080 Ti, had AMD been able to compete with Pascal.
Instead, AMD's 500+ mm² die with HBM-based memory couldn't compete with the ~300 mm², 60-class GTX 1080 with GDDR-based memory, though it did have a 30% wider bus than a normal 60-class card.
Can you split the 16GB video memory into 12GB/4GB VM’s?
Amazon Luna uses Tesla cards for their cloud gaming service. I believe it's the T4, but I can't remember which one exactly; I can double-check. The way you find out is by doing a benchmark in some games - it will list the specs. They use crap Xeon CPUs though.
I guess I leave this comment as a heads up that PVE 8.2 (Kernel 6.8) breaks nvidia driver compile, and I'll add that I'd love to see a video about how to fix it.
has there been any word on how to work around this?
I believe the Tesla P40 is actually the same silicon as the Titan Xp, which I'd expect to be somewhat faster than the P100 for gaming. To the best of my knowledge, the P100's superior memory performance is really only meaningful for things like AI inferencing workloads. Either way, both cards are around the same price these days, which is pretty cool. Edit: though perhaps the extra bandwidth is beneficial for cloud gaming; I don't have any experience there.
Yes, the P40 and Titan Xp are identical, except for the doubled 24GB capacity on the P40.
The P100 is a very different microarchitecture from the rest of Pascal: it supports FP64 with a 1:2 ratio. All other Pascal cards only do 1:32 ratio.
Tbh this mostly verifies that having a dedicated GPU per VM might still be the best option if they are all going to be gaming. Sure this is a much more cost effective option than 4 1080s or whatever, but move up on the scale having 4 RTX 4090's is much more cost effective than an L40/RTX 6000 Ada (with the sweetspot being somewhere along the way) and you gain a lot of performance. I do wish it was easier to have a GPU serve both a VM and your containers at the same time on modern consumer GPUs.
Does anyone have any idea of how well the P40 / P100 would play Star Citizen? I know that's a bit of a complicated question, but wondering if anyone that has one has tried.
o lord he drinks beer again!!
How would a V100 compare with a P100 or a P40?
I was on this all day and I still have no output when I do mdevctl types... I don't want to give up yet. I'm very new to Proxmox, and I am doing it on a fresh install: Intel i7 7700K and an RTX 2060 Super, which is supposed to be compatible. I run the script, all goes well, but no mdev.
Which PDXLAN is that shirt from?
crazy cool chit maynge!
8:32 for two of them, Cryses was right there =^.^=
The GP100 chip was almost only used in the P100, but there is also the Quadro GP100 card.
You forgot about video rendering in Jellyfin, Plex and Emby...
The EPS connector is so much better at delivering power that the RTX A6000 only needs a single connector, instead of multiple 8-pin connectors or that abomination that is the 12VHPWR connector. And its TDP is only 100W less than the 4090's.
Looks over at the P4... You still up to this challenge?
I love my Nvidia p40!
Would an AMD MI25 be just as good? They seem pretty similar besides the memory bus width.
I picked one up for $60 off of eBay and haven't decided what I want to do with it lol
I have a couple of MI25s. Unfortunately AMD has never publicly released their MxGPU drivers for them.
I'd like to see Jeff get a hold of some of those Intel flex GPUs.
Me too fam. Me too.
Any recommendations when you don't have extra EPS power connectors but plenty of pcie connectors?
You can get adapters for 2x 8-pin PCIe to 8-pin EPS. The current rating is the same (300W) for them.
2:19 Well, the pcie 8 Pin is RATED for 150 Watts, but I think it's widely accepted that it is in fact much more capable than that.
can the 3090 be patched for vgpu use?
Got 3 of them. How do you get these drivers - are they paid? Do you really need a Windows VM for games, or would a bare-metal headless Linux server do?
The script, uh, *ahem* installs the drivers for you.
Have you seen that 12GB Tesla M40s are like $50 on eBay?
Yes I have! Tesla M60s too! I'm going to be re-reviewing those cards shortly.
The P100 is a beast in FP64 compute, it smokes all of the newer cards there, 3.7x faster than even an RTX 4090.
P100 is a very different microarchitecture from the rest of Pascal cards, with 1:2 FP64:FP32 ratio. Today this is only seen in A100/H100 data-center cards which cost upwards of $10k. FP64 is required for certain scientific workloads like orbit calculation for satellites.
How do you power one of these? None of my cables seem to match the connector
You didn't watch the video or didn't pay close enough attention. It's EPS 12V, not 8-pin PCIe. Many data center GPUs use EPS 12V instead of 8-pin PCIe. Anyway, you need an 8-pin EPS 12V connector on your power supply, or an adapter that takes 2 PCIe connectors and feeds an EPS 12V connector. EPS 12V was used because it can deliver more power.
So when will NVIDIA release Grace Hopper vGPU? The upscaling alone will be worth it
Never. Hopper doesn't support graphics APIs.
Dream server