Quantization - Dmytro Dzhulgakov
- Date added: 28. 06. 2024
- It’s important to make efficient use of both server-side and on-device compute resources when developing ML applications. To support more efficient deployment on servers and edge devices, PyTorch 1.3 now supports 8-bit model quantization using the familiar eager mode Python API.
- Science & Technology
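
To make the description above concrete, here is a minimal sketch of the eager-mode post-training static quantization flow it refers to. The `SmallNet` module is made up for illustration; the API is the PyTorch 1.3-era `torch.quantization` one, and minor names may differ in later releases.

```python
# Minimal sketch of eager-mode post-training static quantization
# (PyTorch 1.3-era API; names may have shifted in newer releases).
import torch
import torch.nn as nn
import torch.quantization as tq

class SmallNet(nn.Module):
    """Toy float model; QuantStub/DeQuantStub mark the quantized region."""
    def __init__(self):
        super().__init__()
        self.quant = tq.QuantStub()
        self.conv = nn.Conv2d(3, 8, 3)
        self.relu = nn.ReLU()
        self.dequant = tq.DeQuantStub()

    def forward(self, x):
        x = self.quant(x)        # fp32 -> int8 on the way in
        x = self.relu(self.conv(x))
        return self.dequant(x)   # int8 -> fp32 on the way out

model = SmallNet().eval()
model.qconfig = tq.get_default_qconfig("fbgemm")  # server-side (x86) backend
tq.prepare(model, inplace=True)                   # insert observers

# Calibration: run a few representative batches so observers collect ranges.
with torch.no_grad():
    for _ in range(10):
        model(torch.randn(1, 3, 32, 32))

tq.convert(model, inplace=True)                   # swap in int8 kernels
```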
Excellent work. Very clear 👍🏼. I think Tesla needs help with Int8 Quantization. 😉
Thank you for the talk. It's good to see a focused video on the quantization efforts for PyTorch.
While I know this video is kind of old, I've been looking for a way to quantize GPT-2 XL for use on a GPU server (not mobile, mainly due to its size and computation requirements). I explain it in much better detail in this GitHub issue on huggingface's transformers repo: github.com/huggingface/transformers/issues/2466, but basically, when I try to save the models for later use, the file size gets bigger and performance gets worse (the text repeats a lot when it shouldn't, across a variety of different prompts).
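
(Not claiming this resolves the issue in the linked thread, but for context, here is a minimal sketch of how dynamic quantization plus saving and reloading is usually written. The file path is illustrative, the int8 kernels target CPU backends, and what actually gets converted depends on the module types passed to quantize_dynamic.)

```python
# Hedged sketch: dynamic quantization of a Hugging Face GPT-2 model,
# saving the state_dict, and rebuilding the quantized structure before loading.
import torch
from transformers import GPT2LMHeadModel  # assumes huggingface transformers is installed

model = GPT2LMHeadModel.from_pretrained("gpt2-xl").eval()

# Only modules of the listed types are replaced with int8 dynamic kernels;
# GPT-2 uses the transformers Conv1D layer for most projections, so this
# spec is purely illustrative of the API rather than an optimal recipe.
quantized = torch.quantization.quantize_dynamic(
    model, {torch.nn.Linear}, dtype=torch.qint8
)
torch.save(quantized.state_dict(), "gpt2-xl-int8.pt")

# Later: apply the same quantization to a fresh model, then load the weights.
fresh = torch.quantization.quantize_dynamic(
    GPT2LMHeadModel.from_pretrained("gpt2-xl").eval(),
    {torch.nn.Linear}, dtype=torch.qint8
)
fresh.load_state_dict(torch.load("gpt2-xl-int8.pt"))
```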
Hello. For help, please join and post in the PyTorch Forums: discuss.pytorch.org
czcams.com/video/IPQmGzYuxmc/video.html - What does this mean? Folding batch norm computation into convolution?
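
For context: "folding batch norm into convolution" means absorbing the BN scale and shift into the preceding conv's weights and bias, so the two ops become a single conv at inference time. A hedged sketch of how this is typically invoked in eager mode (the toy Sequential below is just illustrative):

```python
# Folding BN into the preceding conv, which is what
# torch.quantization.fuse_modules does before quantization.
# For a conv output y = W*x + b followed by BN(gamma, beta, mean, var):
#   W' = W * gamma / sqrt(var + eps)
#   b' = (b - mean) * gamma / sqrt(var + eps) + beta
# so only a single conv op remains at inference time.
import torch
import torch.nn as nn

block = nn.Sequential(nn.Conv2d(3, 8, 3), nn.BatchNorm2d(8), nn.ReLU()).eval()

# In eager-mode quantization the modules to fuse are named explicitly;
# ["0", "1", "2"] are the submodule names inside the Sequential above.
fused = torch.quantization.fuse_modules(block, [["0", "1", "2"]])
```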
Fusing the ResNet50 model like that doesn't work.
What is the exact problem you're encountering? You can ask on the PyTorch forums (discuss.pytorch.org/) or create a GitHub issue.
Some minor details of the APIs may have changed since the talk was given, but generally it should work. Specifically, you can refer to the following:
- quantization tutorial (talks about MobileNetV2 instead of ResNet, but the idea is the same): pytorch.org/tutorials/advanced/static_quantization_tutorial.html
- specifically for ResNet, there are already quantized models in TorchVision: pytorch.org/blog/introduction-to-quantization-on-pytorch/#integration-in-torchvision
- ResNet50 specifically: github.com/pytorch/vision/blob/master/torchvision/models/quantization/resnet.py#L151
- tutorial for using them: pytorch.org/tutorials/intermediate/quantized_transfer_learning_tutorial.html
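
As a rough illustration of the last two links, loading the pre-quantized ResNet50 from TorchVision typically looks like this (argument names reflect the releases around the time of this talk and may have changed in newer versions):

```python
# Hedged sketch: using the pre-quantized ResNet50 shipped with torchvision.
import torch
from torchvision.models.quantization import resnet50

model = resnet50(pretrained=True, quantize=True).eval()  # int8 weights, CPU backend
with torch.no_grad():
    out = model(torch.randn(1, 3, 224, 224))
print(out.shape)  # torch.Size([1, 1000])
```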
Why can't most data scientists in talks like this speak English properly?