Inside TensorFlow: Quantization aware training

  • Added 29 Jun 2024
  • In this episode of Inside TensorFlow, Software Engineer Pulkit Bhuwalka presents quantization aware training. Pulkit takes us through the fundamentals of quantization aware training, the TensorFlow/Keras API used to achieve it, and how it is implemented.
    Documentation → goo.gle/32MN60q
    Github → goo.gle/30YihDB
    Add the Inside TensorFlow playlist → goo.gle/Inside-TensorFlow
    Subscribe to the TensorFlow channel → goo.gle/TensorFlow
  • Science & Technology

Comments • 38

  • @autripat
    @autripat 3 years ago +4

    Hey all, at 14:26, are we missing the quantize_annotate_layer wrapper over the Conv2D layer (inside Sequential), like this:
    quantize_annotate_layer(tf.keras.layers.Conv2D(32, 5, input_shape=(28, 28, 1)))
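
    For reference, a minimal sketch of the annotate-then-apply pattern the comment describes (assuming a TF 2.x setup with `tensorflow-model-optimization` installed; the layer sizes are illustrative):

    ```python
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    annotate = tfmot.quantization.keras.quantize_annotate_layer

    # Annotate only the layers that should be quantized.
    annotated = tf.keras.Sequential([
        annotate(tf.keras.layers.Conv2D(32, 5, input_shape=(28, 28, 1))),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(10),
    ])

    # quantize_apply inserts fake-quant ops around the annotated layers only.
    qat_model = tfmot.quantization.keras.quantize_apply(annotated)
    ```

    With this pattern the Conv2D is trained quantization-aware while the rest of the model stays float.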

  • @foolmarks
    @foolmarks 3 years ago +5

    GitHub link doesn't work. Audio is terrible.

  • @athreyamurali1439
    @athreyamurali1439 3 years ago +8

    Hey can you re-upload with better audio, please?

    • @alias15vapour
      @alias15vapour 3 years ago

      Sorry about that. I recorded the audio locally so it would sound better, but forgot that AirPods audio compression over Bluetooth loses quality.

    • @athreyamurali1439
      @athreyamurali1439 3 years ago

      @@alias15vapour All good, it happens. The topic seems really interesting tho, so I'd really appreciate it if you could re-upload or re-record it sometime. Thanks!

    • @alias15vapour
      @alias15vapour 3 years ago +1

      @@athreyamurali1439 - Thanks. This takes a bunch of post-production work, so it's a bit unlikely tbh, but I (or someone else on the team) will definitely do this, and a better job, for the next version.

  • @sunnyguha2
    @sunnyguha2 3 years ago +12

    Get better microphone

  • @morekaccino
    @morekaccino 3 years ago +14

    I can't hear anything

  • @sanjoetv5748
    @sanjoetv5748 3 months ago

    I'm having a problem when I convert my .h5 model to TFLite: when I test the TFLite model in my mobile app, the accuracy is much lower than when I run the .h5 in Jupyter. Can quantization aware training help reduce the accuracy loss when I convert to TFLite afterwards?
    Please, someone help!
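
    Yes, that is exactly the problem QAT targets. A sketch of the usual fine-tune-then-convert flow (assuming a TF 2.x setup with `tensorflow-model-optimization` installed; the model below is a placeholder for your own .h5 model, and the `fit` call is commented out because it needs your data):

    ```python
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # Placeholder float model; load your own .h5 model here instead.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(16, activation="relu", input_shape=(8,)),
        tf.keras.layers.Dense(4),
    ])

    # Wrap the whole model so training emulates quantized inference.
    qat_model = tfmot.quantization.keras.quantize_model(model)
    qat_model.compile(optimizer="adam", loss="mse")
    # qat_model.fit(x_train, y_train, epochs=1)  # fine-tune on your data

    # Convert the fine-tuned model to a quantized TFLite flatbuffer.
    converter = tf.lite.TFLiteConverter.from_keras_model(qat_model)
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    tflite_bytes = converter.convert()
    ```

    Because the model already saw quantization noise during fine-tuning, the converted TFLite model usually loses far less accuracy than plain post-training quantization.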

  • @PremKumar-qi3cd
    @PremKumar-qi3cd 3 years ago

    When I try to post-training quantize (int8) a SimpleRNN model on time-series data, it throws an error saying only a single graph is supported. So do RNNs and LSTMs support quantization and conversion to TFLite models? And if yes, how can I address the error? Thanks in advance. :)

  • @Hav0c1000
    @Hav0c1000 3 years ago

    Hey Pulkit,
    Say I wanted to constrain quantization parameters to power-of-2 values. Would that be supported?

  • @raisaalphonse4094
    @raisaalphonse4094 3 years ago

    I'm using QAT on a Functional model only, but I'm getting a value error from quantize_model:
    ValueError: `to_quantize` can only either be a tf.keras Sequential or Functional model.
    I'm not really sure why I'm getting this error. Could anyone please help me out with this?
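
    For comparison, `quantize_model` does accept a Functional model built from supported built-in layers; that error typically means the object passed in was not a plain Sequential/Functional model (for example, a subclassed model). A minimal sketch that works (assuming `tensorflow-model-optimization` is installed):

    ```python
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    # A small Functional model from built-in layers only.
    inputs = tf.keras.Input(shape=(28, 28, 1))
    x = tf.keras.layers.Conv2D(8, 3, activation="relu")(inputs)
    x = tf.keras.layers.Flatten()(x)
    outputs = tf.keras.layers.Dense(10)(x)
    model = tf.keras.Model(inputs, outputs)

    # Succeeds because the model is a true Functional model, not subclassed.
    qat_model = tfmot.quantization.keras.quantize_model(model)
    ```

    If your model is subclassed, rebuilding it with the Functional API (or annotating individual layers) is the usual workaround.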

  • @shubhammane6357
    @shubhammane6357 1 year ago

    I tried QAT and as a result got an .h5 model with quantize wrapper layers. I want to remove them and get back my original model with the modified weights. How can I do that?

  • @ramamunireddyyanamala973

    Very good Sir

  • @anishdeepak1826
    @anishdeepak1826 2 years ago

    I have trained an ssd_mobilenet_v2 model using the Object Detection API and saved it as a .pb file. How do I apply quantization to my model? I don't have an .h5 file.

  • @rupeshmohanasundaram6718
    @rupeshmohanasundaram6718 3 months ago

    Does QAT work for object detection? If so, how?

  • @Lisa-hb3js
    @Lisa-hb3js 3 months ago

    I get this error whatever I do (the same if the network only contains Dense layers): ValueError: Unable to clone model. This generally happens if you used custom Keras layers or objects in your model. Please specify them via `quantize_scope` for your calls to `quantize_model` and `quantize_apply`. [Layer supplied to wrapper is not a supported layer type. Please ensure wrapped layer is a valid Keras layer.]

  • @sreeragm8366
    @sreeragm8366 3 years ago

    Are there any scenarios in which quantization shouldn't be done? For example, in case I want to convert the model to other formats that support optimization, such as TensorRT.

    • @alias15vapour
      @alias15vapour 3 years ago

      That depends on your needs. If you want to use TensorRT for optimization, that works fine as well. Quantization is useful if performance is a concern for you.

  • @lisali6120
    @lisali6120 3 years ago +1

    Thanks for sharing! Does it support mixed precision?

    • @bryanlozano8905
      @bryanlozano8905 3 years ago +1

      It should; he mentioned custom quantization for specific layers.

    • @alias15vapour
      @alias15vapour 3 years ago

      QAT emulates model execution in certain precisions so that model accuracy is preserved. If that's your goal, you can totally do it, as Bryan mentioned. But it's unlike mixed precision for training.

  • @nataliameira2283
    @nataliameira2283 3 years ago

    Documentation → goo.gle/2WMUZze ---> ERROR (Sorry, we couldn't find that page.)

  • @yoloswaggins2161
    @yoloswaggins2161 3 years ago +1

    Can this be used for tensor cores on Nvidia GPUs or is it only for embedded devices?

    • @alias15vapour
      @alias15vapour 3 years ago +1

      By default it supports the TFLite quantization spec. If you want to use it for Nvidia, you would have to write custom quantization configs specific to Nvidia. But it absolutely can be done.

    • @yoloswaggins2161
      @yoloswaggins2161 3 years ago +1

      @@alias15vapour Thanks for the answer. Would that mean writing CUDA kernels for this, or could you wrap it with something higher level like TensorRT?

    • @alias15vapour
      @alias15vapour 3 years ago +1

      @@yoloswaggins2161 You wouldn't need to write any kernels. You would just need to arrange the TF graph so that it emulates quantization on Nvidia chips; it would reuse their existing kernels.
      It's possible to use TensorRT, but you would need to know the deep internals of TensorRT to construct the graph correctly.

    • @yoloswaggins2161
      @yoloswaggins2161 3 years ago +1

      @@alias15vapour I see, thank you.
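
    A sketch of what such a custom config could look like: tfmot lets you subclass `QuantizeConfig` to control which tensors get quantized and with which quantizers. The 8-bit choices below are illustrative defaults, not an Nvidia-specific spec (assuming a TF 2.x setup with `tensorflow-model-optimization` installed):

    ```python
    import tensorflow as tf
    import tensorflow_model_optimization as tfmot

    quantizers = tfmot.quantization.keras.quantizers

    class CustomDenseConfig(tfmot.quantization.keras.QuantizeConfig):
        """Illustrative config for Dense layers: 8-bit symmetric weights,
        8-bit moving-average activations."""

        def get_weights_and_quantizers(self, layer):
            # Quantize the kernel with a per-tensor, symmetric 8-bit quantizer.
            return [(layer.kernel, quantizers.LastValueQuantizer(
                num_bits=8, symmetric=True, narrow_range=True, per_axis=False))]

        def get_activations_and_quantizers(self, layer):
            # Quantize the activation output with a moving-average range.
            return [(layer.activation, quantizers.MovingAverageQuantizer(
                num_bits=8, symmetric=False, narrow_range=False, per_axis=False))]

        def set_quantize_weights(self, layer, quantize_weights):
            layer.kernel = quantize_weights[0]

        def set_quantize_activations(self, layer, quantize_activations):
            layer.activation = quantize_activations[0]

        def get_output_quantizers(self, layer):
            return []  # no extra quantization on layer outputs

        def get_config(self):
            return {}
    ```

    You would attach it with `quantize_annotate_layer(layer, quantize_config=CustomDenseConfig())` and run `quantize_apply` inside `quantize_scope({'CustomDenseConfig': CustomDenseConfig})`, swapping in whatever quantizer parameters the target hardware expects.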

  • @travelsome
    @travelsome 3 years ago

    Waiting for a video for sequential modelling

    • @rushikeshgandhmal
      @rushikeshgandhmal 3 years ago

      Hey, how should I start learning deep learning? Could you give me some suggestions?

    • @gokulakrishnanm
      @gokulakrishnanm 9 months ago

      @@rushikeshgandhmal How's your learning journey? 🎉

  • @sairamvarma6208
    @sairamvarma6208 3 years ago

    The Github link in the description doesn't work

    • @alias15vapour
      @alias15vapour 3 years ago +1

      Sorry about that, there's a typo. Just use the link below.

  • @bryanlozano8905
    @bryanlozano8905 3 years ago +2

    Bruh, is someone weed-whacking outside?

    • @alias15vapour
      @alias15vapour 3 years ago

      Unfortunately, yes. They started that the moment I started recording :(