Step-By-Step Handwriting Words Recognition With PyTorch

  • Added 19. 03. 2023
  • In this tutorial, we will extend the previous tutorial to build a custom PyTorch model using the IAM Dataset for recognizing handwritten text. This dataset is commonly used as a benchmark for OCR systems and can provide a valuable foundation for constructing your own OCR system. We will be using several machine learning libraries and techniques to preprocess the data, augment it, and train a deep learning model.
    During this tutorial, we will cover the following:
    - An overview of the IAM Dataset and handwritten text recognition;
    - Code walkthrough for importing required modules and libraries;
    - Downloading and extracting the dataset using the download_and_unzip function;
    - Preprocessing the dataset, including data parsing, vocab set creation, and maximum label length;
    - Data augmentation techniques to improve model performance;
    - A deep dive into PyTorch model training with custom CTC loss function and callbacks;
    - Evaluation metrics like CER and WER to monitor training progress;
    - Saving and exporting the trained PyTorch model in ONNX format.
    By the end of this tutorial, you will have a good understanding of how to train a custom PyTorch model for recognizing handwritten text using the IAM Dataset. Join me in this exciting journey of handwriting recognition with PyTorch!
    Text Version Tutorial: pylessons.com/pytorch-wrapper
    GitHub: github.com/pythonlessons/mltu...
    pypi: pypi.org/project/mltu/
    #machinelearning #python #pytorch #ocr #tensorflow
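
The description names CER (character error rate) and WER (word error rate) as the metrics used to monitor training. As a rough illustration of what those numbers mean, here is a minimal sketch that computes both from a Levenshtein edit distance. The function names are hypothetical and this is not the `mltu` implementation, which may normalize differently.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, prediction):
    """Character error rate: character edits divided by reference length."""
    if not reference:
        return float(len(prediction) > 0)
    return edit_distance(reference, prediction) / len(reference)


def wer(reference, prediction):
    """Word error rate: word edits divided by number of reference words."""
    ref_words, hyp_words = reference.split(), prediction.split()
    if not ref_words:
        return float(len(hyp_words) > 0)
    return edit_distance(ref_words, hyp_words) / len(ref_words)
```

A CER of 0.2 for a five-letter word means one character was wrong; lower is better for both metrics.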

Comments • 64

  • @ekchills6948 1 year ago +2

    Thank you so much! Could you please tell me how I can test it with my own inputs? I've already trained it on a different dataset.

  • @amigo2hundred 1 year ago

    The text version of the tutorial has a Google Drive link at the end containing the trained model, but I am unable to get it running.
    Can I get some help?

  • @user-yx4bt5wq7g 1 year ago +1

    Thanks for the video. I want to use your code, but I have a large word dataset. Should I change anything in your code when training?

    • @PyLessons 1 year ago +1

      I am not sure, it depends on your dataset, but I don't think you should need to make huge changes. It depends on how it trains.

  • @user-rw6uy8pn8i 1 year ago

    This works well with the dataset images, but if I pass some other word images that are not from the dataset, it can't predict them. The same thing happens with the TensorFlow model as well. Am I doing something wrong?

    • @PyLessons 1 year ago

      No, your example should be at least similar to the examples in the training data. Usually you would need to combine several large datasets and train the model on them, so that the model becomes more robust.

  • @science3605 1 year ago

    Thank you so much! Very well explained. But while trying to download the dataset, I'm getting the error "HTTPError: Bad Gateway".
    Please help me with this if possible.

    • @PyLessons 1 year ago

      Hello, the link is not working anymore. I'll try to find a new link when I find the time.

  • @sahilpawar5152 1 year ago

    This works if the image contains only one word or sentence (like the one in your TensorFlow video), but what should I do if I want to train it on a document such as a form or an invoice?

    • @PyLessons 1 year ago +1

      Predicting straight from a large document is a much harder task. You would need a much larger dataset and a model that you would have to train for months, but if you have those kinds of resources, it's up to you :) This is why all solutions implement this in smaller steps.

    • @bomxacalaka2033 1 month ago

      Find a way to crop each word. I've done this on a website with a live view: using OpenCV, it finds possible words and crops only that part of each frame. Then you can also straighten the image and apply erosion followed by dilation. OpenCV has a lot of tools to help with that. I use a few functions such as dilate, findContours, boundingRect, and contourArea. There are more for preparing the image, but these are the main ones for finding individual words.

  • @jahstinarguedas799 9 months ago

    Hello, I'm trying to do the very first step, the minimum required to start with the code, which is importing the libraries, but I get error after error. Did you already have those libraries installed beforehand, or did you install them for this video?

    • @jahstinarguedas799 9 months ago

      I would like to create a system capable of recognizing handwritten text. Do you recommend PyTorch or TensorFlow?

    • @PyLessons 9 months ago +1

      Hello, it depends on which OS you use and whether you have a GPU on your machine. PyTorch is easier to learn and easier to run on all operating systems. TensorFlow is harder to learn, and with the latest versions it's pretty hard to install on Windows with GPU support. People who program on Windows are shifting to PyTorch because of the easier setup.

  • @mahmoudelsayed8073 1 year ago

    Hello. Thank you for the tutorial! I attempted to run the code on my end, but I get a 502 Bad Gateway for the dataset link provided. Was the link changed?

    • @PyLessons 1 year ago

      You're welcome. No, everything works just fine for me: fki.tic.heia-fr.ch/databases/download-the-iam-handwriting-database

    • @mahmoudelsayed8073 1 year ago

      @PyLessons I ended up adjusting the path in the training code to point to my local copy of the dataset instead of downloading it, and it seems to be working fine so far. Thank you for the help and the great tutorial/source code!

    • @ayenipeace8887 1 year ago

      @PyLessons It still doesn't work in my case. Same for the new link that you shared. Could you kindly check it, please? I cannot find the words.txt file even after unzipping the dataset.
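
For readers who, like the commenter above, point the training code at a local copy of the dataset, the parsing step could be sketched as follows. It assumes the usual IAM layout (a words.txt index whose lines start with a word id and a segmentation flag and end with the transcription, and images under words/<a01>/<a01-000u>/<id>.png). The function name and the `skip_missing` flag are hypothetical, and your copy of the dataset may be laid out differently.

```python
from pathlib import Path


def parse_iam_words(lines, dataset_root, skip_missing=True):
    """Parse IAM-style words.txt lines into (image_path, label) pairs.

    Returns the samples, the set of characters seen, and the longest label length.
    The folder layout below is an assumption about the extracted dataset.
    """
    root = Path(dataset_root)
    samples, vocab, max_len = [], set(), 0
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # header comments and blank lines
        parts = line.split(" ")
        if len(parts) < 3 or parts[1] == "err":
            continue  # malformed line, or a word flagged as badly segmented
        word_id, label = parts[0], parts[-1]
        folder1 = word_id[:3]                        # e.g. "a01"
        folder2 = "-".join(word_id.split("-")[:2])   # e.g. "a01-000u"
        image_path = root / "words" / folder1 / folder2 / f"{word_id}.png"
        if skip_missing and not image_path.exists():
            continue  # index entry without an image on disk
        samples.append((str(image_path), label))
        vocab.update(label)
        max_len = max(max_len, len(label))
    return samples, vocab, max_len
```

To use it, open your local words.txt and pass the file object as `lines`, with `dataset_root` set to the folder that contains the `words` directory. If no samples come back, the index file or the image folders are probably not where this sketch expects them.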

  • @user-jl2kg8em1q 6 months ago

    Hi. Can I get your trained model by any chance?

  • @yashkewlani2878 1 year ago

    Can you please tell me how we can feed in our own input after training the model with the datasets?

    • @PyLessons 1 year ago

      It's pretty simple: I provided another file where I test it. Modify that one.
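
Whatever script is used for testing, a model trained with CTC loss outputs per-timestep scores rather than text, so the last step on your own input is decoding. Below is a minimal greedy CTC decoding sketch in plain Python. The function name is hypothetical, and it assumes the blank symbol is the index right after the last vocabulary character; check how your own model was trained before relying on that.

```python
from itertools import groupby


def ctc_greedy_decode(scores, vocab):
    """Turn per-timestep class scores into a string.

    scores: list of timesteps, each a list of len(vocab) + 1 numbers.
    Assumption: the extra, last index is the CTC blank symbol.
    """
    blank = len(vocab)
    # 1) Pick the best class at every timestep.
    best = [max(range(len(step)), key=step.__getitem__) for step in scores]
    # 2) Collapse consecutive repeats of the same class.
    collapsed = [index for index, _ in groupby(best)]
    # 3) Drop blanks and map the remaining indices to characters.
    return "".join(vocab[index] for index in collapsed if index != blank)
```

Repeated letters survive only when a blank separates them, which is why the blank index has to match the one used during training.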

  • @rishabh2906 4 months ago

    Hey, how can I use Nougat to make it work more efficiently with maths and other things? Any idea?

  • @aspboss1973 6 months ago

    Great video!
    Question: what if we want to extract text from an image (not handwritten)? Will the same model work?

    • @PyLessons 6 months ago

      Thanks! Yes, it should work :)

    • @aspboss1973 6 months ago

      @PyLessons What if we want to extract sentences? Will the model be able to put the words in sequence?

    • @PyLessons 6 months ago

      @aspboss1973 When I tried it, longer sentences were harder to train on. It's much easier to use other techniques to separate the words of a sentence, predict each word, and then combine the results.

    • @aspboss1973 6 months ago

      @PyLessons So this technique won't be able to capture 8-10 word long sentences.
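
The "separate, predict, combine" approach suggested in this thread needs one extra piece: putting the recognized words back in reading order. A minimal sketch of that step is shown below. It assumes word bounding boxes as (x, y, w, h) tuples from some detector; `recognize_word` is a hypothetical callable standing in for the single-word model, and the line tolerance is an illustrative guess.

```python
def order_boxes(boxes, line_tolerance=0.5):
    """Group (x, y, w, h) boxes into text lines, each sorted left to right."""
    lines = []
    # Walk the boxes from top to bottom by vertical center.
    for box in sorted(boxes, key=lambda b: b[1] + b[3] / 2):
        center_y = box[1] + box[3] / 2
        if lines:
            last = lines[-1][-1]
            last_center = last[1] + last[3] / 2
            # Same line if the centers differ by less than a fraction of the height.
            if abs(center_y - last_center) <= line_tolerance * last[3]:
                lines[-1].append(box)
                continue
        lines.append([box])
    return [sorted(line, key=lambda b: b[0]) for line in lines]


def read_text(boxes, recognize_word):
    """Recognize each box separately and join the words in reading order."""
    lines = order_boxes(boxes)
    return "\n".join(" ".join(recognize_word(box) for box in line)
                     for line in lines)
```

With this split, sentence length stops being a limit of the recognizer itself: an 8-10 word sentence becomes 8-10 single-word predictions joined afterwards.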

  • @arifzanko 1 year ago

    ModuleNotFoundError: No module named 'mltu.torch.losses'
    I already installed mltu==1.0.1, but it still didn't work.

  • @nareshmalviya3100 4 months ago +1

    @PyLessons When I try to execute the fit method,
    I get this error:
    UnboundLocalError: cannot access local variable 'loss_info' where it is not associated with a value

    • @PyLessons 4 months ago

      I assume you are using the latest version of the mltu package. You found a bug in my latest release, thanks. I'm going to fix it ASAP.

    • @nareshmalviya3100 4 months ago +1

      @PyLessons Thanks to you, your content is really helpful.

    • @PyLessons 4 months ago

      Thanks!

    • @PyLessons 4 months ago

      Released a bug fix. Now you can install version 1.2.2 and everything should be fine.

    • @nareshmalviya3100 4 months ago

      If I want to contact you regarding a task, how can I do that?

  • @pasinduminiruwan4990 3 months ago

    Hello, thank you very much for your content. Can I use this code to identify handwritten text on a full page?

    • @PyLessons 3 months ago

      Hello, you are welcome. Attach some kind of handwritten text object detector and try to solve the task that way.

    • @pasinduminiruwan4990 3 months ago

      @PyLessons thank you very much 🙌

  • @hollybollyentertainer8097 2 months ago

    Hello 👋 Can you please attach links to the latest datasets that are available? It would be a great help because I have a project deadline within a week 😅

    • @PyLessons 2 months ago

      You may not be able to access the dataset website from your location. Try using a VPN to access the dataset.

  • @adamofucci4558 1 year ago +2

    Thank you.

  • @ruckydelmoro2500 1 year ago

    How can I modify the code to process the data only once?

    • @ruckydelmoro2500 1 year ago

      I ask because I want to improve the model, and it's time-consuming: when I train it, stop, and train it again, I have to wait for the data to be processed each time.

    • @PyLessons 1 year ago +1

      It does process it only once, since you set cache to True (which stores the images in memory). If you don't want to use augmentors, you may remove those lines, but the model trains better with them.

    • @ruckydelmoro2500 1 year ago

      @PyLessons How long does it take to finish 1000? On my computer it takes 4-5 minutes per epoch.

    • @ruckydelmoro2500 1 year ago

      By the way, thanks for the tutorial. I have a text recognition task now, and your tutorial is very helpful.

    • @PyLessons 1 year ago

      @ruckydelmoro2500 You can use math :) But it won't take 1000 epochs if you use validation data, because early stopping will kick in.
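
To make the arithmetic in this thread concrete: at 4-5 minutes per epoch, 1000 epochs would be a worst case of roughly 67-83 hours, and early stopping usually ends training long before that. The sketch below shows both the estimate and the general early-stopping idea. The class and function names are hypothetical, the patience value is an assumption, and this is not the callback shipped with `mltu`.

```python
def worst_case_hours(epochs, minutes_per_epoch):
    """Upper bound on training time if every epoch is actually run."""
    return epochs * minutes_per_epoch / 60


class EarlyStopping:
    """Stop when the monitored value (e.g. validation CER) stops improving."""

    def __init__(self, patience=20, min_delta=0.0):
        self.patience = patience    # epochs to wait without improvement (assumed value)
        self.min_delta = min_delta  # smallest change that counts as an improvement
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, value):
        """Record one epoch's metric; return True when training should stop."""
        if value < self.best - self.min_delta:
            self.best = value
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience
```

In a training loop, `step` would be called once per epoch with the validation metric, and the loop breaks as soon as it returns True.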

  • @bomxacalaka2033 2 months ago +1

    what are your specs?

    • @PyLessons 2 months ago +1

      As I remember, I used an i7-7700K CPU and a 1080 Ti GPU :)

    • @bomxacalaka2033 1 month ago

      @PyLessons Just to let you know, I've created a Korean dataset with 50k images and trained on it using your script. I got an average CER of 0.063.
      Also, my code to create the dataset was written in a rush, so the dataset looks horrible and took 15 hours to create, but somehow the model is able to recognise most things I write down. The next step is to train the English model with Korean and see what happens.

    • @PyLessons 1 month ago

      Sounds great! Good job, keep doing this stuff :) if you have any suggestions for improvements, let me know!

  • @riswangp 1 year ago

    I couldn't install mltu.

    • @PyLessons 1 year ago

      Why? What OS, what Python version?

    • @riswangp 1 year ago

      @PyLessons I'm using Python 3.9.16 on macOS. Following the tutorial, I was trying to install mltu=1.0.3.

    • @riswangp 1 year ago

      @PyLessons I tried not specifying the version, but it's still not working.

    • @riswangp 1 year ago

      It says:
      note: This error originates from a subprocess, and is likely not a problem with pip.
      error: metadata-generation-failed

  • @chatheriyanthangaraju2372

    Why can't you use a Colab file and share it with us? It would be very useful for us.

    • @PyLessons 1 year ago +2

      You don't learn while using Colab. I've found it's better practice to use a pure Python script. If you want to go step by step, experiment in a debugger.

  • @upppvr4280 8 months ago

    I need your help. How can we contact you?

    • @PyLessons 8 months ago

      You can find my contacts on www.pylessons.com