Step-By-Step Handwriting Words Recognition With PyTorch

Python Lessons

13,238 views

Added on 19 March 2023
In this tutorial, we will extend the previous tutorial to build a custom PyTorch model using the IAM Dataset for recognizing handwritten text. This dataset is commonly used as a benchmark for OCR systems and can provide a valuable foundation for constructing your own OCR system. We will be using several machine learning libraries and techniques to preprocess the data, augment it, and train a deep learning model.
During this tutorial, we will cover the following:
- An overview of the IAM Dataset and handwritten text recognition;
- Code walkthrough for importing required modules and libraries;
- Downloading and extracting the dataset using the download_and_unzip function;
- Preprocessing the dataset, including parsing the data, building the vocabulary set, and finding the maximum label length;
- Data augmentation techniques to improve model performance;
- A deep dive into PyTorch model training with a custom CTC loss function and callbacks;
- Evaluation metrics like CER and WER to monitor training progress;
- Saving and exporting the trained PyTorch model in ONNX format.
By the end of this tutorial, you will have a good understanding of how to train a custom PyTorch model for recognizing handwritten text using the IAM Dataset. Join me in this exciting journey of handwriting recognition with PyTorch!
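The preprocessing step listed above (data parsing, vocabulary creation, maximum label length) can be sketched roughly as follows. This is a minimal illustration, not the tutorial's actual code: it assumes the word-level annotation layout of the IAM `words.txt` file (comment lines starting with `#`, space-separated fields, a segmentation status of `ok` or `err` in the second field, and the transcription in the last field). The function name `parse_words_file` and the sample lines are made up for this example.

```python
def parse_words_file(lines):
    """Collect (image_id, label) pairs, the character vocabulary,
    and the length of the longest label."""
    samples, vocab, max_len = [], set(), 0
    for line in lines:
        line = line.strip()
        # Skip blank lines and the comment header
        if not line or line.startswith("#"):
            continue
        fields = line.split(" ")
        # Skip malformed lines and words flagged as badly segmented
        if len(fields) < 9 or fields[1] == "err":
            continue
        image_id, label = fields[0], fields[-1]
        samples.append((image_id, label))
        vocab.update(label)              # add every character of the label
        max_len = max(max_len, len(label))
    return samples, "".join(sorted(vocab)), max_len


# Hypothetical sample lines written in the assumed layout
SAMPLE_LINES = [
    "# word-level annotations (illustrative sample)",
    "a01-000u-00-00 ok 154 408 768 27 51 AT A",
    "a01-000u-00-01 ok 154 507 766 213 48 NN MOVE",
    "a01-000u-00-02 err 154 796 764 70 50 TO to",
]

samples, vocab, max_len = parse_words_file(SAMPLE_LINES)
print(samples, vocab, max_len)
```

The vocabulary determines the number of output classes of the recognizer, and the maximum label length is what labels are typically padded to before batching.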
Text Version Tutorial: pylessons.com/pytorch-wrapper
GitHub: github.com/pythonlessons/mltu...
pypi: pypi.org/project/mltu/
#machinelearning #python #pytorch #ocr #tensorflow

Comments • 64

@ekchills6948 1 year ago ⁺²
Thank you so much, but could you please tell me how I can test it with my own inputs? I've already trained it on a different dataset.
@amigo2hundred 1 year ago
The text version of the tutorial has a Google Drive link at the end containing the trained model, but I am unable to get it running.
Can I get some help?
@user-yx4bt5wq7g 1 year ago ⁺¹
Thanks for the video. I want to use your code, but I have a large word dataset. Should I change anything in your code when training?
@PyLessons 1 year ago ⁺¹
I am not sure; it depends on your dataset. I don't think you should need to make huge changes, but it depends on how it trains.
@user-rw6uy8pn8i 1 year ago
This works well with the dataset images, but if I pass in other word images that are not from the dataset, it can't predict them. The same thing happens with the TensorFlow model as well. Am I doing something wrong?
@PyLessons 1 year ago
No, your examples should be at least similar to the examples in the training data. Usually you would need to combine several large datasets and train the model on them, so that the model becomes more robust.
@science3605 1 year ago
Thank you so much! Very well explained. But when I try to download the dataset, I get the error "HTTPerror: Bad Gateway".
Please help me with this if possible.
@PyLessons 1 year ago
Hello, the link is not working anymore. I'll try to find a new link when I find time.
@sahilpawar5152 1 year ago
This works if the image contains only one word or sentence (like the one in your TensorFlow video), but what should I do if I want to train it on a document such as a form or an invoice?
@PyLessons 1 year ago ⁺¹
Predicting straight from a large document is a much harder task. You would need a much larger dataset and a model that you would have to train for months, but if you have those kinds of resources, it's up to you :) This is why all solutions implement this in smaller steps.
@bomxacalaka2033 1 month ago
Find a way to crop each word. I've done this on a website with a live view: using OpenCV, it finds possible words and crops only that part of each frame. You can then also straighten the image and apply erosion followed by dilation. OpenCV has a lot of tools to help with that. I have a few functions here, like dilate, findContours, boundingRect, and contourArea. There are more for preparing the image, but these are the main ones for finding individual words.
@jahstinarguedas799 9 months ago
Hello. I am trying to start with the very first step, the minimum needed to begin with the code, which is importing the libraries, but I get error after error. Did you already have those libraries installed, or did you install them for this video?
@jahstinarguedas799 9 months ago
I would like to create a system capable of recognizing handwritten text. Do you recommend PyTorch or TensorFlow?
@PyLessons 9 months ago ⁺¹
Hello, it depends on which OS you use and whether you have a GPU on your machine. PyTorch is easier to learn and easier to run on all operating systems. TensorFlow is harder to learn, and with the latest versions it is pretty hard to install on Windows with GPU support. People who program on Windows are shifting to PyTorch because of the easier setup.
@mahmoudelsayed8073 1 year ago
Hello. Thank you for the tutorial! I attempted to run the code on my end, but I get a 502 Bad Gateway for the dataset link provided. Was the link changed?
@PyLessons 1 year ago
You're welcome. No, everything works just fine for me. fki.tic.heia-fr.ch/databases/download-the-iam-handwriting-database
@mahmoudelsayed8073 1 year ago
@@PyLessons I ended up adjusting the path in the training code to point to my local copy of the dataset instead of downloading it, and it seems to be working fine so far. Thank you for the help and the great tutorial/source code!
@ayenipeace8887 1 year ago
@@PyLessons It still doesn't work in my case, and neither does the new link that you shared. Could you kindly check it, please? I cannot find the words.txt file even after unzipping the dataset.
@user-jl2kg8em1q 6 months ago
Hi. Can I get your trained model by any chance?
@yashkewlani2878 1 year ago
Can you please tell me how we can provide our own input after training the model on the datasets?
@PyLessons 1 year ago
It's pretty simple: I provided another file where I test it, so modify that one.
@rishabh2906 4 months ago
Hey, how can I use Nougat to make it work more efficiently with maths and other things? Any idea?
@PyLessons 4 months ago
No idea
@aspboss1973 6 months ago
Great video!
Question: what if we want to extract text from an image that is not handwritten? Will the same model work?
@PyLessons 6 months ago
Thanks! Yes, it should work :)
@aspboss1973 6 months ago
@@PyLessons What if we want to extract sentences? Will the model be able to put the words in sequence?
@PyLessons 6 months ago
@@aspboss1973 When I tried it, longer sentences were harder to train on. It's much easier to use other techniques to separate the words in a sentence, predict each one, and then combine the results.
@aspboss1973 6 months ago
@@PyLessons So this technique won't be able to capture sentences that are 8-10 words long.
@arifzanko 1 year ago
ModuleNotFoundError: No module named 'mltu.torch.losses'
I already installed mltu==1.0.1, but it still didn't work.
@nareshmalviya3100 4 months ago ⁺¹
@PyLessons When I try to execute the fit method,
I get this error:
UnboundLocalError: cannot access local variable 'loss_info' where it is not associated with a value
@PyLessons 4 months ago
I assume you are using the latest version of the mltu package. You found a bug in my latest release, thanks. I'm going to fix it ASAP.
@nareshmalviya3100 4 months ago ⁺¹
@@PyLessons Thanks to you, your content is really helpful.
@PyLessons 4 months ago
Thanks!
@PyLessons 4 months ago
I released a bug fix. You can now install version 1.2.2 and everything should be fine.
@nareshmalviya3100 4 months ago
If I want to contact you regarding a task, how can I do that?
@pasinduminiruwan4990 3 months ago
Hello, thank you very much for your content. Could you please tell me whether I can use this code to identify handwritten text on a full page?
@PyLessons 3 months ago
Hello, you are welcome. Attach some kind of handwritten text object detector and try to solve the task that way.
@pasinduminiruwan4990 3 months ago
@@PyLessons Thank you very much 🙌
@hollybollyentertainer8097 2 months ago
Hello 👋 Could you please attach links to the latest datasets that are available? It would be a great help because I have a project deadline within a week 😅
@PyLessons 2 months ago
You may not be able to access the dataset website from your location; try using a VPN to access the dataset.
@adamofucci4558 1 year ago ⁺²
Thank you.
@PyLessons 1 year ago ⁺¹
Thank you for your support!
@ruckydelmoro2500 1 year ago
How can I modify the code to process the data only once?
@ruckydelmoro2500 1 year ago
I want to improve the model, so it's time-consuming: every time I train it, stop, and train it again, I have to wait for the data to be processed.
@PyLessons 1 year ago ⁺¹
It does process it only once, since you set cache to True (which stores the images in memory). If you don't want to use augmentors, you may remove those lines, but the model trains better with them.
@ruckydelmoro2500 1 year ago
@@PyLessons How long does it take to finish 1000? On my computer it takes 4-5 minutes for 1 epoch.
@ruckydelmoro2500 1 year ago
By the way, thanks for the tutorial. I have a text recognition task now and your tutorial is very helpful.
@PyLessons 1 year ago
@@ruckydelmoro2500 You can use math :) But it won't take 1000 epochs if you use validation data, because early stopping will kick in.
@bomxacalaka2033 2 months ago ⁺¹
What are your specs?
@PyLessons 2 months ago ⁺¹
As I remember, I used an i7-7700K CPU and a 1080 Ti GPU :)
@bomxacalaka2033 1 month ago
@@PyLessons Just to let you know, I've created a Korean dataset with 50k images and trained on it using your script; I got an average CER of 0.063.
Also, my code to create the dataset was written in a rush, so the dataset looks horrible and took 15 hours to create, but somehow it's able to recognise most things I write down. Next is to train the English model with Korean and see what happens.
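For context on the CER figure quoted above: character error rate is commonly defined as the edit (Levenshtein) distance between the predicted and reference strings divided by the reference length, and word error rate is the same ratio over word sequences. The sketch below uses those standard definitions in plain Python; it is not the metric code from the mltu package, whose exact averaging may differ, and the function names are chosen for this example.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def cer(reference, prediction):
    """Character error rate: character edits per reference character."""
    return edit_distance(reference, prediction) / max(len(reference), 1)


def wer(reference, prediction):
    """Word error rate: word edits per reference word."""
    ref_words, hyp_words = reference.split(), prediction.split()
    return edit_distance(ref_words, hyp_words) / max(len(ref_words), 1)


print(cer("hello", "hallo"))                  # 0.2
print(wer("the quick fox", "the quick box"))  # one wrong word out of three
```

Under this definition, an average CER of 0.063 corresponds to roughly six character errors per hundred reference characters.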
@PyLessons 1 month ago
Sounds great! Good job, keep doing this stuff :) If you have any suggestions for improvements, let me know!
@riswangp 1 year ago
I couldn't install mltu.
@PyLessons 1 year ago
Why? What OS, what Python version?
@riswangp 1 year ago
@@PyLessons Using Python 3.9.16 on macOS. Following the tutorial, I was trying to install mltu==1.0.3.
@riswangp 1 year ago
@@PyLessons I tried not specifying the version, but it's still not working.
@riswangp 1 year ago
It says:
note: This error originates from a subprocess, and is likely not a problem with pip.
error: metadata-generation-failed
@chatheriyanthangaraju2372 1 year ago ⁺¹
Why can't you use a Colab file and share it with us? It would be very useful for us.
@PyLessons 1 year ago ⁺²
You are not learning while using Colab. I have found it's better practice to use a pure Python script; if you want to go step by step, experiment in a debugger.
@upppvr4280 8 months ago
I need your help. How can we contact you?
@PyLessons 8 months ago
You can find my contacts on www.pylessons.com
