Step-by-Step Handwritten Sentence Recognition with TensorFlow and CTC loss
- Added 29. 01. 2023
- Unlock the power of handwritten sentence recognition with TensorFlow and CTC loss. From digitizing notes to transcribing historical documents and automating exam grading.
This tutorial will teach you how to use TensorFlow and CTC loss to master Handwritten Sentence Recognition. This challenging task involves interpreting text written in handwriting and has various applications, such as converting handwritten notes into digital text, transcribing historical documents, and automating the grading of exams. One of the critical challenges in Handwritten Sentence Recognition is handwriting variability, which makes it difficult for a machine-learning model to recognize handwritten text accurately. With this tutorial, you'll be able to address this challenge and use your model to recognize handwritten text with high accuracy. You'll learn how to use CTC loss to handle sequence data, such as text, and how to train your model to recognize handwritten text even with different input and output sequence lengths. Don't miss out on this opportunity to become an expert in Handwritten Sentence Recognition!
Text Version Tutorial: pylessons.com/handwritten-sen...
GitHub: github.com/pythonlessons/mltu...
pypi: pypi.org/project/mltu/
#machinelearning #python #tensorflow #opencv #ocr
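Since the description leans on CTC's ability to map long frame sequences to shorter text, here is a minimal standalone sketch of the greedy CTC decoding step (not the tutorial's code; the blank symbol and alphabet are placeholders): the model emits one label per time step, and decoding collapses consecutive repeats and drops blanks.

```python
# Illustrative sketch: greedy CTC decoding in pure Python.
# A CTC model outputs one label per time step (including a special "blank");
# decoding collapses repeated labels and then removes blanks, which is how a
# long frame sequence maps to a shorter character sequence.

BLANK = "-"  # hypothetical blank symbol

def ctc_greedy_decode(frame_labels):
    """Collapse consecutive repeats, then drop blank symbols."""
    decoded = []
    prev = None
    for label in frame_labels:
        if label != prev and label != BLANK:
            decoded.append(label)
        prev = label
    return "".join(decoded)

# A 10-step output sequence decodes to the 3-character word "cat":
print(ctc_greedy_decode(list("cc-aaa-ttt")))  # cat
# Blanks between repeats preserve genuine double letters:
print(ctc_greedy_decode(list("hh-ee-ll-ll-oo")))  # hello
```

Note how the blank between the two "ll" groups is what lets CTC output a genuine double letter rather than collapsing it away.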
Awesome informational video.
Thanks for sharing such a valuable content..
My pleasure!
Hello, I have a question: how do I continue training the model? I don't want to restart training from scratch all over again.
Can you suggest some steps to extract information from a bank check, like payee name, amount in words, amount in digits, date, MICR code, etc.?
Hello Sir, thanks for sharing this video and knowledge. Could you suggest software that identifies the user's handwriting style (cursive handwriting, Lucida, etc.) and suggests improvements if there are mistakes in the style? For example, if the user writes the word "Apple" in cursive but the letter "e" is too squeezed, the software should identify the letter "e" and suggest that it be written properly. Please suggest if you have any software or code. Thanks a lot once again.
Will this work in a Jupyter notebook with Python? Please help, thanks.
We are not able to download the dataset.
Also, how do we do labelling for our custom data?
Need help! I am making a website where a user can upload a PDF, but I want the PDF to upload only if it contains images of handwritten text only. Thank you for reading.
How do I give external images to this model, sir? Can you please tell me?
Can we get the TensorBoard graphs that you show at the end in Google Colab?
I'll check if I still have them
In which folder should we store the trained model? Please tell me, it is showing an error.
Awesome sir, could you please make another video on how to improve the accuracy rate of the same model?
Hey, way more data and trying to improve the model, that's all. I am not sure it's worth creating another video where everything would be the same.
Did you find a way, bro?
Awesome tutorial.
I have one question: can I use it on an image which contains more than one sentence, like 2 or 3 sentences?
Thanks! It would be way harder to train the model on a couple of sentences, but it should work. Haven't tried it.
Great tutorial! But I have a question: if I download the giant IAM dataset, train my model, and load my model/script into a Streamlit web app, can I delete the IAM database after training my model locally, or do I need to keep it so that my model continues to run? I'm working on an OCR app that copies text from PDFs and converts it into a string. Thanks!
It's not that giant, comparatively. But to answer your question: as long as you complete model training, you don't need this dataset. The model doesn't require the dataset to make predictions ;)
@@PyLessons thank you! cheers 🍻
A neural network is not a database, it does not refer back to anything. As long as you're training the model you will need the dataset. Once done the information is stored in the neural network in the form of weights.
I have a question. After training, how can I use my own data as input to predict/recognize? What should I change, and in which format and how do I give that input for prediction?
Hey, an example is given here:
github.com/pythonlessons/mltu/blob/main/Tutorials/04_sentence_recognition/inferenceModel.py
Can we convert it into TFLite for an Android implementation?
I haven't tried, but yes, I don't see why it shouldn't be possible.
After training, how can I use my own data as input to predict/recognize? What should I change, and in which format and how do I give that input for prediction? How do I give external images to this model, sir? Can you please tell me!
There is an inference script in the tutorial; analyse it.
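For the recurring "how do I feed my own image" question, here is a rough, hedged sketch of the kind of preprocessing an inference script typically does. The target size (1408×96), padding value, and nearest-neighbour resize are assumptions for illustration; the tutorial's actual pipeline uses mltu's own image resizer, so check inferenceModel.py for the real values.

```python
# Hypothetical sketch of preparing an external image for inference:
# resize keeping aspect ratio, pad to the model's fixed input size,
# then add a batch axis. NumPy-only nearest-neighbour resize.
import numpy as np

def preprocess(image, target_w=1408, target_h=96, pad_value=255):
    """Resize (nearest-neighbour) keeping aspect ratio, then pad to fixed size."""
    h, w = image.shape[:2]
    scale = min(target_w / w, target_h / h)
    new_w, new_h = max(1, int(w * scale)), max(1, int(h * scale))
    rows = (np.arange(new_h) / scale).astype(int).clip(0, h - 1)
    cols = (np.arange(new_w) / scale).astype(int).clip(0, w - 1)
    resized = image[rows][:, cols]
    # Pad with white (assumed background) so every input has the same shape
    canvas = np.full((target_h, target_w) + image.shape[2:], pad_value, dtype=image.dtype)
    canvas[:new_h, :new_w] = resized
    return canvas

img = np.random.randint(0, 256, (120, 900, 3), dtype=np.uint8)  # stand-in photo
batch = np.expand_dims(preprocess(img), axis=0)  # model expects a batch axis
print(batch.shape)  # (1, 96, 1408, 3)
```

The batch tensor would then be passed to the loaded model (ONNX or Keras) exactly as the inference script does with dataset images.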
Hi there, how do I make a project where I insert a handwritten PDF and get back a PDF with the text converted?
It is way harder than you think, and I suppose, if you're asking, you haven't done any research yet.
I'm not able to download the dataset from the link you mentioned, can you help me?
You may not be able to access the dataset website from your location; try using a VPN to access the dataset.
Did it start with the loss at inf, and then start to learn after the 20th epoch?
Yes, this is because of a small dataset or a weak neural network architecture. But it works, and people can move further with this :)
How do I create a handwritten image dataset for a regional language?
Get or create annotated dataset for that language :)
How do I print val_accuracy for each epoch? If I add it in the metrics it gives an error.
How did you try to do it? If you add the metric as metrics=['accuracy'] it may not work because of CTC loss; you can try metrics=['sparse_categorical_accuracy']. I haven't tried it, but it should work.
Awesome tutorials! Hi sir, would you mind explaining how to calculate the accuracy of the model we have built? 🙏 And if the real text written in the picture is "i buy clothes" and the predicted text is "i buy clothed", what is the accuracy? Will it be counted as totally inaccurate, or will it get 90% or more (but not 100%)? 🙏🙏
Hey, that's why we use the CER (Character Error Rate) and WER (Word Error Rate) metrics to get these results. And yes, the more similar the predicted sentence is to the real one, the lower these error scores will be.
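To make the CER/WER answer concrete, here is a minimal standalone sketch of how the two metrics can be computed with plain edit distance (the tutorial itself uses mltu's built-in metric classes; this version is just for illustration):

```python
# Minimal CER/WER computation via Levenshtein (edit) distance.

def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (single-row DP)."""
    dp = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, 1):
        prev, dp[0] = dp[0], i
        for j, h in enumerate(hyp, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (r != h))
    return dp[-1]

def cer(ref, hyp):
    """Character Error Rate: character edits / reference length."""
    return edit_distance(list(ref), list(hyp)) / len(ref)

def wer(ref, hyp):
    """Word Error Rate: word edits / reference word count."""
    return edit_distance(ref.split(), hyp.split()) / len(ref.split())

# One wrong character out of 13, but one whole word of three is wrong:
print(cer("i buy clothes", "i buy clothed"))  # ~0.077
print(wer("i buy clothes", "i buy clothed"))  # ~0.333
```

So the "clothes"/"clothed" example from the question scores about 7.7% character error but 33% word error, which is why both metrics are reported.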
@@PyLessons Hello sir, can you provide the dataset? It seems their website is defunct, so I cannot register a new email and hence can't access the data.
I can't access the website to download the datasets. Can you give me a Google Drive folder or something so I can download from that? Thanks, all.
Hey, did you get the folder?
Can you share your datasets, sir? Because I'm not able to download the dataset from the link you mentioned.
You may not be able to access the dataset website from your location; try using a VPN to access the dataset.
@@PyLessons It does not send a verification link to our email.
Hi,
This is Awesome,
Can I use this for other languages?
Hey, thanks! Yes, you can use this for other languages; this tutorial is just an example.
Hi
I am facing an issue regarding the IAM dataset. How much time does it take to get the verification email?
No idea, for me it works fine
I'm able to get the prediction. How do I improve the prediction results?
Bro can you guide us
Could I get your phone number or Insta ID? I need your help.
@@-HarshalMali Yes
Hi, I'm implementing CTC on an attention encoder, and I want to jointly decode it with an attention decoder. After the encoder I added an output layer (trained with CTC); same thing with the decoder (but trained with cross-entropy). But obviously the output shapes are different. I want to do a linear combination of the outputs of the two: CTC gives me (batch, max frame length, vocab size) and the decoder gives me (batch, transcript length, vocab size). Is there some step I'm missing? I can't figure it out. Great video btw 😊
Hey, you should check my tutorial about transformers. I haven't tried to do what you're doing, but everything sounds logical. I think your decoder output is wrong (but I'm not sure); it's hard to say without looking at the code. Usually you run a forward pass on the encoder one time, and on the decoder side you'll need to iterate until the end of the sentence.
Hi
Actually, I am facing an issue regarding the IAM dataset: how much time does it take to get the verification email? Can you please tell me?
Hi sir, the tutorial was awesome, but if the image is a full page of handwritten text, what should we do? Please let me know.
With computer vision techniques, separate each line and do recognition on those lines.
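One classic way to do that line separation (an assumed approach for illustration, not the tutorial's code) is a horizontal projection profile: count the ink pixels in each row of a binarized page, and treat each contiguous band of inked rows as one text line to crop and feed to the model.

```python
# Sketch: find text lines via the horizontal projection profile
# of a binarized page image (nonzero pixels = ink).
import numpy as np

def split_lines(binary_page, min_ink=1):
    """Return (start_row, end_row) ranges of contiguous inked bands."""
    ink_per_row = (binary_page > 0).sum(axis=1)
    lines, start = [], None
    for y, ink in enumerate(ink_per_row):
        if ink >= min_ink and start is None:
            start = y                      # a text band begins
        elif ink < min_ink and start is not None:
            lines.append((start, y))       # the band ends
            start = None
    if start is not None:
        lines.append((start, len(ink_per_row)))
    return lines

# Synthetic page with two "text lines" at rows 10-20 and 40-55:
page = np.zeros((80, 200), dtype=np.uint8)
page[10:20, 30:150] = 255
page[40:55, 20:180] = 255
print(split_lines(page))  # [(10, 20), (40, 55)]
```

On real scans you would binarize first (e.g. thresholding) and raise min_ink to ignore noise; each returned band is then cropped with page[start:end] and recognized separately.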
How to run this code?
I gave all the step-by-step details about this.
Hello,
Thank you for your awesome work !
I'm a French developer, and I'm currently transcribing some family papers dated between the 13th and 19th centuries (French and Latin).
Could you give me some clues for constructing datasets suited to work with your code?
I'm beginning my investigations and I'm sure I'll eventually find a way to achieve the task on my own but your help could save me some time as I'm new to TensorFlow and PyTorch.
You also mention that your model trains in 1h30. Could you share the technical specs of your hardware so I can compare with mine?
Bravo once again. I hope to hear from you.
Yes, I tried to write the code so it would work out of the box. As I remember, I trained it with a GTX 1080 Ti GPU.
@@PyLessons Thank you for your answer. Next gen GPU will allow better training time :)
can we convert these recognize text in to voidable? using raspberry?
can you give more details? voidable?
@@PyLessons I want to make this for the blind, so I need voice output.
Sir, please tell me whether I can use this code to convert a whole page into textual format.
You can't use it to convert a whole page.
@@PyLessons Can you make a tutorial on doing this for a whole page, please? I'm doing this for my major project and can't find code anywhere.
I installed mltu, but it doesn't recognize mltu.losses, nor callbacks, nor metrics. Any answers, or has the same thing happened to anyone else? I'm on Python 3.11.
?
Hi bro, I have a question: there is no such library as mltu.tensorflow. How can I resolve this?
install it
@@PyLessons I'm really sorry, but I have installed the mltu library and TensorFlow 2.10, and all functions are there except mltu.tensorflow, mltu.utils, and mltu.annotations. Can you tell me where to get those files?
I registered on the site (IAM dataset), but I am not able to access the database. There are login issues.
not sure, for me it works fine
@@PyLessons Could you make a Drive folder or something from where we can download it? Because even I can't access it.
Sir, I am getting a "No module named 'onnx'" error. What do I have to do?
pip install onnx
Sir, I want to make a correction system, so please tell me how to do it.
To my knowledge, I think these are the steps involved; please tell me, suggest improvements, and help me.
1. Capturing the real-time paper or scanning.
2. Sentence recognition.
3. Using NLP to get keywords from recognized sentences.
4. Creating a dataset that includes keywords.
5. Tokenizing and comparing the recognized data with the dataset.
6. Allocating marks.
If these are wrong, can you please tell me?
Also, I downloaded the ascii.tgz and sentences.tgz files from that website, but I can't extract the files. How, sir?
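On the extraction question: the IAM archives (ascii.tgz, sentences.tgz) are ordinary gzipped tarballs, so Python's standard tarfile module can unpack them. A small sketch, demonstrated on a dummy archive built on the fly so it runs without the IAM download (the target folder name is an example, not the tutorial's required path):

```python
import io
import os
import tarfile
import tempfile

def extract_tgz(archive_path, target_dir):
    """Unpack a .tgz / .tar.gz archive into target_dir."""
    with tarfile.open(archive_path, "r:gz") as tar:
        tar.extractall(path=target_dir)

# Build a tiny dummy ascii.tgz so this sketch is self-contained;
# for the real files just call extract_tgz("ascii.tgz", "Datasets/IAM").
workdir = tempfile.mkdtemp()
archive = os.path.join(workdir, "ascii.tgz")
with tarfile.open(archive, "w:gz") as tar:
    data = b"sample line of text"
    info = tarfile.TarInfo(name="sentences.txt")
    info.size = len(data)
    tar.addfile(info, io.BytesIO(data))

extract_tgz(archive, workdir)
print(os.path.exists(os.path.join(workdir, "sentences.txt")))  # True
```

Most desktop archive tools (7-Zip, GNOME Archive Manager, `tar -xzf` on the command line) open these files just as well.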
I finished training, but where does the output get shown? Please help.
You mean where the model is saved?
Please, how much time does training take?
It's showing the error: Exception: Model path (Models/04_sentence_recognition\202303182217\model.onnx) does not exist
You need to download a model from the link in my text version tutorial or train your own model
@@PyLessons Thank you sir, I got the output 😀
I installed mltu correctly, but I'm getting a "No module named 'mltu.utils'" error. The rest of the mltu modules were imported without any issues. Could you please help me resolve this?
What version of mltu?
@@PyLessons version 0.1.5
Thanks, there was a bug, and no one had mentioned it... I released version 0.1.7, try it now.
@@PyLessons Yes this version is working. Thanks a lot!!☺
Could you tell me the command to save the model?
Yes
Hey, it's done in the standard way; right now the model is saved by callbacks.
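For readers unfamiliar with that mechanism, here is a minimal standalone sketch of the standard Keras pattern: a ModelCheckpoint callback that writes the best model whenever the monitored validation metric improves. The tiny stand-in model, path, and data here are placeholders, not the tutorial's actual training code.

```python
# Sketch: saving a model during training with a ModelCheckpoint callback.
import os
import tempfile

import numpy as np
from tensorflow import keras

checkpoint_path = os.path.join(tempfile.mkdtemp(), "model.h5")  # example path

# Tiny stand-in model; the tutorial builds its real model elsewhere.
model = keras.Sequential([keras.Input(shape=(4,)), keras.layers.Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Save the best model so far whenever validation loss improves.
checkpoint = keras.callbacks.ModelCheckpoint(
    checkpoint_path, monitor="val_loss", save_best_only=True)

x, y = np.random.rand(32, 4), np.random.rand(32, 1)
model.fit(x, y, validation_split=0.25, epochs=2,
          callbacks=[checkpoint], verbose=0)
# model.save(checkpoint_path) would save it manually instead.
```

With save_best_only=True, interrupted training still leaves the best epoch's weights on disk, which is what the tutorial relies on.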
Hello Sir,
I am getting the error FileNotFoundError: [Errno 2] No such file or directory: 'Models/04_sentence_recognition/202301131202/configs.yaml'
How do I resolve this?
Can you please help? Thank you.
Did you download the model file together with the configs?
@@PyLessons Actually it's not about the downloads; that's the same thing I'm facing too.
Awesome sir, but I got this error when I tried to train the model: TypeError: Input 'y' of 'Less' Op has type int64 that does not match type int32 of argument 'x'.
Install the right version of mltu; that may be the issue.
@@PyLessons Same error; I used mltu version 0.1.5.
@@omarzain3292 Thanks, I will check; you may try the 0.1.6 version.
Hi, I am not able to access the website; does it require a VPN?
NVM, just found out you need a VPN to access it from India.
Great! Because I can access it from my location, I didn't know that you can't access it from other locations.
Hey, I could not download it, could you please send it to me?
drive.google.com/drive/folders/13-U__hphtd1Wc5F7UjmIauT4U4nDZ_yY
You could upload it to the Drive link.
Hi, I have a problem with mltu:
"
ERROR: Cannot install mltu==0.1.3, mltu==0.1.4, mltu==0.1.5, mltu==0.1.6, mltu==0.1.7, mltu==1.0.0, mltu==1.0.1, mltu==1.0.2, mltu==1.0.3, mltu==1.0.4, mltu==1.0.5, mltu==1.0.6, mltu==1.0.7 and mltu==1.0.8 because these package versions have conflicting dependencies.
"
What's the solution?
That's strange. What OS and what Python version do you use?
@@PyLessons I fixed it, thanks,
but it doesn't work; it needs the datasets.
Hi sir, I'm really sorry, but I have installed the mltu library and TensorFlow 2.10, and all functions are there except mltu.tensorflow, mltu.utils, and mltu.annotations. Can you tell me where to get those files?
What mltu version did you install?
@@PyLessons mltu 0.1.6
Install the newest version and follow the tutorial code that is on GitHub.
Thanks sir, it worked, but how can I reduce the training time? It is taking 2 hours to train 1 epoch; if I want to do 1000, it will take much longer. Can you help me out?
@@nahushs Make sure to train on a GPU, and early stopping will kick in, so it won't actually train for 1000 epochs.
Are 1000 epochs essential for training? And how long do I wait to train the model, sir?
Hey sir, I think you haven't watched the complete video or read my text version tutorial. For this you use a validation dataset, and you stop training when your model achieves its best point on that validation dataset.
How do I solve the error: artefact not found?
What OS you use?
@@PyLessons Windows 11 Home Single Language
@@sohambhole4288 The stow package doesn't work with Win11. I'll make a fix in a week, but you can replace the stow package code with the os.path package if you can't wait.
@@PyLessons OK, thanks.
@@PyLessons Sir, please make a detailed video for this error, as lots of folks are getting the same error.
Hi, I am trying to train a model on my own database according to the tutorial, and the training sometimes takes quite a long time, so I wanted to load the model saved by the callback with load_model("{path_to_model}/model.h5") and continue training where I left off. Unfortunately, I get an "Unknown loss function: CTCloss" error, which I tried to solve using the custom_objects parameter, but that caused another error I couldn't solve: CTCloss.__init__() got an unexpected keyword argument 'reduction'. Then I tried saving the file in .tf format, which caused an error related to the metrics; after using custom_objects and passing those metrics, the error looped and was related to the metrics' arguments (which I had entered). So is it possible to load a saved model after training is interrupted and continue training it, while staying in accordance with the tutorial? (For example, I am at epoch 53/1000 and I see that the best value yet was saved to the model.h5 file at epoch 52, so I stop training and then want to load the model saved at epoch 52 and continue from there.)
Open an issue on GitHub; it will be easier to solve this there.
How do I solve "ValueError: not enough values to unpack"?
need more details
@@PyLessons File "C:\Users\Soham\AppData\Local\Programs\Python\Python310\lib\site-packages\mltu\dataProvider.py", line 215, in __getitem__
batch_data, batch_annotations = zip(*[augmentor(data, annotation) for data, annotation in zip(batch_data, batch_annotations)])
ValueError: not enough values to unpack (expected 2, got 0)
Did you change something in the code? Because you are not giving any data to the dataProvider ("expected 2, got 0").
@@PyLessons No, I have just changed the stow package to os.path.
OK, I see it's not enough to make that change. Tomorrow I'll look at it and post you a link to the fixed code.
When I try to train the model I face this error:
stow.exceptions.ArtefactNotFound: Couldn't locate artefact /Users/aliha/AppData/Local/Temp/tmpg50_k80k
Could you please help me with it?
what is your OS?
@@PyLessons
thanks for replying
it's win11
Win11 has problems with the stow package; in future versions I'll try to fix these issues.
@@PyLessons Please fix this issue as soon as possible, sir, we have deadlines.