Optical Character Recognition From Beginner to Expert Using Python | Tesseract - Complete Tutorial

The Sineth

18,913 views


Added on 26 December 2021
In this tutorial you will learn both the concepts and the practical implementation of optical character recognition (OCR) in Python with Tesseract.
Tesseract is one of the most commonly used character recognition tools. It was originally developed at Hewlett-Packard, and Google later sponsored its development. Tesseract extracts the text in your digital images, either from the command line or through its API. It is not just an OCR engine that pulls written text out of an image; it also handles more advanced character recognition tasks: getting bounding boxes for recognized characters, converting images into different output formats, applying your own custom configurations, producing orientation and script detection reports, and returning tables of detailed (verbose) recognition data. Tesseract supports Unicode (UTF-8) and more than 100 languages, which is very helpful whenever you need to work with a language other than English.
After watching the tutorial and working through all of its implementations, you will be well trained to work as an optical character recognition expert!
Highly recommended for enthusiastic Pythonistas all over the world :)
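As a rough illustration of the bounding-box and verbose-data features mentioned above, here is a minimal sketch. It assumes pytesseract, Pillow and the tesseract binary are installed. The helper names, the 60 confidence cutoff and the idea of filtering by confidence are illustrative choices, not taken from the video.

```python
def confident_words(data, min_conf=60.0):
    """Return (text, left, top, width, height) for words at or above min_conf.

    `data` is expected to look like the dict that
    pytesseract.image_to_data(..., output_type=Output.DICT) returns:
    parallel lists under keys such as "text", "conf", "left", "top".
    """
    words = []
    for i, text in enumerate(data["text"]):
        conf = float(data["conf"][i])  # conf may arrive as int, float or str
        if text.strip() and conf >= min_conf:
            words.append((text, data["left"][i], data["top"][i],
                          data["width"][i], data["height"][i]))
    return words


def words_from_image(image_path, min_conf=60.0):
    """Run Tesseract on one image file and keep only the confident words."""
    # Imported lazily so confident_words() can be used without Tesseract.
    import pytesseract
    from PIL import Image
    from pytesseract import Output

    data = pytesseract.image_to_data(Image.open(image_path),
                                     output_type=Output.DICT)
    return confident_words(data, min_conf)
```

Calling words_from_image("sample.png") (a hypothetical file name) would return one tuple per recognized word, which is enough to draw rectangles with an imaging library of your choice.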
Chapters
=========
1) Introduction to Tesseract and installation: 0:01:24
2) Introduction to Pytesseract and installation: 0:06:48
3) Configure tesseract path: 0:12:34
4) Check available languages: 0:14:17
5) Extract text from an image
5.1) Simple text extraction: 0:15:51
5.2) Specified language text extraction: 0:18:37
5.3) Multiple image text extraction: 0:32:05
5.4) Timeout text extraction: 0:35:53
6) Get and draw bounding boxes around characters: 0:40:19
7) Get report of verbose data: 0:48:22
8) Orientation and script detection: 0:51:49
9) Working with output formats
9.1) PDF: 0:57:02
9.2) HOCR: 0:59:21
9.3) XML: 1:00:40
10) Assigning Custom Configurations: 1:02:26
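The text-extraction chapters (path configuration, language selection, timeouts and custom configurations) could be combined roughly as follows. This is a sketch under assumptions rather than the code from the video: the function names are made up for illustration, and it assumes pytesseract plus the tesseract binary and the requested language data are installed.

```python
def join_languages(languages):
    """Join language codes the way Tesseract's -l option expects, e.g. eng+deu."""
    return "+".join(languages)


def build_config(psm=None, oem=None):
    """Build a Tesseract option string from optional --oem / --psm values."""
    parts = []
    if oem is not None:
        parts.append(f"--oem {oem}")
    if psm is not None:
        parts.append(f"--psm {psm}")
    return " ".join(parts)


def extract_text(image_path, languages=("eng",), psm=None, oem=None,
                 timeout=0, tesseract_cmd=None):
    """Run OCR on one image file and return the recognized text."""
    import pytesseract  # imported lazily so the helpers above work without it

    if tesseract_cmd:  # only needed when tesseract is not on PATH
        pytesseract.pytesseract.tesseract_cmd = tesseract_cmd
    return pytesseract.image_to_string(
        image_path,
        lang=join_languages(languages),
        config=build_config(psm=psm, oem=oem),
        timeout=timeout,  # seconds; 0 means no limit, RuntimeError on expiry
    )
```

For example, extract_text("sample.png", languages=["eng", "deu"], psm=6, timeout=5) would try English and German together and give up after five seconds. The file name is hypothetical.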
Download the project
====================
Google Drive: drive.google.com/drive/folder...
References
==========
Tesseract: github.com/tesseract-ocr/tess...
Pytesseract: github.com/madmaze/pytesseract
Multiple config options: www.py4u.net/discuss/10850
Getting bounding box coordinates: stackoverflow.com/questions/2...
Social Media
============
Facebook: / sintax.tech.blog
Linkedin: / sineth-sankalpa-9aa4331ab
Subscribe to 'The Sineth' and hit the bell icon.
/ @thesineth
Thanks for watching ❤

Comments • 35

@washiniranasinghe3856 2 years ago ⁺¹
great job !!
❤️❤️
@pravallika527 1 year ago
Thank u so much, I searched for this everywhere and found yours. Very grateful 😇😇
@shyamalikannangara8665 2 years ago
Great work sineth 💐
@tony-go-code 1 year ago ⁺¹
great detailed explanation
thank you for sharing
I will try this out.
@madhushankha..5379 2 years ago
Great work sinna ❤
@uminhtetoo 1 year ago
Thank you so much, Sir.
@nethramandari3611 2 years ago
Great work
@HirenThakkar45 1 year ago
Bro nice, it really helped me a lot, nice video ✌🤟👍
@thiwankaarunalu9211 2 years ago
Nice work
@alexnieto5036 1 year ago
Sineth, thanks a lot for your video. I wish you would continue doing more videos about OCR, and especially, if possible, about handwritten text.
@TheSineth 1 year ago
Thank you very much! Interesting tutorials are being readied.
@chamathkaadihetti7902 2 years ago ⁺¹
🔥🔥
@winkfordmboma4560 1 year ago
This is nice work 👏 👌
@TheSineth 1 year ago
Thank you!
@khushipitroda385 2 years ago ⁺²
Heyyy, thanks a lot. I had a project on this and searched everywhere for a well-defined, beginner-friendly video; yours was great. Can you do text extraction from video? It would be helpful :)
@TheSineth 2 years ago ⁺¹
Definitely.
@ArunKumar-ov4oe 29 days ago
Hi,
In the "information about orientation and script detection" field,
I'm getting an error which says
"TesseractError: (1, 'Warning, detects only orientation with -l eng Error, OSD requires a model for the legacy engine')"
What can I do to run that block??
@Redstonedust-rc9nr 1 year ago
Thx
@muazzamali7050 8 months ago
Hello Sir
It is a good, informative video. Can you please let me know whether we can extract text from passports or IDs using this library?
@HirenThakkar45 1 year ago
Make a video on how to extract text if the image is blurred. ✌🤞
@nethramandari3611 2 years ago
❤️❤️❤️
@kaitlynlarocco6992 2 years ago
Hi Sineth! I'm trying to download tesseract on my Mac but I'm not sure which program to use. Could you help me out here?
@TheSineth 2 years ago
Hi Kaitlyn !
Try out this one: gist.github.com/krissdap/1fb995bfd95c727eb7b4eb6d66ab7207
@sumedikanishadi3388 2 years ago
❤️‍🔥
@dilmithwalgampaya2234 2 years ago
♥️💪🔥
@kebabsharif9627 1 year ago
Does it work with a lited papers?
@TheSineth 1 year ago
Any kind of image is supported.
@winkfordmboma4560 1 year ago
Can you help with converting from image to Excel directly?
@TheSineth 1 year ago ⁺¹
Yes. Contact me: sinethsankalpabkss@gmail.com
@winkfordmboma4560 1 year ago
@@TheSineth emailed you
@user-pp1vi7wj1w 6 days ago
Can tesseract read pdf?
@shreyajaiswal300 1 year ago
Hey sir, thanks for the project and tutorial, but I am having some problems with the code. I would like to contact you by email; can you share your email ID please?
@TheSineth 1 year ago
Thank you very much!
Contact: sinethsankalpabkss@gmail.com
@67-priyadharshini.b-bsec17 5 months ago
Which algorithm does this project use?
