Extract Tables from PDF and convert to Excel sheet with Paddle OCR text detection and recognition.

  • Added 6 Sep 2024

Comments • 261

  • @christianrazvan
    @christianrazvan 1 year ago +1

    Well, that is a very simple and readable table; it's easy enough to handle with basic if-logic. But try a table with no borders, or with content very close to the borders, on a scanned image.

  • @JujutsuMan
    @JujutsuMan 9 months ago +2

    Impressive content for Deep Learning OCR! Many thanks!

  • @aerocyropyros
    @aerocyropyros 1 month ago

    Thanks lad, it gave me some ideas on how to apply PaddleOCR in my research, mate.

  • @user-tk3cd1ly7o
    @user-tk3cd1ly7o 3 months ago

    impressive, struggling right now for my little side project using ocr, u helped a lot man, appreciate it

    • @AmanChauhan-hr1wh
      @AmanChauhan-hr1wh 3 months ago

      Hi, is this notebook actually working for you? For me it's not. Can you please help?

    • @user-tk3cd1ly7o
      @user-tk3cd1ly7o 3 months ago

      @AmanChauhan-hr1wh Well, I just used his method; I didn't copy it exactly. The result I implemented myself was not really 100% correct, so I ended up using the Azure API. It's really 100% correct, and the processing is very fast as well.

  • @ajithn7336
    @ajithn7336 5 months ago +1

    Hello neuralearn, thanks for your great tutorial.
    Could you please provide notebook access?

  • @malakkhiari1419
    @malakkhiari1419 1 year ago +1

    how i can fix this error "ImportError: libcudart.so.10.2: cannot open shared object file: No such file or directory" ?
    caused by the line of code "import layoutparser as lp"

  • @vishaldas6346
    @vishaldas6346 11 months ago +1

    Hi, you have done a phenomenal job explaining PaddleOCR in detail. Can you please let me know whether PaddleOCR can be trained on custom datasets for extracting data from tables of different lengths in PDFs or images?

  • @brmaaouia9715
    @brmaaouia9715 6 months ago +1

    What if it doesn't detect the table as a table, but as a figure or text?

  • @puneetbansal8567
    @puneetbansal8567 1 year ago +2

    Hi Neuralearn, thanks for creating a great tutorial. It's very useful. Can you please provide notebook access?

  • @_keto444
    @_keto444 1 month ago

    40:40 i got the following error:
    TypeError: int() argument must be a string, a bytes-like object or a real number, not 'list'
    how can i solve it?

  • @user-mt8vk5sh2i
    @user-mt8vk5sh2i 5 months ago

    I want to convert a CSV file into a JSON file in this format: {field1: {col1: text, col2: text, col3: text}, field2: {col1: text, col2: text, col3: text}}. Can you please help me create this JSON file? Thank you.
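
One way to get that nested shape, as a minimal sketch using only Python's standard csv and json modules. It assumes the CSV has a header row. The field1, field2, ... keys are made-up row labels, since the comment does not say where they should come from, and csv_to_nested_json is a hypothetical helper name.

```python
import csv
import io
import json


def csv_to_nested_json(csv_text, key_prefix="field"):
    """Turn CSV text with a header row into a JSON object keyed per row.

    The row keys (field1, field2, ...) are an assumption made for this
    sketch; swap in a real identifier column if the data has one.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    nested = {
        f"{key_prefix}{index}": dict(row)
        for index, row in enumerate(reader, start=1)
    }
    return json.dumps(nested, indent=2)


if __name__ == "__main__":
    sample = "col1,col2,col3\na,b,c\nd,e,f\n"
    print(csv_to_nested_json(sample))
```

To read from a file instead of a string, the same csv.DictReader can be given an open file object.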

  • @toto2321
    @toto2321 1 year ago

    thank you man the best who explain what it is actually happening thank you so much

  • @mohamedmagdy3872
    @mohamedmagdy3872 1 year ago

    brilliant work!!, I would like to thank you for giving me access to notebook.
    keep going broo 💙💙

    • @neuralearn
      @neuralearn 1 year ago +1

      My Pleasure :)
      Feel free to check out on our other videos

  • @emailvarun
    @emailvarun 1 year ago +1

    Hi, thank you for this. Can you please help me with notebook access? Also, can you help me understand whether I will be able to cover most table formats with this?

  • @leonc5510
    @leonc5510 1 year ago

    Thank you for the tutorial, I have requested the notebook access

  • @Jean-nf1yh
    @Jean-nf1yh 4 months ago

    Broo, this is awesome, thank you very much!!!

  • @ShivamSingh-sm2oy
    @ShivamSingh-sm2oy 5 months ago

    Hey, Thanks for the wonderful tutorial man! can you please provide access to the notebook please.

  • @alirezaghasrimanesh2431
    @alirezaghasrimanesh2431 8 months ago

    Thanks for your great tutorial!!!!

  • @edwinjoe6044
    @edwinjoe6044 1 year ago +1

    Hi @Neuralearn. I am getting this "ValueError: (InvalidArgument) Device id must be less than GPU count, but received id is: 0. GPU count is: 0.
    [Hint: Expected id < GetGPUDeviceCount(), but received id:0 >= GetGPUDeviceCount():0.] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:242)"
    I have an Intel® HD Graphics 2500 graphics card, so I can't run the project on my system. How can I run the program on my machine?

    • @neuralearn
      @neuralearn 1 year ago +1

      Hello my dear Joe, here is a notebook which works for cpu runtime: colab.research.google.com/drive/1vZHrahaaubhWMz83jlPuvA1na_v98fUP

    • @edwinjoe6044
      @edwinjoe6044 1 year ago

      @neuralearn Thank you bro, thanks for the support 🤗

  • @quocvuong6752
    @quocvuong6752 1 year ago +1

    Thank you so much, I really appreciate the informative video. Could you please allow access to the Google Colab? It would be super helpful.

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Quốc, please check your mail :)

  • @PurushothamReddy-ff6vp
    @PurushothamReddy-ff6vp 4 months ago

    Hello, I'm having trouble when there are multiple lines within the same row: it treats them as new rows. How do I fix this? Thank you!

  • @robertdolovcak9860
    @robertdolovcak9860 9 months ago

    This is the first I've heard of PaddleOCR. It seems like a very good tool. I really appreciate the work you have done and would also like to try this. Can you please allow access to the Google Colab code for this?

    • @neuralearn
      @neuralearn 9 months ago

      Hello my dear Robert
      Please check your mail

  • @niroshiniedayaratne4066

    Thank you very much for this! Very insightful!

  • @henrydo9731
    @henrydo9731 10 months ago

    I have a question: if I have a table that spans 2 pages (half of it is on the 1st page and the other half on the 2nd), how could I solve this problem?

  • @siddharthpatel2193
    @siddharthpatel2193 1 year ago

    Can I get code? I followed video and wrote code and everything is working but due to some issue, out_array at end is same value.
    Update: Solved
    Thanks, this is best tutorial on this topic (saying this after going through countless tutorials, research papers and blogs in past 3 months).

  • @snehitdua153
    @snehitdua153 1 year ago

    I'm getting error in loading the model
    ValueError: (InvalidArgument) Device id must be less than GPU count, but received id is: 0. GPU count is: 0.
    [Hint: Expected id < GetGPUDeviceCount(), but received id:0 >= GetGPUDeviceCount():0.] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:242)

  • @manoubilahbib2572
    @manoubilahbib2572 1 month ago

    That was awesome, thanks

  • @bharattyagi1405
    @bharattyagi1405 1 month ago

    Hi, could you please provide access to the Colab notebook?

  • @RohitSharma-to7yy
    @RohitSharma-to7yy 1 year ago

    Hi. The content is very impressive. Would love to see the notebook and add upon this to create table in google docs instead. Please share the notebook

  • @moez.mazhar
    @moez.mazhar 1 year ago +1

    Hi, I've followed your procedure as is but I'm getting "ValueError: Can't convert Python sequence with mixed types to Tensor." on the Non-Max Suppression portion. Can you tell me what might be causing that please?

  • @kenjeroldarellano4617

    Hi, Neuralearn, Thanks for creating a very useful tutorial. Can you please provide notebook access for my study?

  • @pokopiko429
    @pokopiko429 1 year ago

    Congrats, one of the best videos I've seen on this topic! Could you please grant me access to the Google Colab?

    • @neuralearn
      @neuralearn 1 year ago

      After requesting access, please check your mail inbox or spam folder.

  • @dinaharan0213
    @dinaharan0213 1 year ago +1

    Hi, I installed paddlepaddle instead of paddlepaddle-gpu because I don't have a GPU in my local system. I'm getting "AttributeError: module 'numpy' has no attribute 'int'".
    Is it possible to run this project on a local system without a GPU?

    • @edwinjoe6044
      @edwinjoe6044 1 year ago

      I'm facing this error too...☹

    • @neuralearn
      @neuralearn 1 year ago +1

      Hello my dear Dinaharan, here is a notebook which works for cpu runtime: colab.research.google.com/drive/1vZHrahaaubhWMz83jlPuvA1na_v98fUP

    • @dinaharan0213
      @dinaharan0213 1 year ago

      Hi, I am very happy to get your reply and grateful for your help. I am glad to have a YouTuber like you. I really appreciate your efforts for your subscribers. Thank you very much. 🤗😇👏👏👏

    • @neuralearn
      @neuralearn 1 year ago +1

      My pleasure :)

  • @khushibaghel220
    @khushibaghel220 8 months ago

    Hey! I want to try out your tutorial. Could you please give access of your notebook

  • @aishwaryachowta6598
    @aishwaryachowta6598 1 year ago

    Thank you for the tutorial !!!

  • @kikigaming4595
    @kikigaming4595 1 year ago

    How do I install layout parser? The GitHub repository no longer seems to have any file such as layout parser.

  • @malakkhiari1419
    @malakkhiari1419 1 year ago +1

    How to get access to your notebook?

  • @EarningsApps
    @EarningsApps 1 year ago

    pls give access to notebook ...great and informative tutorial !!

  • @statosys
    @statosys 6 months ago

    Request access for colab notebook, thank you so much.

  • @rrrsranjith
    @rrrsranjith 1 year ago

    Excellent video 🔥

  • @aishwaryadinesh7641
    @aishwaryadinesh7641 1 year ago

    Hi, I'm getting this error - (External) CUDA error(100), no CUDA-capable device is detected.
    [Hint: 'cudaErrorNoDevice'. This indicates that no CUDA-capable devices were detected by the installed CUDA driver. ] (at /paddle/paddle/phi/backends/gpu/cuda/cuda_info.cc:66).
    Can you help me out w this please?

  • @francescodimartino8896
    @francescodimartino8896 10 months ago

    Amazing job! Could you please share the Google Colab with me? 🙏

  • @cissemy
    @cissemy 1 year ago

    Great.
    Is it possible to use this model for matrix recognition ? how many rows and columns, elements of matrix ?

  • @pratikmore4044
    @pratikmore4044 1 year ago

    I am getting the following error and not sure how can I resolve this:
    Error: Can not import paddle core while this file exists: /usr/local/lib/python3.10/dist-packages/paddle/fluid/libpaddle.so
    Tried reinstalling paddlepaddle but that didn't work.

  • @user-iw5nd5pw4c
    @user-iw5nd5pw4c 1 year ago

    Bro you're doing good work

    • @neuralearn
      @neuralearn 1 year ago

      Thanks for the kind words :)

    • @user-iw5nd5pw4c
      @user-iw5nd5pw4c 1 year ago

      @neuralearn I have a question: I have a PDF file that is 560 pages long. Other libraries do convert its data into an Excel file, but the result is garbage. If I use this model, will I be able to convert it?

    • @neuralearn
      @neuralearn 1 year ago

      I think you should just go ahead and try. It's free :)

  • @mkthedawn
    @mkthedawn 1 year ago

    Awesome 👍👍👍

  • @harshithprakash2433
    @harshithprakash2433 1 year ago

    Awesome video and interesting approach towards the problem , would you mind giving me access to that notebook..?

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Harshith, please check your mail :)

  • @nealgilmore337
    @nealgilmore337 1 year ago

    Hello @neuralearn - love the demo! Can you provide me access to the Colab?

  • @user-kl2pc6td5h
    @user-kl2pc6td5h 1 year ago

    Hello, thanks for sharing this video. Will this method work for nested tables?

  • @balasubramaniyang6506

    Hi, nice explanation. Can you provide access? It's very helpful for us.

  • @anirbanghorai3699
    @anirbanghorai3699 2 years ago

    EXCELLENT!
    CAN YOU PLS POST A VIDEO ON Paddle OCR custom training (both detection +recognition)steps? I have my own data ..want to do a transfer learning

    • @neuralearn
      @neuralearn 2 years ago +2

      We are glad this was helpful :)
      We shall work on that and publish as soon as possible!

    • @anirbanghorai3699
      @anirbanghorai3699 2 years ago

      @neuralearn glad you responded..waiting for the custom training video

  • @therafee
    @therafee 10 months ago

    Why do we need to clone paddle repository at 15:57

  • @SiddheshBalashetwar
    @SiddheshBalashetwar 1 year ago

    Hello, do you have any idea about packaging PaddleOCR? I'm trying to make an exe of my code but I keep facing errors. Any help would be appreciated.

  • @IsratjahanFateha9106
    @IsratjahanFateha9106 9 months ago

    Can I have the access of your Colab Notebook please? I have requested for the access yesterday

    • @neuralearn
      @neuralearn 9 months ago

      Hi,
      check your mail box or spam

  • @tommy-dz1yg
    @tommy-dz1yg 2 years ago

    amazing vid!!!!

    • @neuralearn
      @neuralearn 2 years ago

      Glad you enjoyed it :)
      More on the way!!!

  • @ss_d25
    @ss_d25 9 months ago

    Hi, great video. Can you please provide access to this notebook? Thanks a lot in advance.

    • @neuralearn
      @neuralearn 9 months ago

      Hi,
      check your mail box or spam

  • @NileshKumar-ug1hl
    @NileshKumar-ug1hl 1 year ago

    Hi, Can you please provide the notebook access?

  • @etarhunisuhaib2031
    @etarhunisuhaib2031 1 year ago

    Thanks for this video. Let's say we have a page with free text and tables. Once we have our tables, how can we extract the remaining text? When I'm using the parser, it also extracts the table text from the page. I want to use your approach for tables and extract only the remaining text.

    • @Ankur-be7dz
      @Ankur-be7dz 1 year ago

      for only extracting texts use pdfminer

    • @andrewlachance2062
      @andrewlachance2062 8 months ago

      just match the consecutive text from the table and parse the PDFs skipping over the text
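
The matching idea in this reply could look roughly like the sketch below. It assumes the page text and the table text are already available as lists of tokens in reading order, and that the table's tokens appear on the page as one unbroken run. strip_table_run is a hypothetical name, and real OCR output would likely need fuzzier matching than exact equality.

```python
def strip_table_run(page_tokens, table_tokens):
    """Return page_tokens with every exact run of table_tokens removed."""
    run_length = len(table_tokens)
    if run_length == 0:
        return list(page_tokens)

    remaining = []
    position = 0
    while position < len(page_tokens):
        window = page_tokens[position:position + run_length]
        if window == table_tokens:
            # Skip over the table's text entirely.
            position += run_length
        else:
            remaining.append(page_tokens[position])
            position += 1
    return remaining


if __name__ == "__main__":
    page = "Intro text Name Age Bob 30 closing text".split()
    table = "Name Age Bob 30".split()
    print(" ".join(strip_table_run(page, table)))
```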

  • @adillaanam4058
    @adillaanam4058 1 year ago

    hi! tysm for the video. would you pls allow access to the notebook? ty!!

  • @beratoren7627
    @beratoren7627 1 year ago

    This was an amazing tutorial ! I really want to try and further tweak this. Can you please grant me access to the Google Colab Code?

    • @neuralearn
      @neuralearn 1 year ago

      Hello please check your mail inbox or spam

  • @youssefmouknii5033
    @youssefmouknii5033 1 year ago

    Thank you so much for this video. Could you please allow access to the Google Colab?

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Youssef glad this video is helpful :)
      Please check your mail inbox or spam

  • @user-mb9uo1sg7s
    @user-mb9uo1sg7s 1 year ago

    Very informative tutorial. I really appreciate the work you have done with this code. I also want to try this. Can you please allow access to the Google Colab code for this?

    • @neuralearn
      @neuralearn 1 year ago

      hello my dear Adil, Please check your mail :)

  • @ayushbansal999
    @ayushbansal999 1 year ago

    Hi, please could you provide me with the access to this colab notebook

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Ayush,
      Please check your mail inbox or spam

  • @user-es3rp4lz6m
    @user-es3rp4lz6m 1 year ago

    thank you for the explanation @Neuralearn , can u please provide me access to the colab ?

    • @neuralearn
      @neuralearn 1 year ago

      Please check your mail inbox or spam :)

  • @kanakjaiswal136
    @kanakjaiswal136 1 year ago

    It was excellently explained. I wanted to try it out but got many errors. So, Could you please grant me access to the google Colab code?

  • @snehitdua153
    @snehitdua153 1 year ago

    Hey, can you please provide the link for the pdf used in the video?
    Thanks

  • @ajaychinni3148
    @ajaychinni3148 11 months ago

    Please approve the access request for the Google Colab notebook. I am very interested in the code.

  • @manojaar2008
    @manojaar2008 1 year ago

    Super!!!

  • @sameerdeshmukh1527
    @sameerdeshmukh1527 1 year ago

    Thank you. Please can you grant me access to notebook?

  • @frekin31
    @frekin31 1 year ago

    Thank you so much for your tutorial! Can you please grant me access to the Google Colab Code?

    • @neuralearn
      @neuralearn 1 year ago

      Hello,
      Please check your mail inbox or spam :)

  • @kibtiachowdhury6011
    @kibtiachowdhury6011 1 year ago

    Hi, I want to get only paragraph text without any figure and table from any type pdf. How can I solve this?

    • @neuralearn
      @neuralearn 1 year ago

      You can pick text by changing [if l.type == 'Table':] to [if l.type == 'Text':]
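
A minimal sketch of that filter. The Block class below is a stand-in invented for illustration so the example runs without layoutparser installed. The only thing taken from the reply is that each detected block carries a type label such as 'Table' or 'Text'. blocks_of_type is a hypothetical helper name.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Stand-in for a detected layout block (hypothetical, not layoutparser)."""
    type: str
    text: str


def blocks_of_type(layout, wanted_type):
    """Keep only the blocks whose type label equals wanted_type."""
    return [block for block in layout if block.type == wanted_type]


if __name__ == "__main__":
    layout = [
        Block("Title", "Results"),
        Block("Text", "We evaluate on three datasets."),
        Block("Table", "Model | Score"),
        Block("Text", "Scores are averaged over five runs."),
    ]
    # Switching 'Table' to 'Text' selects paragraphs instead of tables.
    for block in blocks_of_type(layout, "Text"):
        print(block.text)
```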

  • @revanthkumar3406
    @revanthkumar3406 1 year ago

    Hey, Really Great Video ❤, can u provide access to notebook

    • @neuralearn
      @neuralearn 1 year ago +1

      Hello my dear Kumar,
      Please check your mail inbox or spam

  • @xavier6649
    @xavier6649 1 year ago

    Hey Great Work , can you give access to your Colab Drive ?
    Thanks

  • @KartikSharma-hd7rd
    @KartikSharma-hd7rd 1 year ago

    Excellent tutorial, can you please grant access to the Google Colab notebook :)

  • @user-jc2ot4tk7y
    @user-jc2ot4tk7y 8 months ago

    Very informative video. Can you please share the code with me ? It would be very helpful.

  • @chafikhermouche5136
    @chafikhermouche5136 1 year ago

    Hello, thank you for the tutorial !! Can I get the code please ??

  • @therafee
    @therafee 10 months ago

    @neuralearn Hello, could you tell me where the test.pdf file is? I have access to the notebook but it throws an error.
    I got:
    PDFPageCountError: Unable to get page count.
    I/O Error: Couldn't open file '/content/bahdanau attention.pdf': No such file or directory

  • @snehalvats382
    @snehalvats382 1 year ago

    Hey there! it is a wonderful video on how to work with ocr and table. i have requested for notebook access could you please provide me with the access? thank you once again for this tutorial

    • @neuralearn
      @neuralearn 1 year ago +1

      hello my dear Snehal, please check your mail :)

    • @snehalvats382
      @snehalvats382 1 year ago

      @neuralearn Dear team, I have not yet received the confirmation. It's the same email as the one I'm replying with.

  • @dishaparmar2609
    @dishaparmar2609 8 months ago

    amazing video..! very helpful ..! could you please provide source code?

  • @stevevu8654
    @stevevu8654 1 year ago

    it's fascinating. would you mind giving me the access to the colab code?

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Steve. Please check your mail :)

  • @walkwithus6536
    @walkwithus6536 1 year ago

    Hi, if we have multiple tables (huge tables) then this method will work?

    • @neuralearn
      @neuralearn 1 year ago

      Yes, it should work. I think it's best to try it for yourself :)

  • @PrashantKumar-nb5ig
    @PrashantKumar-nb5ig 1 year ago

    Maybe adding download links would have been more helpful.

  • @ramyas9837
    @ramyas9837 8 months ago

    which python version ?

  • @youseffarouk6189
    @youseffarouk6189 1 year ago

    how can i use paddle ocr for receipts ?

  • @sudeshkumar5600
    @sudeshkumar5600 1 year ago

    Hi, this is very interesting to me. I really want to try this out. Could you please grant me access to the Google Colab code?

  • @amilaviraj1014
    @amilaviraj1014 1 year ago

    This is a very informative tutorial! Could you please give me access to the Google Colab code?

    • @neuralearn
      @neuralearn 1 year ago

      Hi my dear Amila
      Please check your mail inbox or spam :)

  • @AshishGupta-bd6hu
    @AshishGupta-bd6hu 1 year ago

    I get "Device id must be less than GPU count, but received id is: 0. GPU count is: 0" when I run model.detect(image). What does it mean?

    • @AshishGupta-bd6hu
      @AshishGupta-bd6hu 1 year ago

      I am running this on my local machine

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Ashish, try out this notebook: colab.research.google.com/drive/1vZHrahaaubhWMz83jlPuvA1na_v98fUP

    • @AshishGupta-bd6hu
      @AshishGupta-bd6hu 1 year ago

      @neuralearn Thanks for your response, I have sent you an access request.

  • @user-fd5qh2wt9d
    @user-fd5qh2wt9d 9 months ago

    This tutorial is very helpful and informative . Can you share this code with me ?

    • @neuralearn
      @neuralearn 9 months ago

      Hi,
      check your mail box or spam

  • @rupakjha539
    @rupakjha539 10 months ago

    Hi Neuralearn team, can u please provide me the google colab code access

  • @josephebenezer8869
    @josephebenezer8869 1 year ago

    Hi, could you grant me access to the notebook please?

  • @AniketRana-iz5ms
    @AniketRana-iz5ms 1 year ago

    Hi, I'm not able to run this code in a Jupyter notebook. Could you help me run it on a local system? What is the procedure for that?

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Rana, what issues did you face, while running the code locally?

    • @AniketRana-iz5ms
      @AniketRana-iz5ms 1 year ago

      @neuralearn It's asking for a GPU as a requirement, but I want to run this code in a Jupyter notebook on CPU.

  • @Sara-fp1zw
    @Sara-fp1zw 1 year ago

    can you please give me the access to notebook?

  • @hussainahmedsiddiqui3742

    Amazing tutorial, is this code available for use? I would appreciate it!

  • @waqaskhan2165
    @waqaskhan2165 1 year ago

    This was an amazing tutorial ! I really want to try and further tweak this. Can you please grant me access to the Google Colab Code? please reply , respect from karachi pakistan

    • @neuralearn
      @neuralearn 1 year ago

      Hello my dear Karachi, glad this was helpful.
      Please check your mail inbox or spam :)

  • @maniekm2808
    @maniekm2808 1 year ago

    Hi @neuralearn love the tutorial! Can you provide me access to the code?

    • @neuralearn
      @neuralearn 1 year ago

      After requesting access, please check your mail inbox or spam folder.

    • @maniekm2808
      @maniekm2808 1 year ago

      Thank you, I checked but didn't get anything.

  • @pmshadow
    @pmshadow 1 year ago

    hi, excellent video!!! can you please give me access to the notebook? I have requested right now :) Thanks in advance!!

    • @neuralearn
      @neuralearn 1 year ago

      Done!

    • @pmshadow
      @pmshadow 1 year ago

      @neuralearn thank you very much!

    • @neuralearn
      @neuralearn 1 year ago

      You're welcome :)

    • @pmshadow
      @pmshadow 1 year ago

      @neuralearn I checked just now, but it was not available for me. Can you please double check? Thanks!

    • @neuralearn
      @neuralearn 1 year ago

      Please check your inbox, we sent you a mail

  • @kinetic_kane9033
    @kinetic_kane9033 1 year ago

    Hello can I please get viewing access to the colab notebook?

    • @neuralearn
      @neuralearn 1 year ago

      Hello Kane, please request access and check your mail in 5 minutes

  • @googlecloudguru224
    @googlecloudguru224 1 year ago

    Please provide access to this notebook

  • @jyo8507
    @jyo8507 1 year ago

    What an amazing work. This will be a great tutorial for me in this area of work. I’m trying to access google colab notebook. Could you please grant me permission to access google colab notebook?

    • @neuralearn
      @neuralearn 1 year ago

      Glad it was helpful!
      Please check your mail

    • @jyo8507
      @jyo8507 1 year ago

      @neuralearn I am not able to access the notebook. I requested access.

    • @neuralearn
      @neuralearn 1 year ago

      Please check your mail again

    • @jyo8507
      @jyo8507 1 year ago

      Thanks a lot

    • @neuralearn
      @neuralearn 1 year ago

      You're welcome

  • @tapendrabaduwal7423
    @tapendrabaduwal7423 1 year ago

    Hello, if there is an unstructured table that is not a regular n*m grid of cells, will this method work?

    • @neuralearn
      @neuralearn 1 year ago +1

      It depends on the table in question. Nonetheless, you can always modify this method to suit your specific table

    • @tapendrabaduwal7423
      @tapendrabaduwal7423 1 year ago

      @neuralearn Is it possible to handle all types of tables in one shot?

    • @neuralearn
      @neuralearn 1 year ago +1

      No, it's not possible!

    • @tapendrabaduwal7423
      @tapendrabaduwal7423 1 year ago

      @neuralearn Thank you so much

    • @neuralearn
      @neuralearn 1 year ago +1

      You're welcome :)