Read PDF text and store in Excel using UiPath

  • Added on 3 Aug 2024
  • In this tutorial video, we use UiPath to read data from a PDF invoice and store it in an Excel sheet. After extracting the data from the PDF file into a text file, we use regular expressions to find the specific data.
    To learn more about regular expressions, we recommend the following URL:
    regexone.com/
    If you want to use the same PDF file to follow the steps, you can download it from the following link:
    drive.google.com/file/d/16wF4...
    If you want to know more about Robotic Process Automation and how it can help you automate your business processes, please visit our website:
    www.aakarsoft.com
    #RPA #UiPath #regexp
  • Science & Technology

Comments • 29

  • @elavarasik3199 1 year ago +1

    Thank you, well explained in simple terms. Easy to understand 👍🏻

  • @irinatutaeva7113 1 year ago +1

    Thank you, your explanation helped me a lot!!!🤩

  • @ahmedhelal920 6 days ago

    I have many PDFs, but only the first row appears; the other data does not appear in its rows.

  • @sebbsz 6 months ago

    Hello! I get the following error while trying to debug the process: "Add Data Row: Object reference not set to an instance of an object." What should I do? I've put the same thing in "ArrayRow".

  • @rohitsharma9755 6 months ago

    But if we have multiple PDF files, can we use this method? Please help me: I have multiple files, and I want to extract data where the PDFs contain both structured and unstructured data.

  • @elavarasik3199 1 year ago

    Hi, thanks for the clear explanation. Can you explain how to extract multiple words for a single field? For example, the address here contains 3 words (separated by 2 spaces); using \w brings up only the first part.

    • @AakarsoftTechnologies 1 year ago

      Hi, please check the following regex. We hope it helps:
      regexstorm.net/tester?p=%28%3f%3c%3dAddress%3a%5cs%29.%2b&i=Address%3a+B-16%2f102+Jaydeep+Apartment%0d%0aMira+Road+East
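The pattern in that link decodes to `(?<=Address:\s).+`: a lookbehind anchored on the label, followed by `.+`, which takes the rest of the line instead of a single `\w+` word. Below is a minimal Python sketch of the same pattern. The video itself uses UiPath's .NET regex engine, so this is only an analogue; the function name is hypothetical, and the sample text is the one embedded in the link, with a plain newline.

```python
import re

# Pattern decoded from the regexstorm link above:
# everything after "Address: " up to the end of that line.
ADDRESS_PATTERN = re.compile(r"(?<=Address:\s).+")


def extract_address_line(text: str) -> str:
    """Return the rest of the line following 'Address: ', or '' if absent."""
    match = ADDRESS_PATTERN.search(text)
    return match.group(0) if match else ""


sample = "Address: B-16/102 Jaydeep Apartment\nMira Road East"
print(extract_address_line(sample))  # B-16/102 Jaydeep Apartment
```

Because `.` stops at the newline, only the first address line is returned; the second line ("Mira Road East") would need a separate pattern or a multi-line variant.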

  • @user-fu2cl6pz9m 1 year ago

    Thanks for the detail, it's very useful. Now I have a case where I need to run this on multiple PDF files in one folder. What should I do? Thanks.

    • @AakarsoftTechnologies 1 year ago

      @myrpa 3: We are glad that you like it. To read all the PDF files, you need to list the directory and use a loop in UiPath. We recommend checking the following link:
      jd-bots.com/2021/04/30/get-all-files-in-a-directory-or-folder-using-uipath-studio/

    • @user-fu2cl6pz9m 1 year ago

      @AakarsoftTechnologies Thanks for this reference, but I'm still having trouble applying the case in this video when looping over the PDFs in the same folder.

  • @TheDasni 2 years ago +2

    How do I extract multiple PDFs and read the multiple text files into one Excel file? I kindly need your help, thank you.

    • @AakarsoftTechnologies 2 years ago

      Dear Ronald,
      To read multiple files, keep all your files in the same directory. After that, read all the files from the directory using the Directory.GetFiles("Your Directory Path") function. Now, loop the whole process over those files.
      We recommend checking Susana's answer at the following URL for more clarity:
      forum.uipath.com/t/read-all-pdf-files-from-folder/14799
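The "list the folder, then loop" idea from this reply can be sketched in Python as a rough analogue of Directory.GetFiles plus a For Each loop. This is not UiPath code; both function names and the `process_one` callback (standing in for the single-PDF workflow from the video) are hypothetical.

```python
from pathlib import Path
from typing import Callable, List


def list_pdf_files(directory: str) -> List[str]:
    """Return the paths of the PDF files directly inside `directory`, sorted."""
    folder = Path(directory)
    return sorted(
        str(p) for p in folder.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )


def process_all(directory: str, process_one: Callable[[str], object]) -> list:
    """Run `process_one` on every PDF in the folder and collect the results."""
    return [process_one(path) for path in list_pdf_files(directory)]
```

Each call to `process_one` would return the extracted fields for one invoice, so the collected list holds one entry per PDF.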

  • @abdulhameedubayathula5453

    How do I go to the next row in Excel if we have more than one invoice?

    • @AakarsoftTechnologies 2 years ago +1

      We recommend first adding all rows to a DataTable by looping the Add Data Row activity. After that, pass the DataTable instance to the Write Range activity.
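The "collect every row first, write once at the end" pattern from this reply can be illustrated in Python. This is a sketch under assumptions: a list of rows stands in for the DataTable, CSV output stands in for the Excel Write Range step, and the column names and invoice values are made up for the example.

```python
import csv
import io


def build_rows(invoices, columns):
    """Accumulate one row per invoice (the 'Add Data Row in a loop' step)."""
    rows = []
    for invoice in invoices:
        # Missing fields become empty cells instead of raising an error.
        rows.append([invoice.get(col, "") for col in columns])
    return rows


def write_table(rows, columns, out):
    """Write the header and all rows in one go (the 'Write Range' step)."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)


# Hypothetical sample data.
columns = ["InvoiceNumber", "Name"]
invoices = [
    {"InvoiceNumber": "INV-001", "Name": "Alice"},
    {"InvoiceNumber": "INV-002", "Name": "Bob"},
]
buffer = io.StringIO()
write_table(build_rows(invoices, columns), columns, buffer)
print(buffer.getvalue())
```

Writing once after the loop means each invoice lands on its own row, rather than every invoice overwriting the first row.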

  • @SagarBR-yc8ju 1 year ago

    I have a scenario where the PDF invoice sometimes carries no value for some fields; in that case I get the error message 'object reference not set to an instance'. I would like a solution from you on how to mitigate this error, so that the output is blank when the field is empty and is the value when one exists on the invoice.
    Example: If the Purchase Order number field is blank on the invoice, then the output should be blank in Excel.

    • @AakarsoftTechnologies 1 year ago +1

      Hi,
      As per our understanding, you are getting this error because the variable holds a NULL value. We suggest checking the variable for NULL after extracting the data with the regular expression. If it is NULL, assign an empty string (e.g. var = "") and then write it to the data table.
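The null check described in this reply looks like this in a Python sketch, where a failed regex search returns `None` (the rough counterpart of a NULL match result in .NET). The helper name, the patterns, and the sample invoice text are all hypothetical.

```python
import re


def extract_or_blank(pattern: str, text: str) -> str:
    """Return the matched text, or an empty string when the field is missing."""
    match = re.search(pattern, text)
    # Guard against the "no match" case before reading the value.
    return match.group(0) if match is not None else ""


# Hypothetical invoice text in which the Purchase Order field is empty.
invoice_text = "Invoice Number: 1001\nPurchase Order:\n"

invoice_number = extract_or_blank(r"(?<=Invoice Number: )\S+", invoice_text)
purchase_order = extract_or_blank(r"(?<=Purchase Order: )\S+", invoice_text)
```

Here `invoice_number` holds the value from the text and `purchase_order` is an empty string, so the row can still be written with a blank cell.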

  • @ramshivareddy68 1 year ago

    Instead of the required output, I'm getting (System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match]). How do I rectify it?

    • @AakarsoftTechnologies 1 year ago

      Dear Ramshiva,
      Without knowing all the details, it would be difficult for us to figure out the problem. Still, we would recommend you to check the following post.
      forum.uipath.com/t/regex-output-in-matches-box/964/7

    • @ramshivareddy68 1 year ago

      @AakarsoftTechnologies I need to extract the data from the PDF to Excel. I have completed the workflow design with no errors, but after execution, when I open the Excel file, the output under the header (Name) shows as (System.Linq.Enumerable+d__97`1[System.Text.RegularExpressions.Match]). How do I rectify this? Please suggest.

    • @AakarsoftTechnologies 1 year ago

      Dear Ramshiva,
      Please visit the previously shared forum link. People have tried to answer and give some solutions. Hope you will get some solution.

    • @hemanthsibbala6697 1 year ago +1

      Hi Ram, use variable(0) to get the text, e.g. invoiceNumber(0). You might be missing the (0); please check.
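The type name in the reported output suggests that the whole collection of matches was written to the cell rather than the text of one match, which is why indexing with (0) helps. A Python sketch of the same distinction follows; it is an analogue and not UiPath code, and the sample text and variable names are hypothetical.

```python
import re

# Hypothetical extracted PDF text.
text = "Name: Ramesh\nInvoice Number: 1001"

# A regex search that can yield several matches returns a collection.
matches = list(re.finditer(r"(?<=Name: )\w+", text))

# Converting the collection itself to a string gives a description of
# match objects, not the matched text.
as_collection = str(matches)

# Taking the first element and reading its value gives the text;
# this corresponds to invoiceNumber(0) in the reply above.
first_name = matches[0].group(0) if matches else ""
```

Checking that the collection is not empty before indexing also avoids an error when the field is absent from the PDF.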

  • @nivethar9178 1 year ago

    What should I type in DataRow?

    • @AakarsoftTechnologies 1 year ago +1

      In this particular tutorial, you need not add anything to DataRow, as we are handling all the data in a string array and passing the same. Just for your information, you need to pass the DataRow object if it is available.

    • @nivethar9178 1 year ago

      But without filling in DataRow I can't run the file, so please tell me what I should type in that particular column.

    • @AakarsoftTechnologies 1 year ago

      Please check that ArrayRow is passed to the DataTable in the correct format, as DataRow is optional.

  • @sandhanamurali444 5 months ago

    Same scenario, but how do I extract specific data if I have 10 PDF files?