How do I work with Adobe Acrobat PDF documents?
With the exception of Microsoft Word and Excel, Blue Prism does not distribute VBOs for specific applications with the product. However, here is some general advice about possible ways to access information contained in PDF documents. By its very nature, and as Adobe intended by design, the data in a PDF document is self-contained and not rendered into individual font characters - that is what makes it so 'portable'!
NOTE: There is an in-depth guide document available on the Blue Prism Portal under the Learning section called "Interacting with PDF Documents". See https://portal.blueprism.com/learning/guides
There are tools available from Adobe which will render PDF documents as text and pictures. These tools do not appear in all versions of the product, though. Some options have a cost or may require additional downloads. For example, to edit a PDF document you may need the Adobe Acrobat Pro version. Without this, text and pictures will not be accessible with the Acrobat product.
Reading a PDF file and converting what is essentially a 'picture' into a Blue Prism Collection is not a one-step process. Unlike other document formats, a PDF needs manipulation using either AA Mode or Region Mode spying, depending on how you need to interact with the information.
Interacting with Acrobat functions
Using Win32 mode you will have access only to the window frames that constitute the main areas of the document’s design. To access the PDF document’s functions, it is necessary to switch to the Active Accessibility (AA) mode of spying.
Keystrokes can also be sent to the application if the main window is spied and focused.
Extracting text information using the MS Word VBO
If you want to work with the text or pictures within the document itself, then this is only accessible through other object models such as MS Word or another rich text editor (Word has the capability to save as PDF format).
One method to work with text in a PDF document is to convert the document to Word format, and manipulate it there. From Word you will be able to put the data into a Blue Prism Collection Data Type using the MS Word VBO object and its Get Highlighted Text function.
To reverse the process, the MS Word VBO has a function to convert a document back into PDF format: ExportPDF.
Extracting text information by sending keystrokes
To send keystrokes it is necessary to spy the main Acrobat screen using the Win32 spy mode. Once this element has been recognised, functions such as Global Send Keys and Global Send Key Events become available to send keystrokes to the main window and therefore to access menu systems and their available functions.
If the PDF is digitally produced you can open it and send keystrokes to mimic the functions “Select All” and then “Copy”.
You can inspect the resultant stream of text on the clipboard and parse out any data you need. The success of this depends on the structure of the PDF.
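As an illustration of the parsing step, here is a minimal Python sketch that pulls "Label: value" pairs out of a block of copied text. The sample text, the field labels and the parse_fields helper are all hypothetical; real clipboard output depends entirely on how the PDF was produced, so the pattern would need adjusting for each document layout. Inside Blue Prism the same idea would normally be implemented with the product's own text and regular expression actions rather than Python.

```python
import re

# Hypothetical sample of text as it might arrive on the clipboard after
# "Select All" and "Copy"; real output depends on the PDF's structure.
SAMPLE_CLIPBOARD_TEXT = """ACME Ltd
Invoice Number: INV-00123
Invoice Date: 2020-03-14
Total Due:   1,250.00
Thank you for your business
"""

# Matches a single "Label: value" line. The label/value layout is an
# assumption made for this example, not something every PDF provides.
FIELD_PATTERN = re.compile(
    r"^\s*(?P<label>[A-Za-z ]+?)\s*:\s*(?P<value>.+?)\s*$"
)


def parse_fields(text):
    """Return a dict mapping each label to its value.

    Lines that do not look like 'Label: value' are ignored.
    """
    fields = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            fields[match.group("label")] = match.group("value")
    return fields


fields = parse_fields(SAMPLE_CLIPBOARD_TEXT)
for label, value in fields.items():
    print(f"{label} = {value}")
```

Lines without a colon, such as the company name and the closing message in the sample, are simply skipped, which is why the result depends so heavily on the structure of the copied text.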
Extracting text information using Surface Automation or Optical Character Recognition (OCR)
If the structure doesn’t support other methods, or it is a scanned document that is rendered as an image rather than a converted text document, then you need to use Surface Automation in conjunction with the Read Text action using the Tesseract OCR functionality.
To do so you need to define the regions (Region Spy Mode) of the PDF in order to read the text from it. Even then the quality of the character recognition will be affected if the scan is not straight (regions are out of line) or of poor quality (low DPI). You can train Tesseract to interpret characters and recognise non-character areas, but this would need to be done for each PDF format.
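Because low resolution is one of the main causes of poor recognition, it can help to check a scan before sending it to OCR. The Python sketch below estimates the effective DPI of a scanned page from its pixel width and its physical width. The function names and the 300 DPI threshold are assumptions for illustration (300 DPI is a commonly quoted guideline for Tesseract, not a Blue Prism setting), and the check is not part of any Blue Prism action.

```python
# Commonly quoted guideline for OCR input; treat it as an adjustable
# assumption rather than a hard limit.
MIN_RECOMMENDED_DPI = 300

# Width of a US Letter page in inches (used for the example calls only).
LETTER_WIDTH_INCHES = 8.5


def effective_dpi(pixel_width, page_width_inches):
    """Estimate scan resolution from image width and physical page width."""
    if page_width_inches <= 0:
        raise ValueError("page_width_inches must be positive")
    return pixel_width / page_width_inches


def is_scan_usable(pixel_width, page_width_inches,
                   min_dpi=MIN_RECOMMENDED_DPI):
    """Return True when the estimated DPI meets the chosen minimum."""
    return effective_dpi(pixel_width, page_width_inches) >= min_dpi


# A Letter page scanned 2550 pixels wide versus 1275 pixels wide.
print(is_scan_usable(2550, LETTER_WIDTH_INCHES))
print(is_scan_usable(1275, LETTER_WIDTH_INCHES))
```

A check like this only addresses resolution. A skewed scan would still need straightening before the regions line up with the text.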
Can't believe Blue Prism requires multiple steps to achieve a simple PDF to text file conversion. This can be achieved using a single stage/activity in UiPath.