A newsletter reader asked, “How can I make a PDF file searchable?” I thought others might have the same question, so I will reply here in the newsletter so that everyone can benefit. Also, anyone with additional methods is invited to post comments at the end of this article.
A searchable PDF is a PDF file that lets you both search for keywords in the text and use copy/paste to extract text from the PDF. Unfortunately, many PDF files are not searchable. Instead, they are simple images of an original document.
In short, a PDF file usually can be made searchable ONLY when it is created. There are some exceptions, however, and I will describe them at the end of this article.
If the file is searchable, the reader typically can press Control-F on Windows or Linux or Command-F on Macintosh. A small search window will appear, and the user can enter the word or phrase that he or she wishes to find. If the PDF creator did not make the file searchable, a search with Control-F or Command-F will never find anything.
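For readers comfortable with a little programming, here is a rough Python sketch of a way to guess whether a PDF has a text layer without opening it in a viewer. It rests on an assumption of mine, not on anything from Adobe: a PDF that contains real text normally references at least one font, while a PDF made only of scanned images normally does not. The function name looks_searchable is my own invention, and the check is only a heuristic. It will not work on encrypted files, and it ignores compression methods other than the common Flate (zlib) method.

```python
import re
import zlib

# Captures the raw body of every stream object in the PDF bytes.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def looks_searchable(pdf_bytes: bytes) -> bool:
    """Heuristic guess: True if the PDF appears to reference a font."""
    # Uncompressed objects: a font dictionary shows up as plain "/Font".
    if b"/Font" in pdf_bytes:
        return True

    # Newer PDFs may pack their objects inside Flate-compressed streams,
    # so try to inflate each stream and look again.
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            data = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (for example, a JPEG image); skip it
        if b"/Font" in data:
            return True

    return False
```

To try it on a file of your own, read the file as bytes first, for example looks_searchable(open("scan.pdf", "rb").read()), where scan.pdf is a placeholder name. A result of False suggests the file is an image-only PDF that Control-F or Command-F cannot search.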
The normal method to make the file searchable at the time of creation is to use Adobe Acrobat Pro or Adobe Acrobat Pro DC (but not the simpler and free Adobe Acrobat Reader) and follow the steps described in an article by Tammy Clevenger on the Techwalla web site at: http://bit.ly/2rsJriI.
Once completed and saved, the PDF file will be searchable.
If you have downloaded a non-searchable PDF file created by someone else, you can try any of the suggested methods at the end of this article.
Of course, Adobe isn’t the only vendor to make programs that create PDF files. There are other PDF creation programs available from a number of vendors, and each of them may or may not have the option to create searchable PDF files. The simpler (usually free or cheap) PDF creation programs typically do not have an option for creating searchable PDFs.
In addition, there are a number of programs that will convert any PDF file into other formats, such as a conversion to Microsoft Word’s .DOC or .DOCX formats or Excel spreadsheets or other formats as well. Once converted, you should be able to search for text by using the newly-created file along with Word, Excel, or any other compatible programs. However, the conversion programs typically will not create searchable PDF files.
There are a few programs or services that are “backdoor tricks” that will convert an existing, non-searchable PDF file into a searchable PDF without using Adobe Acrobat Pro or Adobe Acrobat Pro DC:
- My favorite method of converting a PDF into a searchable version is to use the FREE cloud-based Searchable PDFs application at http://www.searchablepdfs.org. However, it is free only for PDF documents of ten pages or fewer, and the file must be less than 5 megabytes in size. For larger documents, you probably will need to pay for a service or software with the needed capabilities.
DISCLAIMER: I haven’t tested the following methods, so I am not aware of the various advantages and disadvantages of each method. However, each of these claims to be able to convert non-searchable PDF files to their searchable equivalents.
- Create searchable PDF files with a ScanSnap scanner as described at: http://scansnapcommunity.com/tips-tricks/429-how-to-create-searchable-pdf-files-with-scansnap.
- Use FileTime, a free or inexpensive program described at: https://filetime.com.
- Windows users can purchase PrimeOCR, which is described at: http://www.primerecognition.com/prime_ocr.htm. PrimeOCR costs between $4,600 and $8,000 per PC. No, that is not a typo. That is four thousand U.S. dollars or more. PrimeOCR obviously is aimed at corporations that need to convert thousands of such documents and therefore can justify the high price.
- Windows users can purchase a $299 program called Image to PDF Converter as described at: http://www.verypdf.com/app/image-to-pdf-ocr-converter.
Several more methods of creating searchable PDF files may be found by starting at: https://duckduckgo.com/?q=create+searchable+pdf&t=hz&ia=web.