Free and Nearly Free Online OCR Tools

Everyone loves free. That is especially true for OCR (Optical Character Recognition) programs that convert printed books and documents into machine-readable text. The best OCR programs cost money… usually a lot of money. Adobe Acrobat Pro DC costs about $400 while OmniPage 18 and ABBYY FineReader cost about $150 each. These are excellent, high-quality tools for those who can justify that expense. If I needed to scan thousands of pages of text, I would use one of those products. However, most of us only need to scan a few pages. Prices of $150 to $400 simply are not justified for smaller projects.

Several online OCR products are available free of charge and produce reasonably good results. Any of them might require a bit of manual clean-up once the conversion has been done, but such clean-up work seems reasonable given a price tag of FREE. Best of all, none of them require any software installation on your Windows or Macintosh computer.
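One common clean-up chore is rejoining words that were split by a hyphen at the end of a printed line. Here is a minimal Python sketch of how that step might be automated. The function name `rejoin_hyphenated` and the sample text are made up for illustration, and the rule it applies (lowercase letter, hyphen, line break, lowercase letter) is only a rough heuristic, not what any of the OCR services actually do.

```python
import re


def rejoin_hyphenated(text: str) -> str:
    """Join words that were split by a hyphen at the end of a line.

    Heuristic only: a lowercase letter, a hyphen, a line break, and
    another lowercase letter are treated as one word broken in two.
    """
    return re.sub(r"([a-z])-\s*\n\s*([a-z])", r"\1\2", text)


# Hypothetical OCR output with two words broken across lines.
sample = "The obit-\nuary was printed in the news-\npaper."
print(rejoin_hyphenated(sample))
```

Be aware that this will also strip the hyphen from a genuine compound such as "well-known" if it happens to break at the end of a line, so the result still deserves a quick proofread.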

The folks at the MakeUseOf web site recently conducted side-by-side tests of 4 different free online OCR products and described the results. The report, written by Rob Nightingale, says that “Free Online OCR was definitely the best free tool we tested. That being said, if you’re willing to pay $5 per month for near-perfection, ABBYY’s FineReader Online was slightly more accurate.”

You can read the report at http://www.makeuseof.com/tag/4-free-online-ocr-tools-put-ultimate-test.

I did notice one error in the report. Rob Nightingale claimed, “I decided to use Evernote’s Scannable app (Free on iOS and Android).” Indeed, there is a free Scannable product for iOS, but the Evernote web site and the Google Play Store both say that Scannable is not available for Android. There may be other inaccuracies in the article as well, but I believe the bottom-line test results are correct.

7 Comments

After reading Rob Nightingale’s comments about the results obtained using the ABBYY FineReader Online trial, I have to wonder how thoroughly he checked the results. For example, he says:
“Complex Document to PDF
Again, I couldn’t find any errors in this converted file. ABBYY obviously knows how to convert to PDF exceptionally well.”

The obvious question to me is: was the resulting PDF file just a *searchable* version of the originally submitted PDF file? If so, it’s hardly surprising that it appeared to be perfect. What should be done is to actually extract the *text* from the file to see how good that looks.

Speaking from experience, as an owner of ABBYY FineReader v12.


Omnipage has a version 19.0


David Paul Davenport March 2, 2016 at 2:41 pm

The real challenge for OCR is newspapers. Variable font styles and sizes make most digital versions of newspapers less than perfect.


Microsoft’s free Office Lens will take a photo of a document using a smartphone and save it as a file in several different formats of readable text. It will also align or straighten a page that was not photographed from directly above so that it has 90-degree corners. It is free and works with Windows Phone, Android and iPhone. There are also other free and paid OCR programs available for phones.


I tried three of the online OCR services mentioned in the cited article, as well as the OmniPage Windows program, to convert a short obituary to a text file. The obituary had been screen-captured as a JPG file from an online newspaper. It was in good but not pristine condition and consisted of 135 words in 10 lines of print. Results were as follows:

Free Online OCR – Took almost 3 minutes to reach 100% complete, then ended with a message that it is busy and I should try later. This happened in each of three attempts.

i2OCR – Converted in 11 seconds. The result was so full of errors and gibberish that it was unusable.

Online OCR – Converted in 4 seconds. Result had 3 errors plus 10 instances where hyphenated line endings were incorrectly handled.

Omnipage 16 (which is not the latest version of this program) – Converted quickly. Result had 9 errors plus 3 instances where hyphenated line endings were incorrectly handled.


I have about 60 pages of tables of cemetery census information. The pages of the PDF are clean, the data well tabbed, and the font crisp and consistent. Using a subset of 3 pages, I tried all the free or trial OCR programs with no success. There were far too many errors on each page to make these practical.

