The following is a Plus Edition article, written by and copyright by Dick Eastman.
Scanning a page from a book creates a picture of the page. However, a picture is not easily searchable. The scanned image is much like a photo taken with a digital camera: while the human eye can read it easily, the computer cannot "see" the words in the picture. A conversion process, called Optical Character Recognition, is required.
Optical Character Recognition, usually abbreviated to OCR, is the mechanical or electronic conversion of scanned images of handwritten, typewritten, or printed text into machine-encoded text. For this article, I will ignore handwritten text because that is a very different process with its own challenges. Most genealogists are concerned with converting typeset books to computer text that can be searched.
The OCR process is simple in theory. When a printed page of text is scanned, the scanner delivers an image of the text to OCR software running on the attached computer. The software then attempts to identify each letter of each word in the image and converts the result into an editable text document or whatever other format is needed.
Converting a picture of a word into the computer text equivalent of the same word is a much more complex process than one might think. If you are aware of the strengths and weaknesses of the conversion process, you can better understand the search process when looking for information. Knowing what works and what does not can lead to better search results.
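To make the point about searching concrete, here is a minimal sketch in Python. It is only an illustration, not how any particular genealogy site works. The sample sentence, the misreading of "Eastman" as "Eastrnan", the function name find_name, and the 0.75 similarity cutoff are all assumptions made up for this example. It uses only Python's standard difflib module to show why an exact search can miss a name that a more forgiving search would find.

```python
import difflib
import re


def find_name(ocr_text, name, cutoff=0.75):
    """Return words in ocr_text that closely resemble name.

    The cutoff is a hypothetical similarity threshold between 0 and 1;
    higher values demand a closer match.
    """
    # Split the OCR output into words made of letters and digits.
    words = re.findall(r"[A-Za-z0-9]+", ocr_text)
    # Compare in lowercase, but remember the original spelling.
    lowered = {w.lower(): w for w in words}
    matches = difflib.get_close_matches(
        name.lower(), list(lowered), n=5, cutoff=cutoff
    )
    return [lowered[m] for m in matches]


# Made-up OCR output in which "m" was misread as "rn".
sample = "John Eastrnan was born in 1842 in Bangor, Maine."

exact = "Eastman" in sample           # an exact search misses the name
fuzzy = find_name(sample, "Eastman")  # a tolerant search can still find it

print(exact)
print(fuzzy)
```

If a search for a surname comes up empty, trying a likely misreading of the name, or a search tool that tolerates small differences, may turn up the record.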
The remainder of this article is for Plus Edition subscribers only. SUBSCRIBE NOW to read this article.
If you have a Plus Edition user ID and password, you can read the full article right now at no additional charge in this web site's Plus Edition at http://eogn.com/wp/?p=23014. This article will remain online for several weeks.
If you do not remember your Plus Edition user ID or password, you can retrieve them by going to http://www.eogn.com/wp/ and clicking on "Forgot password?"
If you decide to subscribe to the Plus Edition right now, you will be able to immediately read this article online.
For more information about subscribing to the Plus Edition of Eastman's Online Genealogy Newsletter, visit http://blog.eogn.com/eastmans_online_genealogy/plusedition.html.
