23 November 2009
Free Online OCR
I have been a fan of optical character recognition since the early days of the technology. I remember about 15 years ago using a hand-held scanner to try to digitize my files of journal articles and magazine clippings. The process required a lot of "cleanup" and ultimately proved not worth the time.
So it was with some interest that I saw a notice for "free online OCR": "Whether you have a scanned document or a photo, NewOCR.com can analyze the text in any image file that you upload, and then convert the text from the image into text that you can easily edit on your computer." I bookmarked the site and used it several days later when I encountered the story above in a PDF file. I took a screenshot of the page and uploaded the image to the free online OCR site. This was how the image was rendered...
Click to enlarge both for comparison, but the rendering needs work, shall we say. I suppose you get what you pay for.
In all fairness, certain fonts are intrinsically hard for OCR to interpret, and the test image had poor black/white contrast. The site will likely do better with other challenges.