reCAPTCHA, another of the internet’s most innovative projects, is now in Google’s grasp.
Not only is reCAPTCHA an effective defense against spam, it also serves a second mission: converting a large volume of printed literature to text.
While most such services generate distorted letters programmatically, making them difficult for anyone but a human to discern, reCAPTCHA goes about it a different way.
With reCAPTCHA, every time you prove that you are a human, you are effectively helping to digitize printed documents. Rather than relying on ever better algorithms to generate distortions that only humans can read, reCAPTCHA uses portions of scanned documents that the OCR (Optical Character Recognition) algorithms used to digitize them failed to recognize. Humans can often read these words easily.
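As a rough illustration of how human answers for one OCR-failed word might be turned into a usable transcription, here is a minimal sketch. It is not reCAPTCHA's actual code: the function name `resolve_word`, the majority-vote rule, and the `min_votes` and `min_share` thresholds are all assumptions made for this example.

```python
from collections import Counter
from typing import List, Optional


def resolve_word(transcriptions: List[str],
                 min_votes: int = 3,
                 min_share: float = 0.75) -> Optional[str]:
    """Return the agreed reading of one OCR-failed word, or None.

    Hypothetical helper: the thresholds are illustrative assumptions,
    not values taken from reCAPTCHA itself.
    """
    # Normalize so that "Morning " and "morning" count as the same answer.
    cleaned = [t.strip().lower() for t in transcriptions if t.strip()]
    if len(cleaned) < min_votes:
        return None  # not enough human answers collected yet

    answer, count = Counter(cleaned).most_common(1)[0]
    # Accept the most common answer only when a clear majority typed it.
    return answer if count / len(cleaned) >= min_share else None
```

For instance, `resolve_word(["morning", "Morning ", "morning", "mourning"])` would settle on `"morning"`, while a word with too few or conflicting answers stays unresolved and could be shown to more people.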
With the support and resources of Google behind it, reCAPTCHA can shift into an even higher gear. With all Google services using reCAPTCHA, and with more resources from Google to make it easier to obtain and implement, it is bound to see an increase in adoption.
Projects like Google Books and Google News Search already use reCAPTCHA to help digitize a great volume of scanned old books, magazines and newspapers. Word by word, reCAPTCHA aims to digitize a large share of the documents that currently exist only in print, making more and more of our history indexable, searchable, and accessible to a wider public.
For more information, read the post on the Official Google Blog.