As residents of the “Digital Age,” all of us have surely run across something like this at one time or another during our day-to-day internet usage:
A vaguely inconvenient, highly annoying, and sometimes frustrating online security measure, designed to ensure that we are human and not some sort of automated “bot” out to spread spam.
Its technical name is CAPTCHA, which stands for Completely Automated Public Turing test to tell Computers and Humans Apart, and it was created at Carnegie Mellon University in 2000. The general idea is for a user to prove their humanity by deciphering and retyping text, often a word or two, that has been distorted to be unreadable to computers.
This, in and of itself, is a fairly clever idea. Bot sees CAPTCHA, bot can’t decipher CAPTCHA, and we’re all kept safe from the evils of bot spam. However, the Carnegie Mellon School of Computer Science took it a step further, in terms of creativity, with the reCAPTCHA program.
reCAPTCHA puts a basic online security program to a dual, dare I say novel, use: digitizing old books, newspapers, and radio shows. Yes, every word that you type into certain CAPTCHA programs is a contribution to the archiving of these texts. For example, reCAPTCHA is currently helping digitize old editions of the New York Times.
To be digitized, texts are photographically scanned and then converted into text, which is easier to store and cheaper to download than scanned images. The conversion is done with Optical Character Recognition (OCR) software. However, some scanned words are too old, too damaged, or just too oddly written for OCR to recognize. All such words are sent to reCAPTCHA and used in security tests to be deciphered by the ultimate text-decoding machine: humans. To check that the human user has decoded the word correctly, most CAPTCHAs pair one unknown word with one that the computer has already deciphered correctly. If you reproduce the known word correctly, the computer takes your word for it on the unknown one. With about 200 million CAPTCHAs being solved every day, I would say that it’s a pretty efficient way to deal with holes in text recognition software.
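For the curious, here is a rough Python sketch of that two-word check. It is only my illustration of the idea, not reCAPTCHA's actual code: the names check_challenge, accepted_reading, and votes are made up, and the min_votes knob is my own assumption for anyone who would rather wait for several people to agree before trusting a reading.

```python
from collections import Counter, defaultdict

# Hypothetical in-memory store: for each unknown word (by id), count how
# often each human reading has been submitted.
votes = defaultdict(Counter)


def normalize(text):
    """Ignore case and surrounding whitespace when comparing answers."""
    return text.strip().lower()


def check_challenge(control_answer, typed_control, unknown_id, typed_unknown):
    """Pass the user only if the known (control) word was typed correctly.

    When the control word matches, the user's reading of the unknown word
    is recorded as one vote. Otherwise nothing is recorded.
    """
    if normalize(typed_control) != normalize(control_answer):
        return False
    votes[unknown_id][normalize(typed_unknown)] += 1
    return True


def accepted_reading(unknown_id, min_votes=1):
    """Return the most common reading once it has at least min_votes votes.

    min_votes=1 mirrors "the computer takes your word for it"; a higher
    value is an assumed extra safeguard, not something from the post.
    """
    if not votes[unknown_id]:
        return None
    guess, count = votes[unknown_id].most_common(1)[0]
    return guess if count >= min_votes else None
```

Calling check_challenge("morning", "Morning", "w1", "overlooks") would let the user through and record "overlooks" as a reading for the unknown word w1, while a wrong control word fails the test and records nothing.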
Sure, reCAPTCHA didn’t invent anything BRAND NEW and it probably won’t completely revolutionize the world (and honestly, it’s kind of a pain in the butt), but it did turn a mundane task into a creative solution.