A technology you use every day, and with which you may have helped digitize books


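A CAPTCHA is what is commonly called a "verification code": a program that can tell whether the current user is a bot or a real person. You have probably seen them: colorful backgrounds and distorted text at the bottom of website registration forms. Many websites use CAPTCHAs to block bots and automated spam-registration programs. No program can read distorted text as accurately as a human can, so bots cannot register automatically on sites protected by a CAPTCHA.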

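CAPTCHAs are solved about 60,000,000 times a day worldwide, and each one takes roughly 10 seconds. For any one person that is negligible, but added together it comes to more than 150,000 hours every day. Is there a way to put that time and effort to use? reCAPTCHA does exactly this: it channels those hours into "reading" books online.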
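As a quick sanity check on those figures, the arithmetic can be worked through in a few lines of Python. The two inputs are the approximate numbers quoted above; nothing else is assumed.

```python
# Back-of-the-envelope check of the figures quoted in the text.
captchas_per_day = 60_000_000   # approximate count from the text
seconds_per_captcha = 10        # approximate time per CAPTCHA from the text

total_seconds = captchas_per_day * seconds_per_captcha
total_hours = total_seconds / 3600  # 3600 seconds in an hour

print(f"{total_hours:,.0f} hours per day")  # 166,667 hours per day
```

The result, roughly 167,000 hours, is consistent with the "more than 150,000 hours" figure.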

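To archive human knowledge and make it easier to access and understand, many projects are digitizing books written before the computer age. The pages are scanned into images, and to make their contents searchable, the images are then converted to text by Optical Character Recognition (OCR) programs. The conversion is necessary for two reasons: images are costly to store, since they are far larger than text and do not fit well on small-capacity devices; and the content of an image cannot be searched. The problem is that OCR is not "smart" enough.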

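reCAPTCHA improves the digitization process by sending words that computers cannot read to website forms as CAPTCHAs for people to decipher. Each word image that OCR fails to recognize is used as a CAPTCHA. This works because most OCR programs report the words they could not read correctly.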
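One way to picture the "OCR reports what it cannot read" step is a filter over per-word confidence scores. The sketch below is purely illustrative: the `OcrWord` type, the confidence values, and the `0.8` threshold are assumptions made for this example, not the output format of any particular OCR program.

```python
from typing import NamedTuple


class OcrWord(NamedTuple):
    text: str          # the OCR program's best guess for the word
    confidence: float  # hypothetical score in [0, 1]; higher means more certain


def flag_unreadable(words, threshold=0.8):
    """Return the words whose confidence falls below the (assumed) threshold."""
    return [w for w in words if w.confidence < threshold]


# Made-up OCR output for one scanned line.
page = [
    OcrWord("the", 0.99),
    OcrWord("qvick", 0.41),
    OcrWord("brown", 0.97),
    OcrWord("f0x", 0.35),
]

candidates = flag_unreadable(page)
print([w.text for w in candidates])  # ['qvick', 'f0x']
```

The flagged words are the ones that would be cropped into images and handed to people as CAPTCHAs.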

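But this raises a question: if the computer cannot read the image, how does the system check whether the user's input is correct? The answer is that each image the computer cannot read is shown alongside one it can, so the user sees two images and must type the words in both. If the user gets the known word right, they pass the form's check, and the system assumes their answer for the unknown word is correct as well. The system then shows the unknown image to a number of other users to determine, with higher confidence, whether that first answer was right.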
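The two-word scheme described above can be sketched as a small class: check the known word, and if it matches, record the user's guess for the unknown word as one vote. This is a minimal illustration, not reCAPTCHA's actual implementation. The class name, the case-insensitive comparison, and the `min_votes` and `min_agreement` thresholds are all assumptions chosen for the example.

```python
from collections import Counter, defaultdict


class TwoWordChallenge:
    """Hypothetical sketch of the known-word / unknown-word check."""

    def __init__(self, min_votes=3, min_agreement=0.75):
        # unknown word id -> tally of answers from users who passed the control
        self.votes = defaultdict(Counter)
        self.min_votes = min_votes          # assumed: votes needed before deciding
        self.min_agreement = min_agreement  # assumed: share the top answer needs

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passes; only then record their unknown-word guess."""
        if control_answer.strip().lower() != control_truth.strip().lower():
            return False
        self.votes[unknown_id][unknown_answer.strip().lower()] += 1
        return True

    def accepted_reading(self, unknown_id):
        """Return the agreed reading of the unknown word, or None if undecided."""
        counts = self.votes.get(unknown_id)
        if not counts:
            return None
        total = sum(counts.values())
        answer, top = counts.most_common(1)[0]
        if total >= self.min_votes and top / total >= self.min_agreement:
            return answer
        return None


challenge = TwoWordChallenge()
first = challenge.submit("morning", "morning", "img-17", "overlooks")  # passes
failed = challenge.submit("mornin", "morning", "img-17", "xxxx")       # fails, not counted
challenge.submit("Morning", "morning", "img-17", "overlooks")
undecided = challenge.accepted_reading("img-17")  # only two votes so far
challenge.submit("morning", "morning", "img-17", "overlooks")
decided = challenge.accepted_reading("img-17")    # three matching votes
print(first, failed, undecided, decided)
```

A guess from a user who fails the known word is discarded, so a bot typing random text cannot pollute the tally for the unknown word.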

CAPTCHA and reCAPTCHA

A CAPTCHA is a program that can tell whether its user is a human or a computer. You've probably seen them: colorful images with distorted text at the bottom of Web registration forms. CAPTCHAs are used by many websites to prevent abuse from "bots," or automated programs usually written to generate spam. No computer program can read distorted text as well as humans can, so bots cannot navigate sites protected by CAPTCHAs.

About 60 million CAPTCHAs are solved by humans around the world every day. In each case, roughly ten seconds of human time are being spent. Individually, that's not a lot of time, but in aggregate these little puzzles consume more than 150,000 hours of work each day. What if we could make positive use of this human effort? reCAPTCHA does exactly that by channeling the effort spent solving CAPTCHAs online into "reading" books.

To archive human knowledge and to make information more accessible to the world, multiple projects are currently digitizing physical books that were written before the computer age. The book pages are being photographically scanned, and then, to make them searchable, transformed into text using "Optical Character Recognition" (OCR). The transformation into text is useful because scanning a book produces images, which are difficult to store on small devices, expensive to download, and cannot be searched. The problem is that OCR is not perfect.

reCAPTCHA improves the process of digitizing books by sending words that cannot be read by computers to the Web in the form of CAPTCHAs for humans to decipher. More specifically, each word that cannot be read correctly by OCR is placed on an image and used as a CAPTCHA. This is possible because most OCR programs alert you when a word cannot be read correctly.

But if a computer can't read such a CAPTCHA, how does the system know the correct answer to the puzzle? Here's how: Each new word that cannot be read correctly by OCR is given to a user in conjunction with another word for which the answer is already known. The user is then asked to read both words. If they solve the one for which the answer is known, the system assumes their answer is correct for the new one. The system then gives the new image to a number of other people to determine, with higher confidence, whether the original answer was correct.