New Test Could Keep Spam Bots Out of Web Sites

Researchers have proposed an alternative to those wavy, distorted random words or word-number combinations, called "captchas," that Web site visitors must decipher to complete online transactions.

The new test is based on visual recognition of patterns "emerging" from chaotic backgrounds, and could herald the next generation of captchas. Web users take these tests to authenticate themselves as human operators rather than computers.

Captchas were developed over a decade ago to cut down on the malicious use of automated computer programs, known as bots. Scam artists and hackers can use bots to gain entry into a Web site and send out spam email messages or flood message boards with advertisements, for example.

A digital arms race

Over the years, text-based captchas – the current standard for most heavily trafficked Web sites including Google and Yahoo! – have grown more elaborate to stay ahead of machine recognition.

Along the way, however, these captchas have also become increasingly onerous for human users to decipher.

Machine learning, meanwhile, has more than kept pace: optical recognition of the text characters used in captchas now reaches very high success rates.

"The current ones are quite easy to break," said Daniel Cohen-Or, a professor of computer science at Tel Aviv University.

To address this shrinking window of effectiveness for text-based captchas, researchers such as Cohen-Or have in recent years looked to image recognition as a weapon in the captcha arms race.

An "emerging" new test

Image recognition in the new test relies on what its creators have dubbed "emergence." This is the ability to visually gather fragmentary bits and synthesize them – often without color clues or clear boundaries – into simple objects, such as a rabbit or a dog.

"This is something only humans can do – to emerge figures out of chaotic patterns," said Cohen-Or, who is a co-developer of the new test.

Given that this emergent mode of visual recognition is still poorly understood in humans, Cohen-Or reckons it will be some time before computer algorithms can be designed to make images "pop" out of apparently formless jumbles of light and dark regions.

More R&D needed

Still, more work is needed to arrive at a suitable setup for an emergent image-based captcha challenge. The set of emergent objects must be varied enough that a bot answering, say, "horse" or "airplane" rarely hits on the right answer through random guessing alone. At the same time, the objects must be common enough to be recognized by most people surfing the Web.
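The guessing concern can be made concrete with a little arithmetic. The Python sketch below assumes a bot picks uniformly at random from the set of possible object names and that each attempt is independent. The function names and the set sizes are illustrative only and do not come from the researchers' work.

```python
def guess_success_rate(num_labels: int) -> float:
    """Chance that one uniform random guess names the correct object."""
    if num_labels < 1:
        raise ValueError("num_labels must be at least 1")
    return 1.0 / num_labels


def success_within_attempts(num_labels: int, attempts: int) -> float:
    """Chance that at least one of several independent guesses is correct."""
    if attempts < 0:
        raise ValueError("attempts must not be negative")
    miss = 1.0 - guess_success_rate(num_labels)
    return 1.0 - miss ** attempts


if __name__ == "__main__":
    # Hypothetical object-set sizes, chosen only to show the trend.
    for num_labels in (10, 100, 1000):
        single = guess_success_rate(num_labels)
        repeated = success_within_attempts(num_labels, attempts=50)
        print(f"{num_labels} objects: one guess {single:.4f}, 50 guesses {repeated:.4f}")
```

Under these assumptions, a larger set of objects lowers the odds of a lucky guess, but a bot that is allowed many attempts regains much of that ground. A deployed test would likely also need to limit retries.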

Overall, the tests will need fine-tuning to remain hard for machines yet relatively easy for humans. As such, Cohen-Or does not claim to have invented a practical captcha design just yet.

Nevertheless, he believes that he and his colleagues have hit upon a promising, fresh approach to ensuring future online security and service.

"The big question is where is the largest gap between humans and bots?" Cohen-Or told TechNewsDaily. "I claim here, [it's] in solving emerging figures."

A paper describing the research was presented at a recent meeting of the Association for Computing Machinery (ACM) in Japan.

Adam Hadhazy is a contributing writer for Live Science and Space.com. He often writes about physics, psychology, animal behavior and story topics in general that explore the blurring line between today's science fiction and tomorrow's science fact. Adam has a Master of Arts degree from the Arthur L. Carter Journalism Institute at New York University and a Bachelor of Arts degree from Boston College. When not squeezing in reruns of Star Trek, Adam likes hurling a Frisbee or dining on spicy food. You can check out more of his work at www.adamhadhazy.com.