Talk:CAPTCHA

From Wikipedia, the free encyclopedia

This is an old revision of this page, as edited by User24 (talk | contribs) at 03:06, 19 January 2007 (Spam image). The present address (URL) is a permanent link to this revision, which may differ significantly from the current revision.

Aural v. Oral!

aural=oral? Or am I just too american? Ilyanep 14:39, 18 Jun 2004 (UTC)

"aural" means of the ear; "oral" means of the mouth. Aural captchas are captchas you listen to, as opposed to visual ones. Marnanel 14:57, 18 Jun 2004 (UTC)
Yeah, but it's given to you orally, that's the confusion factor, thanks for the help Ilyanep 17:41, 18 Jun 2004 (UTC)
Aural captchas need not have been anywhere near anyone's mouth, though... Marnanel 18:12, 18 Jun 2004 (UTC)

It's not just visual impairments that can make captchas unusable-- entering a long string of random characters is hard enough for someone with dyslexia, even without the added distortion that most of these scripts use. I'm notorious for transposing digits when copying down numbers...

Also, the choice of font can be crucial. What if a 1 looks too much like a 7? And what about one versus ell, zero versus oh, and so on? --Codeman38 16:18, 11 Oct 2004 (UTC)

The "free porn" weakness?

As much as we all hate spammers, you've got to give 'em credit for using free porn to break captchas. I just can't get over how brilliant it is. Still, that free porn can't be accessed by the blind, dyslexic, or elderly. Poor people. The bellman 13:01, 2005 Apr 23 (UTC)

Has anyone demonstrated this "free porn" scheme actually being used? Cory Doctorow's proposal was theoretical, not evidence of an actual implementation.

One crucial flaw with this method of defeating captchas is that the "free porn" site presenting the borrowed captcha does not even know the correct answer/verification code itself, so it would invariably allow access regardless of the answer/code given. 150.101.115.231 22:51, 27 October 2005 (UTC)[reply]
If the porn site gave the wrong answer to the email site, wouldn't the email site have some response to indicate that it was wrong? I don't think that the porn site would necessarily allow access regardless of the answer/code given.

Furthermore, most current captchas can be broken without much effort by using OCR or trivial image comparison techniques, so there's little point.

With this in mind, as well as the need for audio support for blind users, the Implementations list of captcha generators perhaps should be split into two? One listing "Visual Only" implementations, and one listing those that include audio implementation. Currently it relies on each entry haphazardly mentioning audio support. If not two lists, then perhaps a table with a column for audio support to have yes/no inserted for each implementation? 150.101.115.231 22:51, 27 October 2005 (UTC)[reply]

And the "free porn" approach is easy to circumvent; place a short timeout on the captcha before it becomes invalid and the user has to try a different one (like, say, 10 seconds).
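To make the short-timeout idea above concrete, here is a minimal Python sketch. It is not taken from any real CAPTCHA package: the in-memory store and the names `issue_challenge` and `check_answer` are hypothetical, and the 10-second limit is simply the figure suggested in the comment. Each challenge is also single-use, which is an extra assumption on top of the timeout.

```python
import secrets
import time

CAPTCHA_TTL_SECONDS = 10  # the "say, 10 seconds" figure from the comment above

# Hypothetical in-memory store: challenge id -> (expected answer, issue time)
_pending = {}


def issue_challenge(answer, now=None):
    """Register a challenge and return its id; `answer` is the text in the image."""
    challenge_id = secrets.token_hex(8)
    issued_at = time.monotonic() if now is None else now
    _pending[challenge_id] = (answer, issued_at)
    return challenge_id


def check_answer(challenge_id, submitted, now=None):
    """Accept only a correct answer that arrives before the challenge expires."""
    entry = _pending.pop(challenge_id, None)  # one attempt per challenge
    if entry is None:
        return False
    answer, issued_at = entry
    current = time.monotonic() if now is None else now
    if current - issued_at > CAPTCHA_TTL_SECONDS:
        return False  # too slow, e.g. relayed through a third-party site
    return submitted.strip().lower() == answer.lower()
```

Whether 10 seconds is long enough for a legitimate (possibly dyslexic or visually impaired) user is a separate question; the sketch only shows where the check would sit.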

I'm not saying captchas can't be cheaply circumvented; if you want to do it badly enough, hire sweatshop workers at $3/hour. Free porn ain't the way.

I don't know if the "free porn" method is currently used, but it's definitely technically possible. Say a spammer wants hotmail accounts. He/she/it sets up this "free porn" site with a captcha relay. When a visitor wants porn, the spammer's site visits hotmail, grabs a hotmail captcha (maintaining the proper cookies), and relays it to the visitor. The human solves the captcha and gives the spammer's site the answer. The spammer's site relays the answer to hotmail along with any relevant account registration info, and hotmail confirms or denies their answer (if it's confirmed, presto - they've got a new account). This confirmation or denial is then relayed back to the human, who reacts naturally. Alternatively, the spammer could present a fake denial several times, or even indefinitely, and get 3 or 4 captchas solved with each visitor until they realize that they'll never, in fact, be getting any porn. My point is, whether it works well or not, it takes practically no work to operate once it's set up. I think this is a legitimate concern worthy of presentation as a weakness on the article page. Courtarro 19:26, 17 March 2006 (UTC)
I love the idea of "free porn" being the secret to CAPTCHAs. Please keep :) Mathiastck 16:30, 28 August 2006 (UTC)[reply]

people without sight how can you register online????

people without sight how can you register online????

Usually you simply cannot. But some websites provide an audio Captcha, or a way to interact with a human operator in order to prove that you are human. Sam Hocevar 06:56, 26 Apr 2005 (UTC)

I added the origin section a few days ago.. much of it comes from research, but parts are from andrei broder's talk at a workshop that I attended (similar info from another attendee) Matt Casey 21:32, 17 August 2005 (UTC)[reply]


What is the source of the example CAPTCHA image? The distortion is so extreme that I'd have trouble reading it in a real situation, and the gradient is obviously differentiable from the letters, making it useless. I think a better example could be found. MrVacBob 03:26, 17 December 2005 (UTC)[reply]

I have no trouble reading the image, and I feel it is typical of good captcha. That MrVacBob (and probably many others, too) has trouble reading it only underscores the unfairness of captcha. David 15:53, 17 February 2006 (UTC)[reply]

Invention credit

How come, of the people involved in the CAPTCHA project at CMU, the ones that have pages are Manuel Blum, Nick Hopper, and John Langford, while Luis von Ahn, who had more to do with the project than the last two, does not have a link from the site? He was featured in a NY Times article with Manuel Blum, unlike the other two, so it seems to me that he merits a stub more than they do. Just wondering. -- Jorge Vittes 17:36, 17 Dec 2005 (PST)

This is the nature of research. Once a scientist is famous, they manage other folks rather than do lower-level work. There's a great photo of William Shockley smiling and sitting at a microscope (which he hadn't used before the photo shoot) while his underlings, who did the actual work of inventing the transistor, stand around and look frustrated and useless. Compare the length of those three men's Wikipedia articles.--Joel 22:51, 9 May 2006 (UTC)[reply]

Article quality

The percentage of this article given over to external links instead of content ain't great...--BozMotalk 11:15, 5 January 2006 (UTC)[reply]

Agreed. It's useful to have some links, but the amount currently there is a bit silly. Imagine the same happening on the 'guestbook' page. I suggest either a separate page, or that someone goes through the list and decides which are the (two?) best for each language. I would do it, but I wrote one of the PHP ones, so obviously I'm biased. (user24)

Is there a Wiki Captcha extension?

Is there a Wiki extension for Captcha?

Not on the English Wikipedia. I have accounts on other Wikipedias, and of the 6 where I registered (to keep someone else from using my name), 4 had a CAPTCHA: [1], [2], [3], [4] --Ávril ʃáη 04:48, 21 July 2006 (UTC)

Does the use of captcha violate civil liberties?

Most people these days agree that discrimination against people for jobs or housing on the basis of skin color is a violation of Civil rights and Civil liberties. Yet widespread discrimination against people who are blind or visually impaired is politely noted and (usually) ignored.

Examples are captcha (see the Accessibility section of this article), inaccessible voting (usually one has to get a sighted friend or relative to vote for them), unequal access to education (textbooks are frequently not available in braille or large print in time for the classes that use them), and the barring of service animals from restaurants and other public places.

Some of these are civil rights violations, and some civil liberties violations, and some just plain inconveniences, but don't they deserve some real attention? Why can major website companies, such as Google and Yahoo, use purely visual captcha without serious challenge from society? Why is no one developing a challenge-response system that is text only?

I have no answers, only questions.

David 15:53, 17 February 2006 (UTC)[reply]

This is not really something for Wikipedia to decide. If you are worried about the accessibility of Captchas, but nobody else is, then you should start a separate forum to discuss this. Quarl (talk) 2006-02-17 22:35Z

The "PWNtcha" site linked to includes a goatse image... it should probably have a warning or some such. (There's also been questions involving its legitimacy, that I haven't done the research on yet.) Is there a standard warning for "potentially hazardous to your retinas" images, or a policy on this? --Piquan 00:46, 9 March 2006 (UTC)[reply]

If by legitimacy you mean whether the goatse image was intentionally put there by sam.zoy.org, check out http://sam.zoy.org/goatse/

If you wonder whether pwntcha is actually a fake, there's now an online demo that should allay your fears. --User24

Thanks for the info. The legitimacy question was mostly a sidenote; as I said, I hadn't done the research. I'm okay with the idea that a legitimate software demo includes an offensive image; I just think that viewers should be warned before visiting pages with shock images. --08:49, 20 March 2006 (UTC)

Article title

This should be "CAPTCHA" rather than "Captcha", shouldn't it? —Ashley Y 06:14, 11 March 2006 (UTC)[reply]

I think that's a technical limitation of the wiki software, but yeah, it should -User24 27 March 2006

Backronym

Why is CAPTCHA a backronym? That would mean the abbreviation had another meaning before? Or is it derived from capture? --Abe Lincoln 10:16, 21 April 2006 (UTC)

It is not. It is a contrived acronym. --FOo 03:25, 22 April 2006 (UTC)[reply]
Contrived acronym redirects to backronym. --Amit 08:42, 5 October 2006 (UTC)[reply]

For this article?

File:Crazymerit.JPG
WP CAPTCHA

I uploaded this image awhile ago. Would including it here be too self-referential (a little bigger, obviously)? Is another image even needed? Thanks. --LV (Dark Mark) 23:57, 26 April 2006 (UTC)[reply]

There was a note stating that an algorithm related to CAPTCHA may be patented. It was recently changed to "copyright", with an edit note that algorithms and source code are copyrighted, not patented. While I do agree that source code is generally copyrighted, my understanding is that algorithms can be patented, at least in the US. Famous examples include XORing images (U.S. patent 4,197,590), the LZW compression patent (for which there were two distinct holders, but I only recall U.S. patent 4,558,302), and the Fraunhofer MP3 patent. While this practice is disputed (most strongly by the League for Programming Freedom), software patents are available in the US. See software patent for more discussion. I've reverted the edit, but put this note in to explain my reasoning. --Piquan 00:08, 3 May 2006 (UTC)

Did paypal invent this?

Did paypal invent the online thing where there are blurry numbers and letters and it takes 5-10 tries before a human guesses them right (not the leetspeak ones in this article)? In the book paypal wars, Eric M. Jackson thought paypal did. But I don't know. I had added it long ago to Auction_sniping. I don't know when this message will be seen by someone who knows but if paypal didn't, then the phrase "that Eric M. Jackson stated was invented by Paypal" should be removed from Auction_sniping. DyslexicEditor 21:32, 24 May 2006 (UTC)[reply]

Yes, PayPal invented this sometime in 1999 or so. It was called the "Gausebeck-Levchin Test" at the time, after its creators, and was originally used to prevent fraudsters from signing up multiple accounts using automated scripts. The numbers were not blurry, but rather placed on an inconsistently broken grid with small gaps to foil automated OCR programs. 204.15.20.244 01:30, 14 September 2006 (UTC)[reply]

I removed this link: custom.programming-in.net/articles/art20-turingNumbers.asp

(1) it is not fully client-side (the image is generated on above site) (2) it does not protect at all, since it is an onclick event on pressing the submit button, so it will only work in a browser, not in a script that just submits form data.

Han-Kwang 08:58, 8 June 2006 (UTC)[reply]

It was added by a pretty notorious programming-in.net spammer. Full support. Haakon 09:23, 8 June 2006 (UTC)[reply]
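Point (2) above can be illustrated with a minimal Python sketch of a server-side check. The handler name `handle_form_post`, the `session` dictionary and the `captcha` form field are hypothetical stand-ins, not part of the removed link or of any framework. The idea is only that the comparison runs on the server, so a script that posts form data directly, without ever running the page's onclick handler, is still rejected.

```python
def handle_form_post(session, form):
    """Server-side handler sketch: the check runs here, not in the browser.

    `session` holds the answer that was stored when the image was generated;
    `form` is the submitted form data. Both are plain dicts in this sketch.
    """
    expected = session.pop("captcha_answer", None)  # single use
    submitted = form.get("captcha", "")
    if expected is None or submitted.strip().lower() != expected.lower():
        return "rejected"
    return "accepted"


# A script that skips the browser simply omits (or guesses) the field:
print(handle_form_post({"captcha_answer": "x7kq"}, {"comment": "spam"}))
# A human who read the image supplies it:
print(handle_form_post({"captcha_answer": "x7kq"}, {"captcha": "X7KQ", "comment": "hi"}))
```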


OCR :( :( :(

I tried to implement a captcha on my website, but bots are smart enough to crack it. Maybe the OCR is too high-tech nowadays. I want to try a different method, like maybe a question in English 'What is the third letter of the word hippopotamus?' and you have to type in 'p' or 'P'... would that work better? --Sonjaaa 12:04, 8 June 2006 (UTC)

Not only would it work better, but it would treat blind and visually impaired people fairly. 141.157.163.11 13:54, 24 June 2006 (UTC)[reply]
As long as there is a predictable pattern to the question, it can be cracked. --Amit 08:49, 5 October 2006 (UTC)[reply]
There cannot be a perfect solution, since spammers can hire people to process CAPTCHAs, and since any puzzle solvable by a human can be (theoretically) solved by a computer system. Only when the technologies distinguish between people and bots throughout the development cycle in a way that cannot be faked (probably using cryptography) can the distinction be relied upon. David 22:44, 2 January 2007 (UTC)[reply]
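For what it's worth, both sides of this thread fit in a few lines of Python. The sketch below is only an illustration under stated assumptions: the word list, the ordinal list and the function names are made up here, and the question template is the one quoted in the first comment. `make_question` is the proposed text-only challenge; `bot_solve` shows the "predictable pattern" objection, since a bot that knows the template needs a regular expression rather than OCR.

```python
import random
import re

ORDINALS = ["first", "second", "third", "fourth", "fifth"]
WORDS = ["hippopotamus", "elephant", "giraffe"]  # illustrative word list


def make_question(rng):
    """Return (question, answer) using the fixed template from the thread."""
    word = rng.choice(WORDS)
    index = rng.randrange(len(ORDINALS))
    question = f"What is the {ORDINALS[index]} letter of the word {word}?"
    return question, word[index]


def check(answer, submitted):
    """Accept 'p' or 'P', as suggested in the original comment."""
    return submitted.strip().lower() == answer.lower()


PATTERN = re.compile(r"What is the (\w+) letter of the word (\w+)\?")


def bot_solve(question):
    """A bot that knows the template parses the question directly."""
    match = PATTERN.fullmatch(question)
    if match is None:
        return None
    ordinal, word = match.groups()
    return word[ORDINALS.index(ordinal)]
```

Varying the wording helps only as long as the set of templates stays too large or too irregular to enumerate.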

I think the external links section is getting out of hand. Wikipedia is not a collection of links. A large number of them can be defeated by a trivial noise-removal algorithm (color filter or dot removal) followed by standard OCR. A few sites don't even bother to show an example. Since there are so many free implementations available, I think we can be a bit more selective here.

Unless someone can give me a good reason to keep the links, I will remove the links below a few days from now. Han-Kwang 19:56, 24 June 2006 (UTC)[reply]
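As a rough illustration of what "dot removal" means here, the Python sketch below clears isolated dark pixels from a binarized image before it would be handed to an OCR engine. It is a toy under stated assumptions: the image is a plain list of rows of 0 (background) and 1 (ink), the function name is made up, and no particular linked implementation was tested with it.

```python
def remove_isolated_dots(image):
    """Clear every dark pixel that has no dark pixel among its 8 neighbours.

    `image` is a list of rows of 0 (background) and 1 (ink).
    A cleaned copy is returned; the input is left unchanged.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    cleaned = [row[:] for row in image]
    for y in range(height):
        for x in range(width):
            if image[y][x] != 1:
                continue
            has_neighbour = any(
                image[ny][nx] == 1
                for ny in range(max(0, y - 1), min(height, y + 2))
                for nx in range(max(0, x - 1), min(width, x + 2))
                if (ny, nx) != (y, x)
            )
            if not has_neighbour:
                cleaned[y][x] = 0  # lone speck: treat as noise
    return cleaned
```

Letter strokes survive because their pixels touch each other, while single-pixel speckle does not.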

Replies copied from HK's Talk page:

You are absolutely right in your assessment of the external links on the CAPTCHA page. I would do exactly the same....so feel free to trim those links off that you talked about. That would be fantastic.

--Ownlyanangel 00:21, 26 June 2006 (UTC)[reply]

I would kill all of the implementation and services links citing WP:NOT a web directory. Replace them all with a link to something that is really a web directory, like Dmoz. --GraemeL (talk) 00:36, 26 June 2006 (UTC)
I see, that's even more rigorous. Is there any good web resource on CAPTCHAs? DMOZ is not very helpful for this type of things, since DMOZ in principle only indexes complete sites, rather than sections of sites. (Not to mention that it usually has a 6-month backlog) Of course, I can start my own CAPTCHA index page (outside Wikipedia), but it would be kind of a conflict of interests. :-) Han-Kwang 13:41, 26 June 2006 (UTC)[reply]
Attempting to find a site or two that lists, and perhaps reviews, a large number of implementations would still be an advantage. Even the cut down list below would still be around half the size of the actual article text and would likely grow again over time. --GraemeL (talk) 13:54, 26 June 2006 (UTC)[reply]

I just removed the links listed below. As soon as I find a good reviewed CAPTCHA index page I will remove the other links to implementations as well, but I haven't been searching yet. Han-Kwang 21:49, 1 July 2006 (UTC)[reply]

PHP section

.NET

Three of the four links seem to be basically the same Captcha engine. One website appears 3 times in the list:

Classic ASP

Java

Coldfusion

C

Perl

Python

Ruby

Smalltalk

Lasso

CAPTCHA services

Remove all links. It makes no sense to outsource a captcha to a different server since form verification has to be done on your own server. If you can do the latter, you can use one of the free implementations above as well. Moreover, these are not really CAPTCHAs in the sense that the generating algorithm is not disclosed. So only keep the ones that offer something extra and remove the following links:

Han-Kwang 19:56, 24 June 2006 (UTC), update 25 Jun[reply]

External links are a little messed up, but the defeating links are valuable in learning more about Captchas, especially when developing them (to avoid the pitfalls of creating ones that are easily defeated). (209.87.176.132 17:25, 18 October 2006 (UTC))

E-mail distinguisher

This site has a captcha which makes you solve an equation. The bellman 15:42, 25 June 2006 (UTC)[reply]

It took me maybe 20 minutes to write a perl script to defeat this thing, including the time to look up the Perl API for HTTP requests. After testing it a couple of times I realized that you probably aren't related to the above website (which seemed to contain a big wikipedia link). Han-Kwang 17:05, 25 June 2006 (UTC)[reply]
I am the creator of the utility mentioned above, which is described in full on this page. It is not a CAPTCHA and makes no claim to be one. It is a form and CGI program which allows visitors to a site to send E-mail to an address which is nowhere disclosed on a Web page, requiring them to first solve a problem. Obviously, one can write a program to solve such problems—that is how the feedback form works itself! The purpose of the program is not to distinguish computers from people, but rather people whose mail is likely to be worth reading from idiots. --John Walker (fourmilab.ch) 17:46, 25 June 2006 (UTC)[reply]

Only one T for "Turing test to tell"‽

Wouldn't "Turing test to tell" generally become three T's under the usual initialism rules, with only "to" being optional? This acronym seems a bit too loose to be true. I suspect it's actually Completely Automated Procedure to Tell Computers and Humans Apart. It's not exactly a Turing test, since there's no human judge; maybe that bit was just thrown in during backronymization to impress non-experts. SeahenNeonMerlin 22:22, 2 July 2006 (UTC)[reply]

See www.captcha.net[5], in the first publication listed. I've worked for Aladdin before, they have a history of coming up with acronyms that select letters to come up with what they want. --JVittes 22:44, 2 July 2006 (UTC)[reply]
Considering it has been almost a week and there is no further discussion I'll remove the disputedAssertion tag soon. --JVittes 17:27, 8 July 2006 (UTC)[reply]

Why use trademark, not generic term?

Since CAPTCHA is registered as a trademark, why is it used in this article as the generic term for these programs? The community should come up with an unencumbered name and use that. A direct, un-cute name would be fine.

For what it's worth, CMU's case for the trademark is weakened by the sort of usage that's going on here. That's fine by me. I dislike the annexation of portions of our natural language namespace without good reason. — Preceding unsigned comment added by Knackers (talkcontribs)

They are also called HIP, for 'human interaction proof', but this word has a wider meaning than Captcha. I can't find anything on the web about CMU complaining about the use of the word. Han-Kwang 03:27, 16 July 2006 (UTC)[reply]


Accessibility

The article says:

Even some of the demo CAPTCHAs at the software sites listed below are indecipherable to many if not all humans.

I assume that "software sites listed below" is referring to the links at the bottom of the article. However, I'm not sure which of the links is a "software site". It certainly isn't all of them--I only found one that has more than one sample (Breaking a Visual CAPTCHA), and all the CAPTCHAs there were relatively easy to read. (I recently saw one somewhere--I don't recall where--that took me five tries to get right :-(.)

That phrase refers to a section in an older version of the article that contained external links to implementation. I removed the entire section since it was attracting too much linkspam, intentional or not. I'll reword the phrase. Han-Kwang 18:46, 10 August 2006 (UTC)[reply]

Also, in the sample CAPTCHA at the beginning of the article, the caption says:

This CAPTCHA of "smwm" obscures its message from computer interpretation by twisting the letters and adding a background color gradient

What color gradient?

--69.140.23.118 16:06, 2 August 2006 (UTC)[reply]

There is a gradient. If it is not showing up on your monitor, your system may not be in the highest standard color resolution. David 22:47, 2 January 2007 (UTC)[reply]

Paying the human operators with access to pornography instead of money has also been considered.

That line needed some sort of comment. Keep, notable ;) Mathiastck 18:35, 22 August 2006 (UTC)[reply]

I like the leet speak example

Hey, I just re-added the leet speak example. I think the leet speak example aptly explains why CAPTCHAs work. It's easy for humans to recognize text that machines have more trouble with. Mathiastck 16:29, 28 August 2006 (UTC)

Should this be in Guessing Games?

I don't think CAPTCHA should be in guessing games even though it is a bit of a guessing game sometimes.

Why not? Nothing wrong with being in a lot of categories. The category system is underused. Mathiastck 12:07, 3 September 2006 (UTC)[reply]

from user24: freeCap has been turned into exactly that - a children's guessing game! check out http://www.jambav.com/jambav/flashy/cap/index.php?source=gamepage

Deafblind?

How would one who is deafblind access the internet in the first place? --66.220.237.102 15:10, 6 September 2006 (UTC)[reply]

With a braille terminal or some other accessibility device. --Keenan Pepper 22:21, 6 September 2006 (UTC)

Spammers comment is misleading

The following is misleading in the context it appears in: "but the technology can also be exploited by spammers by impeding OCR detection of spam in images attached to email messages." These distorted messages that appear in spam messages are not an application of Captcha. The only way they are related to Captcha is that both use the same technology, so this comment - if it appears at all - belongs in a different section. —The preceding unsigned comment was added by Mahemoff (talkcontribs) 06:29, 7 December 2006 (UTC).[reply]

Statistics Regarding Deaf/Blind

The accessibility section contains several statistics regarding the deaf/blind. This strikes me as unnecessary and not useful to the article, since these statistics do not tell us anything about CAPTCHA itself. We know that deaf/blind people exist; the specific number of people in the UK who are both deaf and blind is not pertinent to CAPTCHA itself. Would anyone take issue with me deleting these lines from the article? --The preceding unsigned comment was added by Vonkwink (talkcontribs) 07:16, 6 January 2007 (UTC).

Is this a site used to solve CAPTCHA?

The article talks about a «[...] technique used consists of using a script to re-post the target site's CAPTCHA as a CAPTCHA to a site owned by the attacker, which unsuspecting humans visit and correctly solve within a short while for the script to use.» I found this site, http://mortgage-and-remortgage.info/, while surfing the Internet. I don't know, but is this site using unsuspecting human visitors to solve CAPTCHAs? If so, would it be of any interest to use the site in the article, for example with a screenshot? Would that be a copyright violation? Jayme 18:45, 6 January 2007 (UTC)

HTML encoded Captcha

I've just copy-edited a new article on HTML encoded Captcha, but it has only one primary source. I would suggest evaluating its content and, insofar as it is notable, merging it here. Tikiwont 14:03, 11 January 2007 (UTC)

Has anyone ever tried ascii Captcha (IE, like the example ascii art message at http://www.network-science.de/ascii/)? If different fonts are used, it would be hard to decode yet wouldn't take up much HTML, and it could be based in javascript. 24.107.235.184 21:07, 13 January 2007 (UTC)[reply]
I don't think it's notable (esp. given that it only refers to one implementation), and as a 'CAPTCHA' doesn't necessarily need to be image based, there's nothing fundamentally different about HEC. Also the implementation linked is very weak; see the discussion on /. and the link in the HEC page. Sounds like advertising to me. The idea that HTML is less OCR-able than JPG is totally flawed; it doesn't take long to printscreen. --The preceding unsigned comment was added by 194.80.176.253 (talk) 09:55, 14 January 2007 (UTC).

Spam image

I'm pretty sure that the purpose of the colored streaks in the spam image is not to defeat OCR, but to add some randomness to each picture so spam recognition software can't fingerprint and filter them. --CyBot 17:34, 14 January 2007 (UTC)[reply]

Both things are the same, at least from my view point.
They are not the same at all. While totally ineffective at stopping OCR, the obfuscation is perfect for avoiding, say, md5 hash based spam image checks. I would bet money on this being the motivation for them adding these streaks. --User24 03:06, 19 January 2007 (UTC)
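The hash point is easy to check. The Python sketch below uses only the standard `hashlib` module; the byte string stands in for raw image data, and `fingerprint`, `add_streak` and the blocklist are hypothetical names chosen for this illustration, not a description of any actual spam filter. Changing a single byte, as a random streak would, yields a different MD5 digest, so an exact-match blocklist no longer recognises the image.

```python
import hashlib


def fingerprint(image_bytes):
    """Exact-match fingerprint of the kind an MD5 blocklist would store."""
    return hashlib.md5(image_bytes).hexdigest()


def add_streak(image_bytes, position, value):
    """Return a copy with one byte changed, standing in for a random streak."""
    altered = bytearray(image_bytes)
    altered[position] = value
    return bytes(altered)


original = bytes(range(256)) * 4          # stand-in for raw pixel data
blocklist = {fingerprint(original)}       # the filter has seen this image once

variant = add_streak(original, 10, 255)   # byte 10 was 10, is now 255
print(fingerprint(original) in blocklist)  # True
print(fingerprint(variant) in blocklist)   # False
```

An OCR engine, by contrast, would read the same text from both images, which is why the two goals are not the same.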

Merge

This article was nominated for merge. I agree. Nothing more to say :) (I don't think an article is needed for every type of captcha.)

Nethac DIU, would never stop to talk here
18:41, 15 January 2007 (UTC)[reply]
We have a very fast bot. I ran into an edit conflict when adding my signature, 5 seconds later.

The page that has been proposed for merging into this one looks like an advertisement. ("no production website has ever used it.") Gazpacho 05:35, 17 January 2007 (UTC)[reply]