What is reCAPTCHA
Have you ever been surfing the internet when you come across one of these boxes that says: "I'm not a robot."?
So you check the box and go on your way. But how the heck does this box know whether you're a robot or not and why does it matter? Well, to answer that, we actually have to start with these:
They're called CAPTCHAs: Completely Automated Public Turing Test to tell Computers and Humans Apart.
They were invented in 2003 by Luis von Ahn and his team of researchers at Carnegie Mellon University.
The whole point of these distorted pieces of text was to stop spam on the internet,
like preventing scalpers from writing a computer program that buys every ticket in a fraction of a second.
They worked because humans could read the distorted text, yet computers and bots couldn't.
So if you want to stop bots from buying concert tickets or setting up email addresses,
we just have to make filling out a CAPTCHA part of the process.
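To make "part of the process" concrete, here is a minimal sketch of a ticket purchase that is gated behind a solved CAPTCHA. Everything in it is an assumption for illustration: the function names `issue_challenge` and `buy_ticket`, the in-memory store, and the return strings are invented, and a real site would render the answer as a distorted image rather than pass it around as text.

```python
import hmac
import secrets

# Hypothetical in-memory store: challenge id -> expected answer.
_pending = {}

def issue_challenge(answer: str) -> str:
    """Register a challenge whose correct answer is `answer`; return its id."""
    challenge_id = secrets.token_hex(8)
    _pending[challenge_id] = answer
    return challenge_id

def buy_ticket(challenge_id: str, typed_text: str) -> str:
    """Sell a ticket only if the CAPTCHA for `challenge_id` was solved."""
    # pop() makes each challenge single-use, so a solved one can't be replayed.
    expected = _pending.pop(challenge_id, None)
    if expected is None:
        return "rejected: unknown or already used challenge"
    typed = typed_text.strip().lower().encode()
    if not hmac.compare_digest(expected.lower().encode(), typed):
        return "rejected: wrong answer"
    return "ticket sold"
```

Because each challenge can only be used once, a script can't solve one CAPTCHA and then reuse it to buy every ticket.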
Fast forward, and millions of CAPTCHAs were being solved every single day by internet users. Von Ahn started to think: can we do something useful with all this great power?
[And the answer to that is, yes, and this is what we're doing now.]
So they decided to use that brainpower to digitize every single physical book we have.
The way to do that is to take real physical books, scan them, and then use optical character recognition (OCR)
software to translate the words into digital text. What they did was take any words that were too hard for the computer to decipher and
upload them into the reCAPTCHA database.
So going forward, instead of showing random distorted text,
CAPTCHAs started to show words from books that computers couldn't understand. When enough people solving these CAPTCHAs
typed the same word for a given piece of text, that word would be confirmed and uploaded to an ebook database.
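The "enough people wrote the same word" step can be sketched as a simple vote count. This is only an illustration of the idea, not the actual reCAPTCHA algorithm: the function name `confirm_word` and the threshold of three matching answers are assumptions.

```python
from collections import Counter
from typing import List, Optional

def confirm_word(transcriptions: List[str], min_votes: int = 3) -> Optional[str]:
    """Return the agreed transcription once enough solvers typed the same word."""
    # Normalise so "Morning" and " morning" count as the same answer.
    votes = Counter(t.strip().lower() for t in transcriptions if t.strip())
    if not votes:
        return None
    word, count = votes.most_common(1)[0]
    # Not enough agreement yet: keep showing the word to more people.
    return word if count >= min_votes else None
```

For example, `confirm_word(["morning", "Morning ", "moming", "morning"])` has three matching answers and one misreading, so it returns `"morning"`, while a word with only two differing guesses stays unconfirmed.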
Von Ahn called this project reCAPTCHA. Their slogan was "stop spam, read books".
At this point, a hundred million reCAPTCHAs were being solved every day, the equivalent of 2.5 million books a year.
So Google was like: "let's acquire reCAPTCHA".
And they did, in 2009, and they used that brainpower to digitize all the New York Times archives since
the 1800s, as well as all of Google Books.
And when they ran out of those, Google started giving people street numbers from Google Street View to help label Google Maps.
So everything worked out happily ever after. But not really, because there were a couple of problems.
The first is that even though reCAPTCHAs worked, they weren't very accessible, so blind people had a much harder time
filling out forms and signing up for things on the internet.
So they made audio reCAPTCHAs as well, which sounded something like this:
[Six]
[Four]
[Zero]
[Nine]
But regardless, reCAPTCHAs became a burden for people with dyslexia, poor hearing, poor sight, as well as other sensory impairments.
The other problem was that paid services started popping up that solved CAPTCHAs for you.
The services worked by taking your CAPTCHAs and shipping them off to CAPTCHA farms in third-world countries,
where workers were paid dirt cheap to solve your CAPTCHAs and ship them back to you, the client.
And the last problem, which is perhaps the most important, was that
computer vision technology was becoming so good that bots were starting to solve these CAPTCHAs and get through.
So engineers got to thinking: "Why not make CAPTCHAs harder to solve?"
So they made CAPTCHAs with more twists and turns, added some noise, and threw in random lines,
but as time went on the technology caught up and bots were once again getting through.
So Google decided to do some research, and they found that humans got these complex CAPTCHAs right
only about 33 percent of the time, while
their advanced computer technology at Google was getting them right 99.8 percent of the time.
Shoot, that computer vision technology was on a
So Google decided to change things. They got rid of the distorted text CAPTCHA and they came up with this:
And they called it
"No CAPTCHA reCAPTCHA". When you click it, it sends an HTTP request to Google with a whole bunch of useful information:
things like your IP address, your country, and a timestamp;
information from your browser, such as the way you moved your cursor just moments before clicking the checkbox;
how you were scrolling the page before the click; the time interval between different browser events; and many other
variables that Google keeps secret.
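To picture what such a request might carry, here is a rough sketch that bundles the signals named above into a JSON body. It is purely hypothetical: the real request format is not public, and the function name `build_signal_payload` and every field name in it are made up for illustration.

```python
import json
import time

def build_signal_payload(ip, country, cursor_path, scroll_events, event_times):
    """Bundle the signals named above into one JSON body (field names are made up)."""
    # Gaps between consecutive browser events, in seconds.
    intervals = [round(b - a, 3) for a, b in zip(event_times, event_times[1:])]
    return json.dumps({
        "ip": ip,
        "country": country,
        "timestamp": int(time.time()),
        "cursor_path": cursor_path,      # [x, y] points recorded before the click
        "scroll_events": scroll_events,  # scroll offsets recorded before the click
        "event_intervals": intervals,
    })
```

A page script would send a body like this along with the click, so the decision can be made on Google's side rather than in the browser.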
All these criteria are then processed by a machine
learning risk analysis engine at Google, and most of the time that information is enough to tell the difference between a human and a bot.
But if the risk analysis engine still isn't sure, a small percentage of users will have to complete an additional challenge:
an image recognition CAPTCHA, something like picking all the images with a sandwich in them.
And if you prove that you're a human once this way,
then chances are Google's engine will remember. Next time, after clicking that checkbox, you'll be able to pass right through with ease.
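The flow described above (score the click, let most people through, fall back to an image challenge when unsure, and remember past proof of humanity) can be sketched as a toy scoring function. This is not how Google's engine works: the real one is a secret machine learning model, and the `Signals` fields, weights, and threshold below are all invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Signals:
    cursor_points: int         # cursor positions recorded before the click
    seconds_on_page: float     # time between page load and the click
    previously_verified: bool  # solved an image challenge before

def risk_score(s: Signals) -> float:
    """Toy score in [0, 1]; higher means more bot-like. Weights are invented."""
    score = 0.0
    if s.cursor_points < 5:      # almost no mouse movement
        score += 0.5
    if s.seconds_on_page < 1.0:  # clicked faster than a person plausibly would
        score += 0.4
    if s.previously_verified:    # remembered as human from an earlier challenge
        score -= 0.3
    return min(1.0, max(0.0, score))

def decide(s: Signals, threshold: float = 0.3) -> str:
    """Let low-risk clicks through; send everyone else to an image challenge."""
    return "pass" if risk_score(s) < threshold else "image_challenge"
```

With these made-up weights, `decide(Signals(40, 6.5, False))` gives `"pass"`, while an instant click with no mouse movement, `decide(Signals(0, 0.2, False))`, gives `"image_challenge"`.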
Thanks for reading! If you enjoyed it, then upvote, resteem, and follow!
Thankyouthankyouthankyou -- I had wondered how the hell just ticking a box counted as a CAPTCHA but never thought to actually do the proper research. Great work. Have an upvote and a resteem.
Actually, the mechanics of that are kind of scary, aren't they? Google is just ... vacuuming up so much information about us.
Thanks for the upvote and resteem!
And yes, it's pretty scary that Google is vacuuming up so much information about us.
Thank you! I've actually always wanted to know more about captchas and it was more interesting even than I realized. I guess I must be mistaken for a bot fairly often because I get those additional captchas all the time and I can't stand them! What I don't like is it's usually something like "select all the slides with street signs in them" and there are usually 2-3 slides that have just a tiny edge of the street sign sticking into them. It's often very small and subtle and I don't know whether they expect me to select those slides or not. I always do, and that must be considered wrong at least sometimes because they will then serve me a second captcha and even a third! Sometimes I just give up on whatever I was trying to view because the annoyance quotient now outweighs the reward.
That's so cool and interesting and I had no idea!