60

How can I bypass the Google CAPTCHA using Selenium and Python?

When I try to scrape something, Google gives me a CAPTCHA. Can I bypass the Google CAPTCHA with Selenium in Python?

As an example, it's Google reCAPTCHA. You can see this CAPTCHA via this link: https://www.google.com/recaptcha/api2/demo

7
  • 19
    umm.. Then what's the point of a captcha? Nov 15, 2019 at 7:59
  • I think the only way if you want to bypass a captcha is to use someone else's service. You pass them your captcha, they return you the text. Nov 15, 2019 at 8:02
  • Sounds more do-able. I'm not going to try it. Probably find the coordinates of the checkbox element, send a click. Nov 15, 2019 at 8:17
  • However, how are you getting captchas in the first place? Some of your actions must have triggered google to think you are a robot. Nov 15, 2019 at 8:19
  • @HjSin I will improve that but can you please tell me how to bypass captcha
    – user11960891
    Nov 15, 2019 at 8:21

6 Answers

67

To start with, when using Selenium's Python client you should avoid trying to solve or bypass the Google CAPTCHA.


Selenium

Selenium automates browsers. What you do with that power is entirely up to you. It is primarily used to automate web applications through browser clients for testing purposes, although it is certainly not limited to that.


CAPTCHA

CAPTCHA, on the other hand (an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart"), is a type of challenge–response test used in computing to determine whether the user is human.

So Selenium and CAPTCHA serve two completely different purposes and ideally shouldn't be used for interrelated tasks.

Having said that, reCAPTCHA can easily inspect the network traffic and identify your program as a Selenium-driven bot.


Generic Solution

However, there are some generic approaches to avoid getting detected while web scraping:


This use case

However, in a couple of use cases we were able to interact with the reCAPTCHA using Selenium and you can find more details in the following discussions:


References

You can find a couple of related discussions in:


tl; dr

4
  • 3
    Could you please elaborate more on "the conventional Viewport"? What does it refer to? Sep 20, 2021 at 3:59
  • 2
    Changing my viewport to 100, 100 worked for me. Jul 1, 2022 at 18:52
  • I think HTTP cookies are important too. How you read it, keep it and when to clear them can help in some situations to avoid captchas Sep 16, 2022 at 8:36
  • waw... this is a nice step i will try to implement it right away! Thanks @undetectedSelenium you're my saviour!
    – gumuruh
    Feb 11, 2023 at 22:55
20

In order to bypass the CAPTCHA when scraping Google, you have to manually solve a CAPTCHA and export the cookies Google gives you. Now, every time you open a Selenium WebDriver, make sure you add the cookies you exported. The GOOGLE_ABUSE_EXEMPTION cookie is the one you're looking for, but I would save all cookies just to be on the safe side.

If you want an additional layer of stability in your scrapes, you should export several cookies and have your script randomly select one of them each time you ping Google.

These cookies have a long expiration date, so you won't need to get new cookies every day.

For help on saving and loading cookies in Python and Selenium, you should check out this answer: How to save and load cookies using Python + Selenium WebDriver
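As a rough illustration of the save-and-reload approach described in this answer, the sketch below stores all cookies of a session as JSON and loads them again later. It only relies on Selenium's `driver.get_cookies()` and `driver.add_cookie()`; the function names, the JSON file layout, and the idea of keeping several exports in one directory are assumptions made for this example, not something taken from the linked answer.

```python
import json
import random
from pathlib import Path


def save_cookies(driver, path):
    """Write every cookie of the current browser session to a JSON file."""
    Path(path).write_text(json.dumps(driver.get_cookies(), indent=2))


def load_cookies(driver, path):
    """Add the cookies stored in a JSON file to the current session.

    Selenium only accepts cookies for the domain of the page that is
    currently loaded, so call driver.get(...) on that site first.
    Returns the number of cookies that were added.
    """
    cookies = json.loads(Path(path).read_text())
    for cookie in cookies:
        driver.add_cookie(cookie)
    return len(cookies)


def pick_cookie_file(directory, rng=random):
    """Pick one exported cookie file at random (hypothetical layout:
    one *.json export per manually solved session)."""
    files = sorted(Path(directory).glob("*.json"))
    if not files:
        raise FileNotFoundError(f"no cookie exports found in {directory}")
    return rng.choice(files)
```

A typical flow would be to call `save_cookies(driver, "exports/session1.json")` once after solving the CAPTCHA by hand, and in later runs to open the site, call `load_cookies(driver, pick_cookie_file("exports"))`, and then reload the page so the cookies take effect.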

2
  • Hello, I'm new to using cookies; after finding the GOOGLE_ABUSE_EXEMPTION cookie, how do you use that specific cookie? Read the link you provided, but that seems like it saves the cookie from the previous session, rather than a specific cookie value that we already have
    – Yu Na
    Apr 26, 2020 at 2:59
  • Hi, Yu Na! The link shows how to save all cookies and then load them. I tried it in my code and it works like a charm. If there's a specific roadblock you're having, open up a new SO question with your code example and PM me the link so I can see if I can help :) Apr 27, 2020 at 15:12
3

First, clear the browsing history, cached data, cookies, and other site data. Then create a Google account in the browser window opened by Selenium and sign in to it:

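    // wd is an org.openqa.selenium.WebDriver; By and Keys come from org.openqa.selenium.
    // Replace "Email" and "Password" with the account's credentials.
    wd.get("https://accounts.google.com/signin/v2/identifier?hl=en&passive=true&continue=https%3A%2F%2Fwww.google.com%2F%3Fgws_rd%3Dssl&ec=GAZAmgQ&flowName=GlifWebSignIn&flowEntry=ServiceLogin");
    Thread.sleep(2000); // fixed waits give each page time to load
    wd.findElement(By.name("identifier")).sendKeys("Email" + Keys.ENTER);
    Thread.sleep(3000);
    wd.findElement(By.name("password")).sendKeys("Password" + Keys.ENTER);
    Thread.sleep(5000);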

Then open any website that uses reCAPTCHA and tick the checkbox using this code:

    // The reCAPTCHA checkbox lives inside an iframe, so switch into it first.
    String frameName = wd.findElement(By.tagName("iframe")).getAttribute("name");
    wd.switchTo().frame(frameName);
    wd.findElement(By.xpath("//span[@id='recaptcha-anchor']")).click();

You won't get any puzzles or other challenges.

1

Bypass as in solve it or bypass as in never get it at all?

To solve it:

  • Sign up with 2captcha, capmonster cloud, deathbycaptcha, etc. and follow their instructions. They will give you a token that you pass along with the form.

To never get it at all:

  • Make sure you have good IP reputation (most important for Cloudflare).
  • Make sure you have a good browser fingerprint (most important for Distil) - I recommend puppeteer + the stealth plugin.
1

OK, so there is a simple Python script that solves the CAPTCHA for you.

It basically reads the audio challenge, uses Google Assistant to convert it into text, and pastes the result.

It only works with audio CAPTCHAs, which are offered in most cases alongside the image challenge in reCAPTCHA v2.

https://github.com/ohyicong/recaptcha_v2_solver

Disclaimer!

I did not write this script. I had the idea of doing this myself but then found this project, so I thought I would share it to help others.

1
  • 2
    As it’s currently written, your answer is unclear. Please edit to add additional details that will help others understand how this addresses the question asked. You can find more information on how to write good answers in the help center.
    – Community Bot
    Jan 23, 2022 at 9:46
0

If you have access to the site's reCAPTCHA configuration, use the test keys that Google provides for automated tests:

SiteKey: 6LeIxAcTAAAAAJcZVRqyHh71UMIEGNQ_MXjiZKhI
SecretKey: 6LeIxAcTAAAAAGG-vFI1TnRWxMZNFuojJ4WifJWe

See: https://developers.google.com/recaptcha/docs/faq#id-like-to-run-automated-tests-with-recaptcha.-what-should-i-do
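If the site under test is your own, one way to wire in the test keys from this answer is to select them by environment. The sketch below is only an illustration: the environment variable names `APP_ENV`, `RECAPTCHA_SITE_KEY`, and `RECAPTCHA_SECRET_KEY` and the function `recaptcha_keys` are hypothetical and would need to match however your application actually reads its configuration.

```python
import os

# Test keys quoted in this answer, from Google's reCAPTCHA FAQ.
RECAPTCHA_TEST_SITE_KEY = "6LeIxAcTAAAAAJcZVRqyHh71UMIEGNQ_MXjiZKhI"
RECAPTCHA_TEST_SECRET_KEY = "6LeIxAcTAAAAAGG-vFI1TnRWxMZNFuojJ4WifJWe"


def recaptcha_keys(env=None):
    """Return (site_key, secret_key) for the current environment.

    APP_ENV, RECAPTCHA_SITE_KEY and RECAPTCHA_SECRET_KEY are hypothetical
    variable names chosen for this sketch.
    """
    env = os.environ if env is None else env
    if env.get("APP_ENV") == "test":
        # Automated test runs use the published test keys.
        return RECAPTCHA_TEST_SITE_KEY, RECAPTCHA_TEST_SECRET_KEY
    # Any other environment must supply its own real keys.
    return env["RECAPTCHA_SITE_KEY"], env["RECAPTCHA_SECRET_KEY"]
```

Keeping the switch in one place means the Selenium test suite never needs to interact with a real challenge, while production continues to use its own keys.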

1
  • While this link may answer the question, it is better to include the essential parts of the answer here and provide the link for reference. Link-only answers can become invalid if the linked page changes. - From Review
    – Ben A.
    Sep 1, 2023 at 23:08