Captcha challenges are a common headache when trying to automate web interactions using Selenium. Thankfully, anti-captcha services provide a straightforward way to bypass captcha protections programmatically. This guide will walk through the key steps to get around captchas using Python, Selenium, and Anti-Captcha.
Overview of Captcha and Anti-Captcha Providers
Captchas (Completely Automated Public Turing test to tell Computers and Humans Apart) are used on many websites to prevent bots and automated scripts from exploiting services. They typically require users to decipher and respond to visual prompts to verify they are human.
Anti-captcha services use real humans to solve captcha tests behind the scenes. For a small payment, they will interpret captcha images or audio prompts and return the correct solution to your code. Well-known anti-captcha services include Anti-Captcha, 2Captcha, and DeathByCaptcha.
Using anti-captcha services, you can automatically send captcha challenges to be solved and bypass the protections they enforce.
Retrieving the Captcha Site Key
The first step is to retrieve the site key or data-sitekey attribute from the captcha code on the page. This identifies the specific captcha for the anti-captcha service to target.
Using Selenium in Python, you can extract the captcha site key like this:
captcha_site_key = browser.find_element(By.XPATH, '//*[@id="recaptcha-demo"]').get_attribute('outerHTML')
cleaned_site_key = captcha_site_key.split('" data-callback')[0].split('data-sitekey="')[1]
The site key is parsed from the element's outer HTML and cleaned up using split() operations.
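The split() chain above only works when data-callback follows data-sitekey directly, so it breaks if the attribute order changes. As a minimal sketch, a regular expression can pull the attribute out regardless of its position. The name extract_site_key is a hypothetical helper, and the sample markup and key are made up for illustration.

```python
import re

# Hypothetical helper: pulls the data-sitekey value out of an element's outer HTML.
_SITE_KEY_PATTERN = re.compile(r'data-sitekey=["\']([^"\']+)["\']')


def extract_site_key(outer_html: str) -> str:
    """Return the data-sitekey attribute value, or raise if it is absent."""
    match = _SITE_KEY_PATTERN.search(outer_html)
    if match is None:
        raise ValueError("no data-sitekey attribute found in the element HTML")
    return match.group(1)


# Made-up markup for illustration; attribute order does not matter here.
sample_html = (
    '<div class="g-recaptcha" data-callback="onSuccess" '
    'data-sitekey="EXAMPLE_KEY_123" id="recaptcha-demo"></div>'
)
print(extract_site_key(sample_html))
```

When the located element carries the attribute itself, calling get_attribute('data-sitekey') on it should return the same value without any string parsing.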
Configuring the Anti-Captcha Client
Next, instantiate the anti-captcha solver, set your API key, the target website URL, and the cleaned site key:
solver = recaptchaV2Proxyless()
solver.set_verbose(1)
solver.set_key(os.environ["anticaptcha_api_key"])
solver.set_website_url(captcha_url)
solver.set_website_key(cleaned_site_key)
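Reading os.environ["anticaptcha_api_key"] directly raises a bare KeyError when the variable is not set. A minimal sketch of a friendlier lookup, assuming the same variable name as above, could look like this. The name load_api_key is a hypothetical helper and is not part of the anticaptchaofficial package.

```python
import os


def load_api_key(env_var: str = "anticaptcha_api_key") -> str:
    """Read the API key from the environment and fail with a clear message."""
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Anti-Captcha API key"
        )
    return api_key
```

The result would then be passed to the client as solver.set_key(load_api_key()).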
Solving Captcha and Inserting the Response
With the client configured, you can programmatically solve the captcha like this:
captcha_response = solver.solve_and_return_solution()
if captcha_response != 0:
    print("Captcha responded: " + captcha_response)
else:
    print("failed with error: " + solver.error_code)
If successful, inject the returned token into the page's hidden g-recaptcha-response field:
browser.execute_script('document.getElementById("g-recaptcha-response").innerHTML = arguments[0]', captcha_response)
This places the response in the appropriate form field to mimic human input.
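Because solve_and_return_solution() returns 0 on failure, a script may want to retry a few times before giving up. The following is a minimal sketch that assumes only the solver interface used in this section (solve_and_return_solution() and error_code). The names solve_with_retries and FakeSolver are hypothetical, the error string is only illustrative, and the fake stands in for the real client so the example runs without network access or an API key.

```python
class FakeSolver:
    """Stand-in for the real solver: fails a set number of times, then succeeds."""

    def __init__(self, failures_before_success: int):
        self.remaining_failures = failures_before_success
        self.error_code = ""

    def solve_and_return_solution(self):
        if self.remaining_failures > 0:
            self.remaining_failures -= 1
            self.error_code = "ERROR_NO_SLOT_AVAILABLE"  # illustrative value
            return 0
        return "example-token"


def solve_with_retries(solver, max_attempts: int = 3) -> str:
    """Call the solver until it returns a token or the attempts run out."""
    last_error = ""
    for attempt in range(1, max_attempts + 1):
        response = solver.solve_and_return_solution()
        if response != 0:
            return response
        last_error = solver.error_code
        print(f"attempt {attempt} failed with error: {last_error}")
    raise RuntimeError(
        f"captcha not solved after {max_attempts} attempts: {last_error}"
    )


print(solve_with_retries(FakeSolver(failures_before_success=1)))
```

Raising an exception once the attempts run out keeps the script from injecting a failed response into the form.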
Submitting the Captcha-Protected Form
Finally, you can locate and click the submit button to complete the captcha-protected form submission:
browser.find_element(By.XPATH, '//*[@id="recaptcha-demo-submit"]').click()
And that's it! The anti-captcha service will seamlessly solve the captcha challenges for you, enabling automated form submissions.
Here is the full code example with descriptive variable names:
import os
import time

from anticaptchaofficial.recaptchav2proxyless import recaptchaV2Proxyless
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from webdriver_manager.chrome import ChromeDriverManager

# Selenium 4 takes the driver path through a Service object
browser = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
captcha_url = "https://www.google.com/recaptcha/api2/demo"
browser.get(captcha_url)
time.sleep(10)  # give the page and the captcha widget time to load

# Parse the site key out of the captcha element's outer HTML
captcha_site_key = browser.find_element(By.XPATH, '//*[@id="recaptcha-demo"]').get_attribute('outerHTML')
cleaned_site_key = captcha_site_key.split('" data-callback')[0].split('data-sitekey="')[1]
print(cleaned_site_key)

# Configure the Anti-Captcha client
solver = recaptchaV2Proxyless()
solver.set_verbose(1)
solver.set_key(os.environ["anticaptcha_api_key"])
solver.set_website_url(captcha_url)
solver.set_website_key(cleaned_site_key)

# Ask the service to solve the captcha; a return value of 0 signals failure
captcha_response = solver.solve_and_return_solution()
if captcha_response != 0:
    print("Captcha responded: " + captcha_response)
else:
    print("failed with error: " + solver.error_code)
    browser.quit()
    raise SystemExit(1)  # stop here instead of submitting a failed response

# Reveal the hidden response field, fill it in, then hide it again
browser.execute_script('var element=document.getElementById("g-recaptcha-response"); element.style.display="";')
browser.execute_script("""document.getElementById("g-recaptcha-response").innerHTML = arguments[0]""", captcha_response)
browser.execute_script('var element=document.getElementById("g-recaptcha-response"); element.style.display="none";')

# Submit the form
browser.find_element(By.XPATH, '//*[@id="recaptcha-demo-submit"]').click()
time.sleep(10)
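The fixed time.sleep(10) calls in the script always wait the full ten seconds, even when the page is ready sooner. A minimal sketch of a polling alternative is shown here. The name wait_until is a hypothetical helper written for illustration, and Selenium's own WebDriverWait offers the same idea for browser conditions.

```python
import time


def wait_until(condition, timeout: float = 10.0, interval: float = 0.25):
    """Poll condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(interval)


# Illustration with a plain counter instead of a browser: succeeds on the third poll.
polls = {"count": 0}


def ready_after_three_polls():
    polls["count"] += 1
    return polls["count"] >= 3


print(wait_until(ready_after_three_polls, timeout=2.0, interval=0.01))
```

In the script, the condition could be a lambda that returns browser.find_elements(By.ID, "recaptcha-demo"), since that list stays empty (and therefore falsy) until the element exists.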
Using these techniques, you can leverage anti-captcha services to bypass captcha protections in your web automation scripts. The human solvers handle the challenges behind the scenes, removing the captcha obstacle.
Rather than building and managing your own captcha solving infrastructure, services like Proxies API handle all of this complexity for you.
With Proxies API, you make a simple API request with the target URL, and it returns the rendered HTML. There is no need to orchestrate the numerous steps required for reliable captcha solving.
For example:
curl "http://api.proxiesapi.com/?key=API_KEY&render=true&url=https://targetpage.com"
This takes care of all the headaches of automation. No proxies, browsers, or captcha solving services to manage.
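The same request URL can be built from Python. This is a minimal sketch using only the standard library, assuming the endpoint and the key, render, and url parameters shown in the curl command. The name build_proxiesapi_url is a hypothetical helper, and API_KEY is a placeholder.

```python
from urllib.parse import urlencode

API_ENDPOINT = "http://api.proxiesapi.com/"  # endpoint taken from the curl example


def build_proxiesapi_url(api_key: str, target_url: str, render: bool = True) -> str:
    """Build the request URL, percent-encoding the target URL as a query value."""
    params = {"key": api_key}
    if render:
        params["render"] = "true"
    params["url"] = target_url
    return API_ENDPOINT + "?" + urlencode(params)


request_url = build_proxiesapi_url("API_KEY", "https://targetpage.com")
print(request_url)
```

Percent-encoding matters when the target URL carries its own query string, since an unencoded & would otherwise be read as a separate parameter of the API request. The resulting URL can then be fetched with urllib.request.urlopen(request_url).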
Proxies API offers 1000 free API calls to get started. Check it out if you need to integrate robust captcha solving and proxy rotation in your projects.