Ready to level up your Python requests skills? Setting the user agent is one of the most important things you need to do to make your requests look legit. In this guide, we'll cover everything you need to know about user agents in requests to help you become a pro!
What's a User Agent?
A user agent is a string that identifies the application, browser, and operating system making a request to a web server. Here's an example:
Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36
This tells the server that the request comes from version 74 of the Chrome browser running on 64-bit Windows 10.
By default, Python requests identifies itself in the user agent:
python-requests/2.22.0
That's a dead giveaway that you're not a real browser! Many sites block or throttle obvious bot traffic, so we need to set a real browser user agent.
Setting a User Agent in Requests
Setting a user agent in requests is simple - just pass it as a header. Here's an example:
import requests
headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'}
response = requests.get('https://www.website.com', headers=headers)
This will make your request look like it's coming from a desktop Chrome browser.
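If you want to confirm which user agent will actually go out, one option is to build the request without sending it and inspect the result. This is a minimal sketch that assumes the same placeholder URL as above; it runs offline because nothing is transmitted.

```python
import requests

# Browser-style user agent string from the example above
headers = {
    'User-Agent': (
        'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
        '(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
    )
}

# Build the request without sending it, so the headers can be inspected offline
request = requests.Request('GET', 'https://www.website.com', headers=headers)
prepared = request.prepare()

sent_user_agent = prepared.headers['User-Agent']
print(sent_user_agent)
```

For a request that has already been sent, the same information is available on response.request.headers.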
For convenience, you can also set headers at the session level so they are applied to all requests from that session:
session = requests.Session()
session.headers.update({'User-Agent': 'Mozilla/5.0...'})
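Session-level headers act as defaults, and headers passed to an individual request take precedence for that call only. The sketch below illustrates this with prepared requests, so nothing is sent; the override value is a made-up placeholder.

```python
import requests

session = requests.Session()
# Session-wide default; 'Mozilla/5.0...' is the truncated placeholder from above
session.headers.update({'User-Agent': 'Mozilla/5.0...'})

# A request that relies on the session default
default_req = session.prepare_request(
    requests.Request('GET', 'https://www.website.com')
)

# A request that overrides the user agent for this one call only
override_req = session.prepare_request(
    requests.Request(
        'GET',
        'https://www.website.com',
        headers={'User-Agent': 'my-override-agent'},  # hypothetical value
    )
)

print(default_req.headers['User-Agent'])
print(override_req.headers['User-Agent'])
# The session default itself is unchanged by the override
print(session.headers['User-Agent'])
```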
Picking Random User Agents
Using the same user agent for all your requests makes your traffic easy to detect. A better technique is to pick user agents randomly from a list to appear more human.
Start by compiling a list of various desktop and mobile user agents. You can easily find these online.
Then in your code, choose one randomly for each request:
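import random
import requests

# Pool of user agent strings to rotate through (truncated here)
user_agents = [
    'Mozilla/5.0...',
    'Mozilla/5.0...',
    # ...add more user agent strings here
]
url = 'https://www.website.com'

# Pick a user agent at random for this request
user_agent = random.choice(user_agents)
headers = {'User-Agent': user_agent}
response = requests.get(url, headers=headers)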
This makes your requests appear to come from a mix of devices and browsers rather than a single client.
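When you fetch many pages, it can help to wrap the random choice in a small helper so each request gets its own pick. The following is one possible sketch: the pool reuses user agent strings that appear elsewhere in this guide, and random_headers and fetch_all are hypothetical helper names, not part of requests.

```python
import random
import requests

# Example pool built from user agent strings quoted in this guide
USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36',
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) '
    'Gecko/20100101 Firefox/110.0',
    'Mozilla/5.0 (Linux; Android 8.0.0; SM-G930F) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/75.0.3770.101 Mobile Safari/537.36',
]


def random_headers(pool=USER_AGENTS):
    """Return a fresh headers dict with a randomly chosen user agent."""
    return {'User-Agent': random.choice(pool)}


def fetch_all(urls, session=None):
    """Fetch each URL, choosing a new random user agent per request."""
    session = session or requests.Session()
    return [
        session.get(url, headers=random_headers(), timeout=10)
        for url in urls
    ]


# Building headers does not touch the network
print(random_headers())
```

Calling fetch_all(['https://www.website.com']) would send one request per URL, each with its own randomly selected user agent.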
Beyond User Agent - Full Headers
Sophisticated bot detection looks beyond just the user agent to determine if a request is automated. Real browsers send additional headers that identify the platform, accepted encodings, languages, and more.
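curl's verbose mode prints the request headers it sends (the lines prefixed with >), which makes it a quick way to see what a request looks like on the wire. Note that this shows curl's own headers, not a browser's, so to mimic a real browser, copy the request headers from the Network tab of its developer tools. The simplified example below sets three headers by hand:
$ curl -v --http1.1 -A 'chrome' -H 'Accept: text/html' -H 'Accept-Language: en-US' https://www.website.com 2>&1 | grep '^>'
> GET / HTTP/1.1
> Host: www.website.com
> User-Agent: chrome
> Accept: text/html
> Accept-Language: en-US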
Then insert these headers into your requests to appear more legitimate:
headers = {
    'User-Agent': 'chrome',
    'Accept': 'text/html',
    'Accept-Language': 'en-US',
}
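Copying headers by hand gets error-prone once there are more than a few. As a rough sketch, captured "Name: value" lines can be turned into a dict with a small parser; parse_header_lines is a hypothetical helper written for this example, and it skips the Host header on the assumption that requests will derive it from the URL.

```python
def parse_header_lines(raw):
    """Turn 'Name: value' lines (optionally prefixed with '>') into a dict."""
    headers = {}
    for line in raw.splitlines():
        line = line.strip()
        if line.startswith('>'):
            line = line[1:].strip()
        name, sep, value = line.partition(':')
        # Skip blank lines and the request line (e.g. 'GET / HTTP/1.1')
        if not sep or ' ' in name:
            continue
        # requests sets Host from the URL, so there is no need to copy it
        if name.lower() == 'host':
            continue
        headers[name.strip()] = value.strip()
    return headers


captured = """> GET / HTTP/1.1
> Host: www.website.com
> User-Agent: chrome
> Accept: text/html
> Accept-Language: en-US"""

headers = parse_header_lines(captured)
print(headers)
```

The resulting dict can be passed straight to the headers argument of requests.get.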
The Session Approach
Dealing with headers on every request can get tedious. Sessions allow us to set headers just once and have them applied to all requests from that session.
session = requests.Session()
# update() merges into the session's default headers instead of replacing them
session.headers.update({
    'User-Agent': 'chrome',
    'Accept': 'text/html',
    'Accept-Language': 'en-US',
})
response = session.get('https://www.website.com')
Much cleaner! And it keeps cookies between requests as well.
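One detail worth knowing: a new Session already carries a few default headers, and how you set your own decides whether those survive. The sketch below compares merging with update() against assigning a plain dict; it only inspects the session objects and sends nothing.

```python
import requests

custom = {
    'User-Agent': 'chrome',
    'Accept': 'text/html',
    'Accept-Language': 'en-US',
}

# Option 1: merge into the session's existing headers
merged = requests.Session()
merged.headers.update(custom)

# Option 2: replace the headers object outright with a plain dict
replaced = requests.Session()
replaced.headers = dict(custom)

# update() keeps defaults that requests sets itself, such as Accept-Encoding
print('Accept-Encoding' in merged.headers)
print('Accept-Encoding' in replaced.headers)

# The merged mapping also stays case-insensitive
print(merged.headers['user-agent'])
```

Unless you specifically want to drop the defaults, update() is usually the safer choice.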
I like to create a Session with headers, cookies, and other settings configured for each site I'm scraping:
# Session for website A
session_a = requests.Session()
session_a.headers.update({'User-Agent': 'chrome'})
# Session for website B
session_b = requests.Session()
session_b.headers.update({'User-Agent': 'firefox'})
This approach makes it easy to customize each scraper.
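If you end up with several of these, a small factory function keeps the setup in one place. This is just one way to structure it; make_session is a hypothetical helper, and the 'chrome' and 'firefox' values are the same placeholders used above.

```python
import requests


def make_session(user_agent, extra_headers=None):
    """Create a Session preconfigured with a user agent and optional headers."""
    session = requests.Session()
    session.headers.update({'User-Agent': user_agent})
    if extra_headers:
        session.headers.update(extra_headers)
    return session


# One session per site, each with its own settings
session_a = make_session('chrome')
session_b = make_session('firefox', {'Accept-Language': 'en-US'})

print(session_a.headers['User-Agent'])
print(session_b.headers['User-Agent'])
```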
Pro Tips and Tricks
Here are some pro tips I've picked up for mastering user agents with Python requests:
FAQ
What is the default user agent for Python Requests?
The default user agent for Python Requests is something like "python-requests/2.26.0", where the number is the installed version of the library. You can access it via requests.utils.default_user_agent():
import requests
print(requests.utils.default_user_agent())
How do I change or set a custom user agent in Python Requests?
You can set a custom user agent by passing the headers parameter with a User-Agent key:
import requests
url = 'https://www.example.com'
custom_user_agent = 'My User Agent'
response = requests.get(url, headers={'User-Agent': custom_user_agent})
How can I spoof a user agent in Python Requests?
To spoof a user agent like a browser, device or bot, simply set the User-Agent header to that client's user agent string:
headers = {'User-Agent': 'Mozilla/5.0 (Linux; Android 8.0.0; SM-G930F) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/75.0.3770.101 Mobile Safari/537.36'}
response = requests.get(url, headers=headers)
What are some common user agents I can spoof in Python Requests?
Some common user agents to spoof:
How do I set a mobile user agent in Python Requests?
Use a user agent string from a mobile browser like Safari iOS or Chrome Android.
mobile_ua = 'Mozilla/5.0 (iPhone; CPU iPhone OS 13_3 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) CriOS/80.0.3987.95 Mobile/15E148 Safari/604.1'
What is the purpose of setting a user agent in Python Requests?
Setting a user agent can help mimic a browser or device to get past blocks on certain user agents. It also helps web servers identify the client.
What are best practices around spoofing user agents?
Avoid spoofing user agents for unethical purposes. Only modify user agents for testing or if required for access.
How do I set a browser user agent like Chrome or Firefox in Python Requests?
Use the browser's user agent string. For example:
firefox_ua = 'Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/110.0'
How can I detect and save user agents from requests using Python Requests?
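The response.request.headers attribute holds the headers that were sent with the request, including the user agent:
import requests
url = 'https://www.example.com'
response = requests.get(url)
# Headers of the request that produced this response
user_agent = response.request.headers['User-Agent']
# Append the user agent to a log file, one per line
with open('user_agents.txt', 'a') as f:
    f.write(user_agent + '\n')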
How do user agents work with authentication in Python Requests?
User agents are sent as normal headers even with authentication. Just add the headers parameter alongside the auth parameter.
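As an illustration, the sketch below combines a custom user agent with HTTP basic authentication and inspects the prepared request instead of sending it. The URL and credentials are placeholders.

```python
import requests

request = requests.Request(
    'GET',
    'https://www.example.com',
    headers={'User-Agent': 'My User Agent'},
    auth=('user', 'pass'),  # placeholder credentials
)
prepared = request.prepare()

# Both headers are present on the outgoing request
print(prepared.headers['User-Agent'])
print(prepared.headers['Authorization'].split()[0])
```

The same two arguments work on requests.get and on session methods.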
How can I use a proxy and set a user agent in Python Requests?
Pass the proxy URL to the proxies parameter and the user agent to the headers parameter:
proxies = {'http': 'http://10.10.1.10:3128'}
headers = {'User-Agent': 'Mozilla/5.0...'}
response = requests.get(url, proxies=proxies, headers=headers)
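Both settings can also live on a session so they apply to every request. The sketch below assumes the same example proxy address for both schemes; the proxies mapping is keyed by URL scheme, so https URLs need their own entry.

```python
import requests

session = requests.Session()

# Proxy per URL scheme; the address is the example value from above
session.proxies.update({
    'http': 'http://10.10.1.10:3128',
    'https': 'http://10.10.1.10:3128',  # assumption: same proxy handles https
})

# 'Mozilla/5.0...' is the truncated placeholder, not a complete user agent
session.headers.update({'User-Agent': 'Mozilla/5.0...'})

print(session.proxies)
print(session.headers['User-Agent'])
```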
What is the difference between a user agent and other headers in Python Requests?
The user agent provides info about the client while other headers like