Making HTTP requests is a fundamental task in most Python applications, yet with only the standard library it can be more complicated than it needs to be. In this article, we'll trace the progression of HTTP client libraries in Python, from the low-level urllib2 to the streamlined requests, and compare their APIs and use cases to help you pick the right tool for your next project.
urllib2 - Python's Default HTTP Client
Throughout the Python 2 series, urllib2 shipped in the standard library as the default HTTP client, covering everything from basic requests to redirects, authentication, and cookie handling.
For example, here's how to make a simple GET request with urllib2:
import urllib2  # Python 2 only; reorganized into urllib.request in Python 3

response = urllib2.urlopen('http://example.com')
html = response.read()  # response body as raw bytes
While full-featured, the API is clunky and involves error-prone boilerplate code even for basic operations:
import urllib2

request = urllib2.Request('http://example.com')
request.add_header('User-Agent', 'My Python App')

try:
    response = urllib2.urlopen(request)
except urllib2.HTTPError as e:
    print(e.code)    # server responded with a 4xx/5xx status
except urllib2.URLError as e:
    print(e.reason)  # network-level failure (DNS error, refused connection, ...)
else:
    print(response.read())
So what are some downsides to using urllib2?

- A verbose, class-based API, even for simple one-off requests
- Manual encoding of query strings and POST bodies, using a helper that lives in the separate urllib module
- HTTP error statuses surface as exceptions you must catch and unpack yourself
- No connection pooling, sessions, or built-in JSON support

Overall, it requires significant effort to use urllib2 effectively for anything beyond trivial requests.
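For instance, even a basic form POST forces you to pull in a second module just to encode the body. Here's a minimal sketch (the URL and form fields are placeholders):

import urllib
import urllib2

# The POST body must be percent-encoded by hand, with a helper from a different module
data = urllib.urlencode({'name': 'Alice', 'lang': 'python'})

# Supplying a data argument turns the request into a POST
request = urllib2.Request('http://example.com/form', data)
response = urllib2.urlopen(request)
print(response.read())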
Enter urllib - A Minor Improvement
In Python 3, urllib2 was removed and its functionality folded into the urllib package, split across the urllib.request, urllib.error, and urllib.parse modules.
Here's an example HTTP GET request with urllib:
from urllib import request

response = request.urlopen('http://example.com')
html = response.read()  # still raw bytes; call .decode() to get a str
While the API is slightly cleaner, it still shares many of the same problems as urllib2:

- Response bodies are raw bytes that you decode yourself
- Request bodies and query strings still need manual encoding
- 4xx/5xx statuses still raise exceptions rather than returning a response you can inspect
- Still no connection pooling, sessions, or JSON helpers

So urllib is only a small incremental improvement over urllib2.
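For example, posting JSON with urllib still means serializing, encoding, and decoding everything yourself. A minimal sketch, with a placeholder URL and payload:

import json
from urllib import request

# The JSON body must be serialized and encoded to bytes by hand
payload = json.dumps({'name': 'Alice'}).encode('utf-8')

req = request.Request(
    'http://example.com/api',
    data=payload,
    headers={'Content-Type': 'application/json'},
    method='POST',
)
response = request.urlopen(req)

# ...and the raw bytes of the response decoded back the same way
data = json.loads(response.read().decode('utf-8'))
print(data)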
requests - A Simple Yet Powerful Library
Released in 2011, the third-party requests library (created by Kenneth Reitz under the tagline "HTTP for Humans") rethinks the API from the user's point of view. A simple GET request takes three lines:
import requests

response = requests.get('http://example.com')
print(response.text)  # body already decoded to a str
Compared to urllib and urllib2, some key advantages of requests include:
Simplified API - Intuitive functions like requests.get(), requests.post(), and requests.put() that map one-to-one onto HTTP verbs.
Built-in JSON support - Automatic JSON encoding/decoding with the json= keyword argument and the response.json() method (see the sketch after this list).
Connection pooling/sessions - requests.Session reuses underlying TCP connections and persists headers and cookies across requests.
Clean error handling - response.raise_for_status() turns 4xx/5xx statuses into tidy Python exceptions when you want them, rather than forcing a try/except on every call.
Helper features - Easy access to response headers and body, cookie handling, per-request timeouts, and retries via transport adapters.
Third-party ecosystem - Plays nicely with extensions that add caching, alternative authentication schemes, and more.
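To make the JSON and session points concrete, here's a minimal sketch (the URL and payload are placeholders):

import requests

# A Session reuses TCP connections and carries shared headers and cookies across calls
session = requests.Session()
session.headers.update({'User-Agent': 'My App'})

# json= serializes the dict and sets the Content-Type header automatically
response = session.post('http://example.com/api', json={'name': 'Alice'}, timeout=5)
response.raise_for_status()
print(response.json())  # parse the JSON response body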
While requests doesn't expose the same low-level details as urllib, it's a perfect fit for most API work. Thanks to its simplicity and feature set, requests has become the de facto standard for HTTP in Python.
Here's a more in-depth example with error handling, headers, and JSON data:
import requests

url = 'http://example.com/api'
headers = {'User-Agent': 'My App'}

try:
    response = requests.get(url, headers=headers, timeout=5)
    response.raise_for_status()  # raise an HTTPError for 4xx/5xx status codes
except requests.exceptions.HTTPError as e:
    print(e)  # server returned an error status
except requests.exceptions.RequestException as e:
    print(e)  # connection problem, timeout, etc.
else:
    data = response.json()  # decode the JSON response body
    print(data)
So in summary, what are some good use cases for each library?
For the vast majority of HTTP work in Python, requests is the right choice. urllib still earns its keep when you need fine-grained, low-level control or want to avoid a third-party dependency, and urllib2 is only relevant for maintaining legacy Python 2 code.