The urllib.request module in Python 3 provides a simple way to fetch data from websites over HTTP and HTTPS. The key function is urllib.request.urlopen(), which opens a URL and returns a file-like response object you can read from.
Opening and Reading URL Contents
To open a URL, just pass the URL string to urllib.request.urlopen():
import urllib.request

with urllib.request.urlopen('http://example.com') as response:
    html = response.read()
This opens the URL and returns a file-like response object (an http.client.HTTPResponse); its read() method returns the body as bytes. The with statement ensures the connection is closed when you are done.
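Because read() returns bytes, you will usually want to decode the body into a string. A minimal sketch, using example.com as a stand-in URL and fetch_text as a hypothetical helper name:

```python
import urllib.request

def fetch_text(url):
    """Fetch a URL and decode the response body to text."""
    with urllib.request.urlopen(url) as response:
        # read() returns bytes; decode with the charset the server
        # declares, falling back to UTF-8 when none is given
        charset = response.headers.get_content_charset() or 'utf-8'
        return response.read().decode(charset)
```

Calling fetch_text('http://example.com') would then return the page as a str rather than bytes.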
You can also inspect the response itself: the status code, the headers, and the reason phrase:
import urllib.request

with urllib.request.urlopen('http://python.org') as response:
    print(response.status)        # 200 means success
    print(response.getheaders())  # list of (name, value) header tuples
    print(response.msg)           # reason phrase, e.g. 'OK'
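The headers object also behaves like a case-insensitive mapping (it is an http.client.HTTPMessage), so individual headers can be looked up directly. A small sketch, with content_type as a hypothetical helper name:

```python
import urllib.request

def content_type(url):
    """Return the Content-Type header of a URL's response, or None if absent."""
    with urllib.request.urlopen(url) as response:
        # header lookups are case-insensitive: 'content-type',
        # 'Content-Type', and 'CONTENT-TYPE' all match the same header
        return response.headers['content-type']
```

This is usually more convenient than scanning the list returned by getheaders().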
Handling Errors
If a URL can't be reached or the server returns an error status, urlopen() raises an exception from urllib.error: HTTPError when the server responds with an error code, and URLError for lower-level failures such as an unknown hostname. You can catch these to handle failures gracefully:
import urllib.request
import urllib.error

try:
    response = urllib.request.urlopen('http://wrong.url')
except urllib.error.HTTPError as e:
    # the server responded, but with an error status
    print(e.code)    # HTTP status code, e.g. 404
    print(e.read())  # error response body
except urllib.error.URLError as e:
    # the request never reached a server, e.g. a bad hostname
    print(e.reason)
This lets you inspect the status code and the error body and handle issues cleanly. Note that HTTPError is raised only when a server actually responds with an error status; network-level failures raise URLError instead, and since HTTPError is a subclass of URLError, the more specific exception should be caught first.
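The same pattern can be wrapped in a small helper that turns both kinds of failure into ordinary return values instead of exceptions. A sketch, with fetch as a hypothetical name (not part of the library):

```python
import urllib.request
import urllib.error

def fetch(url):
    """Return (status, body) on success, or (code or None, reason) on failure."""
    try:
        with urllib.request.urlopen(url) as response:
            return response.status, response.read()
    except urllib.error.HTTPError as e:
        # the server responded, but with an error status (404, 500, ...)
        return e.code, e.read()
    except urllib.error.URLError as e:
        # the request never completed: bad hostname, refused connection, ...
        return None, str(e.reason)
```

Callers can then branch on the returned status instead of wrapping every request in try/except.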
Sending Data to Websites
You can also use urllib.request to send data to a server. Encode the form fields with urllib.parse.urlencode(), convert the result to bytes, and pass it as the data argument of a Request; supplying data makes urlopen() issue a POST request:
import urllib.parse
import urllib.request

url = 'http://www.example.com/search'
data = urllib.parse.urlencode({'q': 'python'})
data = data.encode('utf-8')  # POST data must be bytes
req = urllib.request.Request(url, data=data)
with urllib.request.urlopen(req) as response:
    print(response.read())
The Request object bundles the URL and the encoded body together; because data is supplied, urlopen() sends it as the body of a POST request with a Content-Type of application/x-www-form-urlencoded.
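For a GET request, by contrast, the encoded parameters go into the URL itself rather than the request body. A sketch using the same hypothetical search endpoint:

```python
import urllib.parse

# build a query string and append it to the URL
params = urllib.parse.urlencode({'q': 'python', 'page': 2})
url = 'http://www.example.com/search?' + params
print(url)  # http://www.example.com/search?q=python&page=2
```

urlencode() takes care of percent-escaping, so values containing spaces or special characters are safe to include.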
So in summary, urllib.request lets you open URLs, read response bodies and headers, handle failures through urllib.error, and send data with POST requests, all from the standard library with no third-party dependencies.