Understanding and manipulating URLs is crucial for many Python programs that work with the web. The urllib.parse module provides useful functions for parsing, composing, and manipulating URLs in your Python code.
The Pieces of a URL
A URL like
The
Parsing URLs
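The urlparse() function splits a URL string into its components, which are available as attributes of the result: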
from urllib.parse import urlparse

url = 'https://www.example.com/path/to/page?key1=value1&key2=value2#Somewhere'
parsed = urlparse(url)

print(parsed.scheme)    # https
print(parsed.netloc)    # www.example.com
print(parsed.path)      # /path/to/page
print(parsed.query)     # key1=value1&key2=value2
print(parsed.fragment)  # Somewhere
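The query component comes back as a single raw string, and netloc may bundle a port with the host. As a minimal sketch of going one step further, the example below uses parse_qs() along with the hostname and port attributes of the parsed result. The URL, including its port and the repeated key2, is a hypothetical one chosen for illustration.

```python
from urllib.parse import urlparse, parse_qs

# Hypothetical URL for illustration: it adds a port and repeats key2.
url = 'https://www.example.com:8443/path/to/page?key1=value1&key2=value2&key2=other'
parsed = urlparse(url)

# netloc keeps host and port together; hostname and port separate them.
host = parsed.hostname  # lowercased host name without the port
port = parsed.port      # an int, or None when the URL has no port

# parse_qs maps each key to a list of values, because keys may repeat.
params = parse_qs(parsed.query)

print(host, port)  # www.example.com 8443
print(params)      # {'key1': ['value1'], 'key2': ['value2', 'other']}
```

Each value is a list even when a key appears only once, so a single value is read with something like params['key1'][0].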
There are also convenience methods like
Composing and Joining URLs
You can also compose or reconstruct a URL from its parsed components using urlunparse(), which takes a six-item sequence: scheme, netloc, path, params, query, and fragment.
from urllib.parse import urlunparse
# Use an empty string, not None, for a component that is absent (here, params).
data = ['https', 'www.example.com', '/path/to/page', '', 'key1=value1&key2=value2', 'Somewhere']
print(urlunparse(data))
# https://www.example.com/path/to/page?key1=value1&key2=value2#Somewhere
Because a URL can be split apart and reassembled, you can modify individual components programmatically.
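One way to sketch such a modification is to parse the URL, change the pieces you care about, and reassemble it. The result of urlparse() is a named tuple, so its _replace() method returns a copy with selected fields swapped. The new path and the page key below are hypothetical values used only for illustration.

```python
from urllib.parse import urlparse, urlunparse, parse_qs, urlencode

url = 'https://www.example.com/path/to/page?key1=value1&key2=value2#Somewhere'
parsed = urlparse(url)

# Decode the query, change one value, and add a hypothetical 'page' key.
query = parse_qs(parsed.query)
query['key2'] = ['changed']
query['page'] = ['2']

# _replace returns a modified copy; the original parsed result is unchanged.
updated = parsed._replace(
    path='/path/to/other',
    query=urlencode(query, doseq=True),  # doseq=True expands the value lists
    fragment='',                         # an empty string drops the fragment
)

new_url = urlunparse(updated)
print(new_url)
# https://www.example.com/path/to/other?key1=value1&key2=changed&page=2
```

Going through parse_qs() and urlencode() rather than editing the query string by hand means special characters in values are percent-encoded for you.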
The