Accessing Protected Resources with urllib and Realm Authentication

When accessing protected web resources with Python's urllib module, you may encounter a "401 Unauthorized" error indicating that realm-based authentication is required. A realm is a named protection space on the server: the server names it in the WWW-Authenticate header of the 401 response, and the client is expected to supply credentials for it.

To gain access, urllib provides the HTTPPasswordMgrWithDefaultRealm class to handle sending credentials. Here is an example fetching a protected resource:

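import urllib.request

username = 'myusername'
password = 'mypassword'

# Store the credentials; None as the realm applies them to any realm under this URL
password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
password_mgr.add_password(None, "https://example.com/api", username, password)

# The handler answers 401 challenges using the stored credentials
handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(handler)

# Make the opener the default for all urllib.request.urlopen() calls
urllib.request.install_opener(opener)

# The response is closed automatically when the with block exits
with urllib.request.urlopen('https://example.com/api/protected-resource') as response:
    print(response.read())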

The key steps are:

Create an HTTPPasswordMgrWithDefaultRealm to store credentials
Add the username and password for a URL prefix using add_password(); passing None as the realm makes them the default for any realm at that URL
Create an HTTPBasicAuthHandler using the password manager
Build an opener using the handler so that 401 challenges are answered with the stored credentials
Install the opener as the default opener in urllib

Now any request made through urlopen() to a URL under the registered prefix is retried with the credentials automatically when the server responds with a 401 challenge.
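
Installing the opener changes the behavior of every later urlopen() call in the process. As a minimal sketch, the same handler chain can be kept local by calling the opener's open() method directly. The helper names build_auth_opener and fetch are hypothetical, introduced only for illustration, and the URL and credentials are the placeholders from the example above. The lookups at the end run offline and show which URLs the stored credentials apply to.

```python
import urllib.request


def build_auth_opener(base_url, username, password):
    """Hypothetical helper: build an opener that answers Basic auth challenges."""
    password_mgr = urllib.request.HTTPPasswordMgrWithDefaultRealm()
    # None as the realm: use these credentials for any realm under base_url.
    password_mgr.add_password(None, base_url, username, password)
    handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
    return urllib.request.build_opener(handler), password_mgr


def fetch(opener, url):
    """Hypothetical helper: fetch url through the opener without installing it."""
    with opener.open(url) as response:
        return response.read()


opener, password_mgr = build_auth_opener(
    "https://example.com/api", "myusername", "mypassword"
)

# Offline check of which URLs the credentials cover (no request is sent).
inside = password_mgr.find_user_password(
    "Some Realm", "https://example.com/api/protected-resource"
)
outside = password_mgr.find_user_password(
    "Some Realm", "https://example.com/other"
)
print(inside)
print(outside)
```

A call such as fetch(opener, 'https://example.com/api/protected-resource') would then use the credentials, while plain urllib.request.urlopen() calls elsewhere in the program stay unaffected.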

Some tips:

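The realm is a name chosen by the server and sent in the WWW-Authenticate header; the URI you pass to add_password() is usually, but not always, the root URL of the protected area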

You can specify the exact realm if known using add_password(realm, uri,...)
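
To find the exact realm, look at the WWW-Authenticate header of the 401 response. With urllib it is available on the raised urllib.error.HTTPError as err.headers["WWW-Authenticate"]. The sketch below works offline on a sample header value; the realm name "Restricted Area" is an assumption rather than something a real server sent, and parse_basic_realm is a hypothetical helper whose simple regular expression only handles the common quoted form.

```python
import re
import urllib.request


def parse_basic_realm(www_authenticate):
    """Hypothetical helper: extract the quoted realm from a WWW-Authenticate value."""
    match = re.search(r'realm="([^"]*)"', www_authenticate, re.IGNORECASE)
    return match.group(1) if match else None


# Assumed sample of what a server might send with its 401 response.
sample_header = 'Basic realm="Restricted Area"'
realm = parse_basic_realm(sample_header)

# A plain HTTPPasswordMgr has no default-realm fallback, so the realm must match.
password_mgr = urllib.request.HTTPPasswordMgr()
password_mgr.add_password(realm, "https://example.com/api", "myusername", "mypassword")

url = "https://example.com/api/protected-resource"
matching = password_mgr.find_user_password("Restricted Area", url)
other_realm = password_mgr.find_user_password("Another Realm", url)
print(realm)
print(matching)
print(other_realm)
```

Credentials registered this way are only offered when the server's challenge names the same realm, which is stricter than the None default used earlier.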

Using an opener allows transparent handling of authentication
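
HTTPBasicAuthHandler normally waits for a 401 challenge before sending credentials, which costs an extra round trip. If the credentials should go out with the first request, Python 3.5 and later offer HTTPPasswordMgrWithPriorAuth. Here is a minimal sketch, reusing the placeholder URL and credentials from the example above. Calling the handler's https_request hook by hand is done only to show the effect without a network request; an opener normally calls it for you.

```python
import urllib.request

url = "https://example.com/api/protected-resource"

password_mgr = urllib.request.HTTPPasswordMgrWithPriorAuth()
# is_authenticated=True marks the URL prefix as one to send credentials to up front.
password_mgr.add_password(
    None, "https://example.com/api", "myusername", "mypassword",
    is_authenticated=True,
)

handler = urllib.request.HTTPBasicAuthHandler(password_mgr)
opener = urllib.request.build_opener(handler)

# Offline illustration: run the handler's request hook on a Request object.
request = urllib.request.Request(url)
handler.https_request(request)
auth_header = request.get_header("Authorization")
print(auth_header)
```

Only send credentials up front to servers you trust, since they leave the client before the server has asked for them.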

To summarize, urllib provides the capability to access protected resources using realm-based authentication via the HTTPPasswordMgrWithDefaultRealm and HTTPBasicAuthHandler classes. Configuring these correctly takes a bit of trial-and-error to match the expected realm behavior of the server.
