Introduction
The requests module is a popular Python library for sending HTTP requests and interacting with web APIs. Like any library, it can raise errors, and a common one is the MissingSchema error. This guide explains what causes the error and how to fix and handle it properly.
What is MissingSchema Error?
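The MissingSchema error is raised when the URL passed to requests does not include a scheme (protocol) such as http:// or https://.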
For example:
import requests
response = requests.get("www.example.com")
This will raise the error:
requests.exceptions.MissingSchema: Invalid URL 'www.example.com': No schema supplied.
Perhaps you meant 'http://www.example.com'?
Python requests requires the protocol to understand how to connect to the URL.
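To see what "no scheme" means in practice, the short sketch below uses the standard library's urlparse to split both forms of the URL. It only illustrates how the strings are parsed; it is not how requests itself is implemented.

```python
from urllib.parse import urlparse

# Parse the URL with and without a scheme and compare the parts.
bare = urlparse("www.example.com")
full = urlparse("http://www.example.com")

for label, parts in (("bare", bare), ("full", full)):
    print(f"{label}: scheme={parts.scheme!r}, netloc={parts.netloc!r}, path={parts.path!r}")

# bare: scheme='', netloc='', path='www.example.com'
# full: scheme='http', netloc='www.example.com', path=''
```

Without the scheme, the whole string is treated as a path, so there is no protocol and no host to connect to.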
When Does it Occur?
Some common cases when you may see this error:
The key point is that the URL string passed to requests is invalid or incomplete.
Impact of the Error
An unhandled MissingSchema exception stops the program at the failing request.
So it's important to fix and handle this error so that your program doesn't crash when it encounters such URLs.
Causes of MissingSchema
To understand how to fix this error, let's first see some of the common causes in detail:
Forgetting the Protocol
The most common reason for this error is simply missing the http:// or https:// prefix:
# Missing http://
url = "www.example.com"
This can happen accidentally while coding, especially when constructing URLs dynamically.
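One way to reduce this risk when constructing URLs dynamically is to assemble them from parts rather than by string concatenation. The sketch below is a minimal example using urlunsplit from the standard library. The build_url helper and its https default are assumptions made for illustration, not part of requests.

```python
from urllib.parse import urlunsplit


def build_url(host, path="/", scheme="https"):
    """Assemble a URL from its parts so the scheme is never left out.

    build_url is a hypothetical helper; the default scheme is an assumption.
    """
    if not path.startswith("/"):
        path = "/" + path
    # urlunsplit takes (scheme, netloc, path, query, fragment).
    return urlunsplit((scheme, host, path, "", ""))


print(build_url("www.example.com", "products"))
# https://www.example.com/products
```

Because the scheme is a required part of the tuple, it cannot be dropped by accident.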
Using a Relative URL
Another reason is trying to use a relative URL like /products without the site's base URL.
For example:
response = requests.get("/products")
Relative URLs are shortcuts for linking internal pages but won't work directly with requests.
Invalid URL String
Sometimes the URL string may be constructed incorrectly or contain errors like missing dots:
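url = "http://examplecom" # missing dot after example
This can happen when building URLs dynamically or via user input. Note that a URL like this one still has a scheme, so requests will typically fail with a connection error rather than MissingSchema.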
Fixing MissingSchema
Now let's see various ways to fix the MissingSchema error.
Check for Protocol in URL
The simplest fix is to ensure the URL contains http:// or https://:
if not url.startswith(("http://", "https://")):
    url = "http://" + url
Or better, use urlparse to check whether a scheme is present:
from urllib.parse import urlparse
if not urlparse(url).scheme:
    url = "http://" + url
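Both checks have edge cases. A bare prefix test against "http" also matches hosts such as httpbin.org, and depending on the Python version, urlparse may treat the part before the colon in a string like localhost:8080 as a scheme. The sketch below wraps the check in a small function that covers a few of these cases. The ensure_scheme name and its https default are assumptions for illustration.

```python
def ensure_scheme(url, default="https"):
    """Return url with a scheme, adding the default one if it is missing.

    ensure_scheme is a hypothetical helper, not part of requests.
    """
    url = url.strip()
    # Already has an explicit HTTP(S) scheme: leave it alone.
    if url.lower().startswith(("http://", "https://")):
        return url
    # Scheme-relative URL such as //cdn.example.com/x: add only the scheme.
    if url.startswith("//"):
        return f"{default}:{url}"
    return f"{default}://{url}"


print(ensure_scheme("www.example.com"))
# https://www.example.com
```

Matching on the full "http://" and "https://" prefixes keeps the check independent of how urlparse interprets colons.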
Handle Relative URLs
To handle relative URLs, we need to combine them with the base URL of the site:
from urllib.parse import urljoin
base_url = "http://example.com"
relative_url = "/page/2"
full_url = urljoin(base_url, relative_url)
# http://example.com/page/2
The urljoin function resolves the relative path against the base URL and returns a complete URL that requests can use.
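How urljoin resolves a path depends on leading and trailing slashes, which is easy to get wrong. The sketch below shows a few combinations. The example.com URLs are placeholders.

```python
from urllib.parse import urljoin

# A leading slash in the relative part replaces the whole base path.
rooted = urljoin("http://example.com/shop/", "/page/2")

# With a trailing slash on the base, the relative part is appended.
appended = urljoin("http://example.com/shop/", "page/2")

# Without the trailing slash, the last base segment ("shop") is replaced.
replaced = urljoin("http://example.com/shop", "page/2")

# An absolute URL is returned unchanged.
absolute = urljoin("http://example.com/shop/", "https://other.example/x")

print(rooted)    # http://example.com/page/2
print(appended)  # http://example.com/shop/page/2
print(replaced)  # http://example.com/page/2
print(absolute)  # https://other.example/x
```

If the base URL points to a directory, keeping its trailing slash avoids silently dropping the last path segment.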
Validate URL String
For user input URLs, we should validate them before sending the request:
from urllib.parse import urlparse
if not urlparse(url).netloc:
    # Invalid URL, raise error or log warning
    raise ValueError("Invalid URL")
This will catch most invalid URLs before they reach requests and prevent the MissingSchema error in the common case.
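A slightly stricter check looks at both the scheme and the host. The sketch below is one possible validator. The is_valid_http_url name and the choice to allow only http and https are assumptions. It checks the structure of the URL only, so a mistyped host such as examplecom still passes.

```python
from urllib.parse import urlparse

# Assumption: only HTTP and HTTPS URLs are acceptable for this application.
ALLOWED_SCHEMES = {"http", "https"}


def is_valid_http_url(url):
    """Return True if url has an allowed scheme and a non-empty host part."""
    try:
        parts = urlparse(url)
    except ValueError:
        # urlparse rejects some malformed inputs, e.g. a broken IPv6 host.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


for candidate in ("http://example.com", "www.example.com", "/products"):
    print(candidate, is_valid_http_url(candidate))
```

Returning a boolean leaves the decision about raising, logging, or repairing the URL to the caller.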
Handling MissingSchema Gracefully
In addition to fixing the error, it's also important to handle it gracefully using exceptions and logging.
Try-Except Blocks
We can use try-except blocks to handle MissingSchema exceptions:
import requests
try:
    response = requests.get(url)
except requests.exceptions.MissingSchema:
    print("Invalid URL")
# or handle specific cases:
try:
    response = requests.get(url)
except requests.exceptions.InvalidURL:
    raise ValueError("Invalid URL")
except requests.exceptions.MissingSchema:
    # No scheme supplied: prepend one and retry
    url = "http://" + url
    response = requests.get(url)
This prevents the program from crashing and lets us take appropriate action.
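The retry pattern can be packaged as a small function. The sketch below is a minimal version, assuming that retrying once with a default scheme is acceptable for your use case. The get_with_scheme_fallback name, the injectable get parameter, and the 10-second timeout are illustrative choices, not part of requests.

```python
import requests


def get_with_scheme_fallback(url, get=requests.get, default_scheme="https"):
    """Try the URL as given; on MissingSchema, retry once with a scheme added.

    Hypothetical helper. The get parameter lets tests pass in a stand-in
    for requests.get so that no network access is needed.
    """
    try:
        return get(url, timeout=10)
    except requests.exceptions.MissingSchema:
        # The URL had no scheme, so prepend the default one and retry.
        return get(f"{default_scheme}://{url}", timeout=10)
```

A call such as get_with_scheme_fallback("www.example.com") would then request https://www.example.com on the second attempt. Other failures, such as connection errors, still propagate to the caller.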
Raise Custom Exceptions
We can define custom exceptions to raise errors on specific conditions:
class InvalidURLError(Exception):
    pass
# Check URL with regex or urlparse
# (valid_url is a placeholder for your own validation function)
if not valid_url(url):
    raise InvalidURLError("Invalid URL: " + url)
This allows us to notify callers of our API about invalid URLs.
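A custom exception is more useful when it carries the offending URL and the reason it was rejected. The sketch below is one possible design. The check_url function, the attribute names, and the choice to subclass ValueError are assumptions for illustration.

```python
from urllib.parse import urlparse


class InvalidURLError(ValueError):
    """Raised when a URL fails validation; keeps the URL and the reason."""

    def __init__(self, url, reason):
        super().__init__(f"Invalid URL {url!r}: {reason}")
        self.url = url
        self.reason = reason


def check_url(url):
    """Return url unchanged if it has a scheme and a host, else raise."""
    parts = urlparse(url)
    if not parts.scheme:
        raise InvalidURLError(url, "missing scheme")
    if not parts.netloc:
        raise InvalidURLError(url, "missing host")
    return url


try:
    check_url("www.example.com")
except InvalidURLError as exc:
    print(exc.reason)  # missing scheme
```

Subclassing ValueError means existing callers that already catch ValueError keep working, while new callers can catch InvalidURLError specifically.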
Log Errors
Using the logging module, we can log MissingSchema errors:
import logging
import requests
logger = logging.getLogger(__name__)
try:
    response = requests.get(url)
except requests.exceptions.MissingSchema:
    logger.error("Invalid URL: %s", url)
The logs can be output to console or a file for analysis.
Preventing MissingSchema Errors
Prevention is better than cure, so let's look at some best practices:
Validate User Input URL
If your code takes a URL as user input, always validate it first:
from urllib.parse import urlparse
url = input("Enter URL: ")
if not urlparse(url).scheme:
    print("Invalid URL")
    raise SystemExit(1)
This ensures URLs without a scheme are not passed to requests.
Use Relative URLs Cautiously
Avoid relative URLs as much as possible. If you need to use them, always join them with the base URL first.
Standardize URL Handling
Have a standard function to handle URL validation and processing before sending requests. This avoids duplicate code and reduces errors.
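Such a function might combine the earlier steps: join relative paths with a base URL, add a missing scheme, and validate the result. The sketch below is one way to do it. The normalize_url name, the https default, and the rule that only http and https are accepted are all assumptions to adapt to your project.

```python
from urllib.parse import urljoin, urlparse


def normalize_url(url, base_url=None, default_scheme="https"):
    """Return a complete HTTP(S) URL or raise ValueError (hypothetical helper)."""
    url = url.strip()
    if not url:
        raise ValueError("URL is empty")
    if url.startswith("/") and not url.startswith("//"):
        # Relative path such as /page/2: it needs a base URL to resolve against.
        if base_url is None:
            raise ValueError(f"Relative URL {url!r} needs a base URL")
        url = urljoin(base_url, url)
    elif "://" not in url:
        # No scheme, e.g. www.example.com or //cdn.example.com/x.
        url = f"{default_scheme}://{url.lstrip('/')}"
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        raise ValueError(f"Invalid URL: {url!r}")
    return url


print(normalize_url("www.example.com"))
# https://www.example.com
print(normalize_url("/page/2", base_url="http://example.com"))
# http://example.com/page/2
```

Calling this one function before every request keeps the scheme and validation rules in a single place.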
Best Practices
Here are some overall best practices for avoiding MissingSchema errors:
Other Ways to Fix MissingSchema
Here are some other methods to try if you still face MissingSchema errors:
Related Errors and Issues
Here are some other requests errors and issues that are good to know about:
Debugging techniques:
Conclusion
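The MissingSchema error occurs when a URL passed to requests has no scheme such as http:// or https://. Checking for the scheme, joining relative URLs with a base URL, validating user input, and handling the exception with try-except blocks and logging keep your program from crashing on such URLs.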