Networks are unreliable. Servers fail. APIs go down temporarily. As developers, we've all experienced the frustration of HTTP requests failing at the worst possible time. Just when you need to process time-sensitive data or charge a customer, the API you rely on has an outage.
But with a robust retry mechanism, these inevitable failures don't have to ruin your application. By automatically retrying requests, you can vastly improve reliability despite flaky networks and services.
In this comprehensive guide, you'll learn how to retry failed requests in Python using the excellent Requests library. We'll cover:

- Why retries matter and the failure scenarios you should plan for
- Configuring automatic retries with Sessions and `HTTPAdapter`
- Building your own custom retry wrapper
- Tuning your retry strategy: attempt counts, backoff, and status codes
- Safely retrying different request types (GET, POST, PUT/PATCH, DELETE)
- Pitfalls, best practices, and common mistakes to avoid
To demonstrate each concept, we'll use practical code examples from real-world scenarios. By the end, you'll be able to incorporate robust request retries to handle errors and build resilient applications in Python. Let's get started!
Why Retry Failed Requests?
First, it helps to understand why retries are so crucial when working with remote APIs and services.
Distributed systems fail in complex ways. Here are just some common issues that can happen:

- DNS failures or dropped connections that prevent a request from ever reaching the server
- Connection and read timeouts when a server is slow or overloaded
- 5xx errors when the server hits an internal fault
- 429 responses when you exceed a rate limit
These failures occur frequently, especially when relying on external APIs. But in many cases, the issue is transient.
For example, say you're a ridesharing company that uses a payments API to charge customers. During peak hours a surge of rides hits, and the overloaded payments API starts timing out.
Without retries, you'll start seeing failed payments and angry customers! But if you retry the charge requests, there's a good chance the timeout was a temporary blip that will succeed on retry.
In short, retrying failed requests provides fault tolerance against the inherent unreliability of distributed systems. This prevents transient errors from affecting your application and improves reliability immensely.
Categorizing Request Failure Scenarios
To implement request retries effectively, you first need to understand the various types of failures that can occur. This allows you to customize your retry behavior accordingly.
There are two major categories:
Network Errors
These occur when the HTTP client cannot establish a connection to the server in the first place. Some examples include:

- DNS resolution failures
- Connections refused or reset by the server
- Connection timeouts
- SSL/TLS handshake errors
Often retrying on network errors is safe, as connectivity issues are usually intermittent.
HTTP Errors
Once a connection is established, the server may still return an HTTP error response:

- 5xx server errors such as 500, 502, 503, and 504
- 429 Too Many Requests when you hit a rate limit
- 4xx client errors such as 400, 401, and 404
5xx errors and 429 rate limiting specifically indicate transient server problems where a retry is appropriate.
However, for some 4xx client errors like 401 Unauthorized, retrying will likely fail again. So you may want to avoid retrying certain 4xx errors. We'll look at how to do this later.
Now that we understand the common failure scenarios, let's look at implementing robust retries in Python Requests!
Configuring Retries in Requests
The Requests library provides several options for retrying failed requests automatically.
The two main approaches are:
- Using Sessions with an `HTTPAdapter`
- Building your own custom retry wrapper
Let's explore these in detail.
Using Sessions and HTTPAdapter
Requests has the concept of a Session - a container that stores common settings across multiple requests.
This allows us to configure a retry strategy once that applies to all requests using that Session.
Here's a simple example:
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

session = requests.Session()

retries = Retry(total=5,
                backoff_factor=1,
                status_forcelist=[502, 503, 504])

session.mount('https://', HTTPAdapter(max_retries=retries))
```
We create a `Session` object to hold our configuration. Next, we create a `Retry` object defining our retry strategy: up to 5 retries, an exponential backoff factor of 1, and retries on 502, 503, and 504 responses. Finally, we mount an `HTTPAdapter` with that retry configuration for all `https://` URLs.
Now any calls using this session will automatically retry up to 5 times on those 5xx errors:
```python
response = session.get('https://api.example.com/data')
```
The session handles the retries transparently - the calling code doesn't need to change at all.
Building a Custom Retry Wrapper
To implement a retry wrapper, we'll create a new function that wraps `requests.get()` and retries failed calls itself.
Here's a basic example:
```python
import requests
from requests.exceptions import RequestException

MAX_RETRIES = 5

def get_with_retries(url):
    response = None
    for i in range(MAX_RETRIES):
        try:
            response = requests.get(url)
            # Exit early if no error
            if response.status_code == 200:
                return response
        except RequestException:
            print(f'Request failed, retrying {i+1}/{MAX_RETRIES}...')
    # Return the last response (None if every attempt raised an exception)
    return response
```
We try making the request up to 5 times inside the for loop. If a call raises a `RequestException`, we log it and try again; if a request comes back with a 200 status code, we return the response immediately.
This gives us complete control over the retry logic. Later we'll see how to customize it further.
The retry wrapper approach requires more code, but allows handling edge cases the built-in `HTTPAdapter` retries can't.
Now let's look at how to configure retries for robustness.
Defining Your Retry Strategy
Simply enabling retries isn't enough - we need to tune them for our specific use case. Here are some key parameters to consider:
Number of Retries
How many times should a failed request be retried before giving up?
This depends on the API and the type of failures expected. For transient server issues, 3-5 retries is usually sufficient, and keeping the count low prevents retrying endlessly.
```python
# Retry up to 3 times
retries = Retry(total=3)
```
For more permanent errors like 400 Bad Request, retrying is pointless, so you may want to give up immediately rather than burn attempts.
Tuning this parameter balances reliability vs. not retrying excessively. Start low and increase as needed.
Delay Between Retries
To avoid hammering a server, it's good to add a delay between retries:
```python
# Add an increasing delay between retries
retries = Retry(backoff_factor=1)
```
The `backoff_factor` controls how long to wait between retry attempts.
The actual delay is calculated as:
```
{backoff factor} * (2 ** ({number of retries} - 1))
```

```python
# With backoff_factor = 1
# the retries will wait 1, 2, 4, 8, 16, 32, ... seconds
```
Starting with small backoff factors (0.1-1) prevents waiting too long. The delays will increase as retries continue to fail.
Backoff Algorithms
Two advanced backoff strategies are worth mentioning:

- Exponential backoff - doubling the delay after every failed attempt
- Jittered backoff - adding randomness to the delay so many clients don't retry in lockstep
These require custom wrappers to implement. But they can make your retries smarter.
Status Codes to Retry
By default, retries only happen on network errors before the request reaches the server. To retry on certain HTTP status codes, specify the `status_forcelist` parameter:
```python
retries = Retry(total=3,
                status_forcelist=[502, 503, 504])
```
Common transient server issues like 502, 503, and 504 are good candidates to retry.
429 Too Many Requests can indicate hitting a rate limit - retrying with backoff is recommended.
Some 4xx client errors may be retry-worthy depending on context. But avoid retrying errors like 400 Bad Request which are client mistakes.
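As a sketch, a `Retry` configuration that also retries 429s and honors the server's `Retry-After` header might look like this (`respect_retry_after_header` is a real urllib3 option that defaults to `True` in recent releases):

```python
from urllib3.util import Retry

# Retry rate limiting (429) as well as transient 5xx errors.
# respect_retry_after_header tells urllib3 to sleep for whatever the
# server's Retry-After header asks instead of the computed backoff.
retries = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 502, 503, 504],
    respect_retry_after_header=True,
)
```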
Dealing with Timeouts
Network timeouts are a special case - once a request times out, the server may still be processing the original request. Simply retrying could cause duplicate effects.
There are a couple of ways to handle this:

- Only retry methods that are safe to repeat automatically
- Add a backoff delay and, for non-idempotent requests, use idempotency keys so the server can detect duplicates
Idempotent requests (GET, PUT, DELETE) are safe to retry, but POST and others could duplicate. The backoff gives time for the original to finish before retrying.
We'll discuss idempotency more later. But dealing with timeouts properly ensures duplicate requests don't happen.
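One way to put this into practice with the `HTTPAdapter` approach is to restrict automatic retries to idempotent methods and always pass an explicit timeout. Here's a minimal sketch (the parameter is called `allowed_methods` in urllib3 1.26+; older releases called it `method_whitelist`):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# Only retry methods that are safe to replay - a timed-out POST will
# never be resent automatically behind our back.
retries = Retry(
    total=3,
    backoff_factor=1,
    allowed_methods=frozenset(['GET', 'PUT', 'DELETE']),
)

session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=retries))

# An explicit timeout makes hung connections fail fast so the retry
# logic gets a chance to run at all.
response = session.get('https://api.example.com/data', timeout=5)
```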
Advanced Retry Customization
For more control, you can get creative with your custom retry wrapper. Here are some advanced strategies:
Retry Conditions
Instead of a simple catch-all `except` clause, you can inspect each response and decide whether to retry based on specific conditions.
For example, to retry on a 422 status code:
```python
if response.status_code == 422:
    retry()
```
This allows retrying on domain-specific transient failures beyond just 5xx.
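Inside a custom wrapper, that might look like the sketch below - the set of retryable status codes and the `get_with_custom_retries` name are just illustrative choices for your own logic:

```python
import time
import requests
from requests.exceptions import RequestException

# Status codes we treat as transient for this particular API (example values)
RETRYABLE_STATUSES = {422, 429, 502, 503, 504}
MAX_RETRIES = 5

def get_with_custom_retries(url):
    response = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = requests.get(url)
            # Anything outside our retryable set is returned immediately
            if response.status_code not in RETRYABLE_STATUSES:
                return response
        except RequestException:
            pass
        time.sleep(attempt)  # simple increasing delay between attempts
    return response
```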
Conditional Retries
Certain responses may contain indicators that a retry is or isn't recommended:
```python
if 'shouldRetry' in response.headers:
    retry()

if 'doNotRetry' in response.json():
    return response
```
This gives you added flexibility beyond just status codes.
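Here's a rough sketch of how such signals could drive a custom retry loop - the `doNotRetry` body field is a hypothetical, API-specific flag, and we assume the API always returns a JSON body and sends `Retry-After` in seconds:

```python
import time
import requests

MAX_RETRIES = 3

def get_with_conditional_retries(url):
    response = None
    for attempt in range(MAX_RETRIES):
        response = requests.get(url)
        if response.ok:
            return response
        # Hypothetical API-specific flag telling clients not to bother retrying
        if 'doNotRetry' in response.json():
            return response
        # Honor the server's suggested delay, falling back to 1 second
        delay = int(response.headers.get('Retry-After', 1))
        time.sleep(delay)
    return response
```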
Request Hooks
Hooks allow attaching callbacks to different stages of a request.
We can use them to log retries or increment counters:
```python
def retry_hook(response, *args, **kwargs):
    # Called for every response the session receives
    print(f'Retrying {response.request.url}...')

session.hooks['response'] = [retry_hook]
```
This helps debug retries and gives visibility into how often they occur.
Exponential Backoff
Instead of fixed delays, backing off exponentially prevents hitting rate limits:
```python
import time

def get_backoff_time(attempt):
    # Exponential backoff: 2, 4, 8, 16, ... seconds for attempt 1, 2, 3, 4, ...
    return 2 ** attempt

time.sleep(get_backoff_time(attempt))  # attempt = failed attempts so far, starting at 1
```
Starting with 2 seconds, this waits 2, 4, 8, 16, 32... seconds between retries.
Jittered Retries
Adding a random "jitter" factor prevents multiple clients synchronizing retries:
```python
import random
import time

def get_jittered_backoff(attempt):
    delay = 2 ** attempt
    # Random multiplier so many clients don't all retry at the same moment
    jitter_factor = random.uniform(0.5, 1.5)
    return delay * jitter_factor

time.sleep(get_jittered_backoff(attempt))
```
This avoids sudden spikes of retries from many clients.
Handling Different Request Types
Retrying gets nuanced when we consider different HTTP request methods like POST, PUT, and DELETE.
Retrying GET Requests
GET requests are read-only and idempotent. This means they're safe to retry without side effects - each retry is identical to the first.
So it's usually fine to retry GETs without any special handling. Just watch for infinite loops due to programming errors.
Retrying POST Requests
Retrying POST requests can be dangerous - a duplicate POST may create duplicate resources.
To safely retry POSTs:

- Use idempotency keys so the server can detect and deduplicate repeated requests
- Add a backoff delay before each retry
- Don't blindly retry 4xx client errors
Idempotency keys are a technique where the client adds a unique key to ensure duplicate requests can be detected.
Adding a backoff delay also prevents duplicates by allowing time for the original to finish before retrying.
And 4xx errors like 400 Bad Request usually indicate a client mistake that warrants investigation before retrying blindly.
With proper caution, POSTs can usually be retried safely.
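For instance, here's a sketch of a POST retry wrapper using a client-generated idempotency key. The `Idempotency-Key` header name follows a common convention used by several payment APIs, but check what your API expects:

```python
import time
import uuid
import requests
from requests.exceptions import RequestException

MAX_RETRIES = 3

def post_with_idempotency_key(url, payload):
    # Reuse one key for every attempt so the server can deduplicate retries
    headers = {'Idempotency-Key': str(uuid.uuid4())}
    response = None
    for attempt in range(MAX_RETRIES):
        try:
            response = requests.post(url, json=payload, headers=headers)
            # 4xx client errors won't be fixed by retrying - return them as-is
            if response.status_code < 500:
                return response
        except RequestException:
            pass
        time.sleep(2 ** attempt)  # back off before the next attempt
    return response
```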
Retrying PUT/PATCH Requests
PUT requests are idempotent by specification, but PATCH requests may not be, and in practice update endpoints can still have side effects. Take similar precautions as with POST.
For example, a common pattern is to retry PUTs and PATCHes on 5xx errors but not 4xx.
Again taking care to prevent duplicates, these request types can also be retried in most cases.
Retrying DELETE Requests
DELETE requests can be retried safely as they are idempotent - a duplicate DELETE causes no harm.
The main risk is retrying a DELETE that already succeeded - at best the retry is wasted, and at worst it could remove a resource that was recreated in the meantime.
Checking the response status before retrying avoids this edge case:
```python
if response.status_code not in [200, 204]:
    retry_delete(response)
```
With status code checking, DELETEs can usually be retried freely.
Avoiding Problems with Retries
While powerful, misusing retries can also cause problems. Here are some pitfalls to avoid:
Infinite Retries
If there's a programming bug that causes each retry to fail, you can wind up in an infinite loop. Use a conservative maximum retry count so attempts are always bounded.
Overloading APIs
Too many rapid retries may overload APIs and worsen outages. Exponential backoff helps by adding delays between retries.
Duplicate Requests
As discussed above, take care to avoid duplicating side effects when retrying PUT, POST, DELETE requests.
Blocking Traffic
If traffic is blocked by a firewall misconfiguration, retrying endlessly won't fix it. At some point you'll want to fail and alert developers.
Disregarding Rate Limits
Ignoring 429 Too Many Requests and blindly retrying may get your access blocked completely. Backing off and honoring the server's rate limits helps, as discussed earlier.
Best Practices for Production
Here are some recommended best practices when implementing request retries:

- Set a conservative maximum retry count and use exponential backoff with jitter
- Only retry idempotent requests automatically; protect POSTs with idempotency keys
- Retry transient failures (network errors, 5xx, 429) and fail fast on client errors
- Honor 429 responses and rate limits rather than retrying blindly
- Set request timeouts so hung connections fail fast
- Log and monitor retries so you can spot failing dependencies and alert when retries are exhausted
Following these will help avoid common pitfalls and ensure your retries improve reliability.
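Putting several of these practices together, a reasonable starting point for production might look like the sketch below (tune the numbers for your own APIs; `allowed_methods` needs urllib3 1.26 or newer):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

def make_retrying_session():
    retries = Retry(
        total=3,                                # bounded attempts, never infinite
        backoff_factor=0.5,                     # exponential backoff between attempts
        status_forcelist=[429, 502, 503, 504],  # transient errors and rate limits
        allowed_methods=frozenset(['GET', 'PUT', 'DELETE']),  # idempotent methods only
    )
    adapter = HTTPAdapter(max_retries=retries)
    session = requests.Session()
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session

session = make_retrying_session()
# Explicit timeout so hung connections fail fast and get retried
response = session.get('https://api.example.com/data', timeout=5)
```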
Common Mistakes to Avoid
To wrap up, here are some common mistakes that can undermine your retry logic:

- Retrying forever with no maximum attempt count
- Retrying non-idempotent requests like POST without safeguards against duplicates
- Retrying 4xx client errors that will never succeed
- Hammering a struggling server with no backoff delay
- Ignoring 429 responses and rate limits
- Swallowing failures silently instead of logging and monitoring retries
Carefully avoiding these blunders will set you up for success with request retries!