Networks are unreliable. Servers fail. APIs go down temporarily. As developers, we've all experienced the frustration of HTTP requests failing at the worst possible time. Just when you need to process time-sensitive data or charge a customer, the API you rely on has an outage.
But with a robust retry mechanism, these inevitable failures don't have to ruin your application. By automatically retrying requests, you can vastly improve reliability despite flaky networks and services.
In this comprehensive guide, you'll learn how to retry failed requests in Python using the excellent Requests library. We'll cover:

- Why retries matter and the failure scenarios you should plan for
- Configuring automatic retries with Sessions and `HTTPAdapter`
- Building your own custom retry wrapper
- Tuning your retry strategy: attempt counts, backoff, and status codes
- Safely retrying different request types (GET, POST, PUT/PATCH, DELETE)
- Pitfalls, best practices, and common mistakes to avoid
To demonstrate each concept, we'll use practical code examples from real-world scenarios. By the end, you'll be able to incorporate robust request retries to handle errors and build resilient applications in Python. Let's get started!
Why Retry Failed Requests?
First, it helps to understand why retries are so crucial when working with remote APIs and services.
Distributed systems fail in complex ways. Here are just some common issues that can happen:

- DNS failures or dropped connections that prevent a request from ever reaching the server
- Connection and read timeouts when a server is slow or overloaded
- 5xx errors when the server hits an internal fault
- 429 responses when you exceed a rate limit
These failures occur frequently, especially when relying on external APIs. But in many cases, the issue is transient.
For example, say you're a ridesharing company that uses a payments API to charge customers. During peak hours a surge of rides hits, and the overloaded payments API starts timing out.
Without retries, you'll start seeing failed payments and angry customers! But if you retry the charge requests, there's a good chance the timeout was a temporary blip that will succeed on retry.
In short, retrying failed requests provides fault tolerance against the inherent unreliability of distributed systems. This prevents transient errors from affecting your application and improves reliability immensely.
Categorizing Request Failure Scenarios
To implement request retries effectively, you first need to understand the various types of failures that can occur. This allows you to customize your retry behavior accordingly.
There are two major categories:
Network Errors
These occur when the HTTP client cannot establish a connection to the server in the first place. Some examples include:

- DNS resolution failures
- Connections refused or reset by the server
- Connection timeouts
- SSL/TLS handshake errors
Often retrying on network errors is safe, as connectivity issues are usually intermittent.
HTTP Errors
Once a connection is established, the server may still return an HTTP error response:

- 5xx server errors such as 500, 502, 503, and 504
- 429 Too Many Requests when you hit a rate limit
- 4xx client errors such as 400, 401, and 404
5xx errors and 429 rate limiting specifically indicate transient server problems where a retry is appropriate.
However, for some 4xx client errors like 401 Unauthorized, retrying will likely fail again. So you may want to avoid retrying certain 4xx errors. We'll look at how to do this later.
Now that we understand the common failure scenarios, let's look at implementing robust retries in Python Requests!
Configuring Retries in Requests
The Requests library provides several options for retrying failed requests automatically.
The two main approaches are:
- Using Sessions with an `HTTPAdapter`
- Building your own custom retry wrapper
Let's explore these in detail.
Using Sessions and HTTPAdapter
Requests has the concept of a Session - a container that stores common settings across multiple requests.
This allows us to configure a retry strategy once that applies to all requests using that Session.
Here's a simple example:
```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

session = requests.Session()

retries = Retry(total=5,
                backoff_factor=1,
                status_forcelist=[502, 503, 504])

session.mount('https://', HTTPAdapter(max_retries=retries))
```
We create a `Session` object to hold our configuration. Next, we create a `Retry` object defining our retry strategy: up to 5 retries, an exponential backoff factor of 1, and retries on 502, 503, and 504 responses. Finally, we mount an `HTTPAdapter` with that retry configuration for all `https://` URLs.
Now any calls using this session will automatically retry up to 5 times on those 5xx errors:
```python
response = session.get('https://api.example.com/data')
```
The session handles the retries transparently - the calling code doesn't need to change at all.
Building a Custom Retry Wrapper
To implement a retry wrapper, we'll create a new function that wraps `requests.get()` and retries failed calls itself.
Here's a basic example:
```python
import requests
from requests.exceptions import RequestException

MAX_RETRIES = 5

def get_with_retries(url):
    response = None
    for i in range(MAX_RETRIES):
        try:
            response = requests.get(url)
            # Exit early if no error
            if response.status_code == 200:
                return response
        except RequestException:
            print(f'Request failed, retrying {i+1}/{MAX_RETRIES}...')
    # Return the last response (None if every attempt raised an exception)
    return response
```
We try making the request up to 5 times inside the for loop. If a call raises a `RequestException`, we log it and try again; if a request comes back with a 200 status code, we return the response immediately.
This gives us complete control over the retry logic. Later we'll see how to customize it further.
The retry wrapper approach requires more code, but allows handling edge cases the built-in `HTTPAdapter` retries can't.
Now let's look at how to configure retries for robustness.
Defining Your Retry Strategy
Simply enabling retries isn't enough - we need to tune them for our specific use case. Here are some key parameters to consider:
Number of Retries
How many times should a failed request be retried before giving up?
This depends on the API and the type of failures expected. For transient server issues, 3-5 retries is usually sufficient, and keeping the count low prevents retrying endlessly.
```python
# Retry up to 3 times
retries = Retry(total=3)
```
For more permanent errors like 400 Bad Request, retrying is pointless, so you may want to give up immediately rather than burn attempts.
Tuning this parameter balances reliability vs. not retrying excessively. Start low and increase as needed.
Delay Between Retries
To avoid hammering a server, it's good to add a delay between retries:
```python
# Add an increasing delay between retries
retries = Retry(backoff_factor=1)
```
The `backoff_factor` controls how long to wait between retry attempts.
The actual delay is calculated as:
```
{backoff factor} * (2 ** ({number of retries} - 1))
```

```python
# With backoff_factor = 1
# the retries will wait 1, 2, 4, 8, 16, 32, ... seconds
```
Starting with small backoff factors (0.1-1) prevents waiting too long. The delays will increase as retries continue to fail.
Backoff Algorithms
Two advanced backoff strategies are worth mentioning:

- Exponential backoff - doubling the delay after every failed attempt
- Jittered backoff - adding randomness to the delay so many clients don't retry in lockstep
These require custom wrappers to implement. But they can make your retries smarter.
Status Codes to Retry
By default, retries only happen on network errors before the request reaches the server. To retry on certain HTTP status codes, specify the `status_forcelist` parameter:
```python
retries = Retry(total=3,
                status_forcelist=[502, 503, 504])
```
Common transient server issues like 502, 503, and 504 are good candidates to retry.
429 Too Many Requests can indicate hitting a rate limit - retrying with backoff is recommended.
Some 4xx client errors may be retry-worthy depending on context. But avoid retrying errors like 400 Bad Request which are client mistakes.
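As a sketch, a `Retry` configuration that also retries 429s and honors the server's `Retry-After` header might look like this (`respect_retry_after_header` is a real urllib3 option that defaults to `True` in recent releases):

```python
from urllib3.util import Retry

# Retry rate limiting (429) as well as transient 5xx errors.
# respect_retry_after_header tells urllib3 to sleep for whatever the
# server's Retry-After header asks instead of the computed backoff.
retries = Retry(
    total=3,
    backoff_factor=1,
    status_forcelist=[429, 502, 503, 504],
    respect_retry_after_header=True,
)
```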
Dealing with Timeouts
Network timeouts are a special case - once a request times out, the server may still be processing the original request. Simply retrying could cause duplicate effects.
There are a couple of ways to handle this:

- Only retry methods that are safe to repeat automatically
- Add a backoff delay and, for non-idempotent requests, use idempotency keys so the server can detect duplicates
Idempotent requests (GET, PUT, DELETE) are safe to retry, but POST and others could duplicate. The backoff gives time for the original to finish before retrying.
We'll discuss idempotency more later. But dealing with timeouts properly ensures duplicate requests don't happen.
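One way to put this into practice with the `HTTPAdapter` approach is to restrict automatic retries to idempotent methods and always pass an explicit timeout. Here's a minimal sketch (the parameter is called `allowed_methods` in urllib3 1.26+; older releases called it `method_whitelist`):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

# Only retry methods that are safe to replay - a timed-out POST will
# never be resent automatically behind our back.
retries = Retry(
    total=3,
    backoff_factor=1,
    allowed_methods=frozenset(['GET', 'PUT', 'DELETE']),
)

session = requests.Session()
session.mount('https://', HTTPAdapter(max_retries=retries))

# An explicit timeout makes hung connections fail fast so the retry
# logic gets a chance to run at all.
response = session.get('https://api.example.com/data', timeout=5)
```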
Advanced Retry Customization
For more control, you can get creative with your custom retry wrapper. Here are some advanced strategies:
Retry Conditions
Instead of a simple catch-all `except` clause, you can inspect each response and decide whether to retry based on specific conditions.
For example, to retry on a 422 status code:
```python
if response.status_code == 422:
    retry()
```
This allows retrying on domain-specific transient failures beyond just 5xx.
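Inside a custom wrapper, that might look like the sketch below - the set of retryable status codes and the `get_with_custom_retries` name are just illustrative choices for your own logic:

```python
import time
import requests
from requests.exceptions import RequestException

# Status codes we treat as transient for this particular API (example values)
RETRYABLE_STATUSES = {422, 429, 502, 503, 504}
MAX_RETRIES = 5

def get_with_custom_retries(url):
    response = None
    for attempt in range(1, MAX_RETRIES + 1):
        try:
            response = requests.get(url)
            # Anything outside our retryable set is returned immediately
            if response.status_code not in RETRYABLE_STATUSES:
                return response
        except RequestException:
            pass
        time.sleep(attempt)  # simple increasing delay between attempts
    return response
```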
Conditional Retries
Certain responses may contain indicators that a retry is or isn't recommended:
```python
if 'shouldRetry' in response.headers:
    retry()

if 'doNotRetry' in response.json():
    return response
```
This gives you added flexibility beyond just status codes.
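Here's a rough sketch of how such signals could drive a custom retry loop - the `doNotRetry` body field is a hypothetical, API-specific flag, and we assume the API always returns a JSON body and sends `Retry-After` in seconds:

```python
import time
import requests

MAX_RETRIES = 3

def get_with_conditional_retries(url):
    response = None
    for attempt in range(MAX_RETRIES):
        response = requests.get(url)
        if response.ok:
            return response
        # Hypothetical API-specific flag telling clients not to bother retrying
        if 'doNotRetry' in response.json():
            return response
        # Honor the server's suggested delay, falling back to 1 second
        delay = int(response.headers.get('Retry-After', 1))
        time.sleep(delay)
    return response
```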
Request Hooks
Hooks allow attaching callbacks to different stages of a request.
We can use them to log retries or increment counters:
```python
def retry_hook(response, *args, **kwargs):
    # Called for every response the session receives
    print(f'Retrying {response.request.url}...')

session.hooks['response'] = [retry_hook]
```
This helps debug retries and gives visibility into how often they occur.
Exponential Backoff
Instead of fixed delays, backing off exponentially prevents hitting rate limits:
```python
import time

def get_backoff_time(attempt):
    # Exponential backoff: 2, 4, 8, 16, ... seconds for attempt 1, 2, 3, 4, ...
    return 2 ** attempt

time.sleep(get_backoff_time(attempt))  # attempt = failed attempts so far, starting at 1
```
Starting with 2 seconds, this waits 2, 4, 8, 16, 32... seconds between retries.
Jittered Retries
Adding a random "jitter" factor prevents multiple clients synchronizing retries:
```python
import random
import time

def get_jittered_backoff(attempt):
    delay = 2 ** attempt
    # Random multiplier so many clients don't all retry at the same moment
    jitter_factor = random.uniform(0.5, 1.5)
    return delay * jitter_factor

time.sleep(get_jittered_backoff(attempt))
```
This avoids sudden spikes of retries from many clients.
Handling Different Request Types
Retrying gets nuanced when we consider different HTTP request methods like POST, PUT, and DELETE.
Retrying GET Requests
GET requests are read-only and idempotent. This means they're safe to retry without side effects - each retry is identical to the first.
So it's usually fine to retry GETs without any special handling. Just watch for infinite loops due to programming errors.
Retrying POST Requests
Retrying POST requests can be dangerous - a duplicate POST may create duplicate resources.
To safely retry POSTs:

- Use idempotency keys so the server can detect and deduplicate repeated requests
- Add a backoff delay before each retry
- Don't blindly retry 4xx client errors
Idempotency keys are a technique where the client adds a unique key to ensure duplicate requests can be detected.
Adding a backoff delay also prevents duplicates by allowing time for the original to finish before retrying.
And 4xx errors like 400 Bad Request usually indicate a client mistake that warrants investigation before retrying blindly.
With proper caution, POSTs can usually be retried safely.
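For instance, here's a sketch of a POST retry wrapper using a client-generated idempotency key. The `Idempotency-Key` header name follows a common convention used by several payment APIs, but check what your API expects:

```python
import time
import uuid
import requests
from requests.exceptions import RequestException

MAX_RETRIES = 3

def post_with_idempotency_key(url, payload):
    # Reuse one key for every attempt so the server can deduplicate retries
    headers = {'Idempotency-Key': str(uuid.uuid4())}
    response = None
    for attempt in range(MAX_RETRIES):
        try:
            response = requests.post(url, json=payload, headers=headers)
            # 4xx client errors won't be fixed by retrying - return them as-is
            if response.status_code < 500:
                return response
        except RequestException:
            pass
        time.sleep(2 ** attempt)  # back off before the next attempt
    return response
```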
Retrying PUT/PATCH Requests
PUT requests are idempotent by specification, but PATCH requests may not be, and in practice update endpoints can still have side effects. Take similar precautions as with POST.
For example, a common pattern is to retry PUTs and PATCHes on 5xx errors but not 4xx.
Again taking care to prevent duplicates, these request types can also be retried in most cases.
Retrying DELETE Requests
DELETE requests can be retried safely as they are idempotent - a duplicate DELETE causes no harm.
The main risk is retrying a DELETE that already succeeded - at best the retry is wasted, and at worst it could remove a resource that was recreated in the meantime.
Checking the response status before retrying avoids this edge case:
```python
if response.status_code not in [200, 204]:
    retry_delete(response)
```
With status code checking, DELETEs can usually be retried freely.
Avoiding Problems with Retries
While powerful, misusing retries can also cause problems. Here are some pitfalls to avoid:
Infinite Retries
If there's a programming bug that causes each retry to fail, you can wind up in an infinite loop. Use a conservative maximum retry count so attempts are always bounded.
Overloading APIs
Too many rapid retries may overload APIs and worsen outages. Exponential backoff helps by adding delays between retries.
Duplicate Requests
As discussed above, take care to avoid duplicating side effects when retrying PUT, POST, DELETE requests.
Blocking Traffic
If traffic is blocked by a firewall misconfiguration, retrying endlessly won't fix it. At some point you'll want to fail and alert developers.
Disregarding Rate Limits
Ignoring 429 Too Many Requests and blindly retrying may get your access blocked completely. Backing off and honoring the server's rate limits helps, as discussed earlier.
Best Practices for Production
Here are some recommended best practices when implementing request retries:

- Set a conservative maximum retry count and use exponential backoff with jitter
- Only retry idempotent requests automatically; protect POSTs with idempotency keys
- Retry transient failures (network errors, 5xx, 429) and fail fast on client errors
- Honor 429 responses and rate limits rather than retrying blindly
- Set request timeouts so hung connections fail fast
- Log and monitor retries so you can spot failing dependencies and alert when retries are exhausted
Following these will help avoid common pitfalls and ensure your retries improve reliability.
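Putting several of these practices together, a reasonable starting point for production might look like the sketch below (tune the numbers for your own APIs; `allowed_methods` needs urllib3 1.26 or newer):

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util import Retry

def make_retrying_session():
    retries = Retry(
        total=3,                                # bounded attempts, never infinite
        backoff_factor=0.5,                     # exponential backoff between attempts
        status_forcelist=[429, 502, 503, 504],  # transient errors and rate limits
        allowed_methods=frozenset(['GET', 'PUT', 'DELETE']),  # idempotent methods only
    )
    adapter = HTTPAdapter(max_retries=retries)
    session = requests.Session()
    session.mount('https://', adapter)
    session.mount('http://', adapter)
    return session

session = make_retrying_session()
# Explicit timeout so hung connections fail fast and get retried
response = session.get('https://api.example.com/data', timeout=5)
```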
Common Mistakes to Avoid
To wrap up, here are some common mistakes that can undermine your retry logic:

- Retrying forever with no maximum attempt count
- Retrying non-idempotent requests like POST without safeguards against duplicates
- Retrying 4xx client errors that will never succeed
- Hammering a struggling server with no backoff delay
- Ignoring 429 responses and rate limits
- Swallowing failures silently instead of logging and monitoring retries
Carefully avoiding these blunders will set you up for success with request retries!