Cache Filtering

In many cases you will want to choose what you want to cache instead of just caching everything. By default, all read-only (GET and HEAD) requests with a 200 response code are cached. A few options are available to modify this behavior.

Note

When using CachedSession, any requests that you don’t want to cache can also be made with a regular requests.Session object, or wrapper functions like requests.get(), etc.

Cached HTTP Methods

To cache additional HTTP methods, specify them with allowable_methods:

>>> session = CachedSession(allowable_methods=('GET', 'POST'))
>>> session.post('http://httpbin.org/post', json={'param': 'value'})

For example, some APIs use the POST method to request data via a JSON-formatted request body, for requests that may exceed the max size of a GET request. You may also want to cache POST requests to ensure you don’t send the exact same data multiple times.

Cached Status Codes

To cache additional status codes, specify them with allowable_codes

>>> session = CachedSession(allowable_codes=(200, 418))
>>> session.get('http://httpbin.org/teapot')

Cached URLs

You can use URL patterns to define an allowlist for selective caching, by using a expiration value of 0 (or requests_cache.DO_NOT_CACHE, to be more explicit) for non-matching request URLs:

>>> from requests_cache import DO_NOT_CACHE, CachedSession
>>> urls_expire_after = {
...     '*.site_1.com': 30,
...     'site_2.com/static': -1,
...     '*': DO_NOT_CACHE,
... }
>>> session = CachedSession(urls_expire_after=urls_expire_after)

Note that the catch-all rule above ('*') will behave the same as setting the session-level expiration to 0:

>>> urls_expire_after = {'*.site_1.com': 30, 'site_2.com/static': -1}
>>> session = CachedSession(urls_expire_after=urls_expire_after, expire_after=0)

Custom Cache Filtering

If you need more advanced behavior for choosing what to cache, you can provide a custom filtering function via the filter_fn param. This can by any function that takes a requests.Response object and returns a boolean indicating whether or not that response should be cached. It will be applied to both new responses (on write) and previously cached responses (on read):

Example code

>>> from sys import getsizeof
>>> from requests_cache import CachedSession

>>> def filter_by_size(response: Response) -> bool:
>>>     """Don't cache responses with a body over 1 MB"""
>>>     return getsizeof(response.content) <= 1024 * 1024

>>> session = CachedSession(filter_fn=filter_by_size)

Note

filter_fn() will be used in addition to other filtering options.