Advanced Usage

This section covers some more advanced and use-case-specific features.

Custom Response Filtering

If you need more advanced behavior for determining what to cache, you can provide a custom filtering function via the filter_fn param. This can by any function that takes a requests.Response object and returns a boolean indicating whether or not that response should be cached. It will be applied to both new responses (on write) and previously cached responses (on read). Example:

>>> from sys import getsizeof
>>> from requests_cache import CachedSession
>>>
>>> def filter_by_size(response):
>>>     """Don't cache responses with a body over 1 MB"""
>>>     return getsizeof(response.content) <= 1024 * 1024
>>>
>>>    session = CachedSession(filter_fn=filter_by_size)

Cache Inspection

Here are some ways to get additional information out of the cache session, backend, and responses:

Response Attributes

The following attributes are available on responses: * from_cache: indicates if the response came from the cache * created_at: datetime of when the cached response was created or last updated * expires: datetime after which the cached response will expire * is_expired: indicates if the cached response is expired (if an old response was returned due to a request error)

Examples:

>>> from requests_cache import CachedSession
>>> session = CachedSession(expire_after=timedelta(days=1))
>>> # Placeholders are added for non-cached responses
>>> r = session.get('http://httpbin.org/get')
>>> print(r.from_cache, r.created_at, r.expires, r.is_expired)
False None None None
>>> # Values will be populated for cached responses
>>> r = session.get('http://httpbin.org/get')
>>> print(r.from_cache, r.created_at, r.expires, r.is_expired)
True 2021-01-01 18:00:00 2021-01-02 18:00:00 False

Cache Contents

You can use CachedSession.cache.urls() to see all URLs currently in the cache:

>>> session = CachedSession()
>>> print(session.cache.urls)
['https://httpbin.org/get', 'https://httpbin.org/stream/100']

If needed, you can get more details on cached responses via CachedSession.cache.responses, which is a dict-like interface to the cache backend. See CachedResponse for a full list of attributes available.

For example, if you wanted to to see all URLs requested with a specific method:

>>> post_urls = [
>>>     response.url for response in session.cache.responses.values()
>>>     if response.request.method == 'POST'
>>> ]

You can also inspect CachedSession.cache.redirects, which maps redirect URLs to keys of the responses they redirect to.

Custom Backends

If the built-in Cache Backends don’t suit your needs, you can create your own by making subclasses of BaseCache and BaseStorage:

>>> from requests_cache import CachedSession
>>> from requests_cache.backends import BaseCache, BaseStorage
>>>
>>> class CustomCache(BaseCache):
...     """Wrapper for higher-level cache operations. In most cases, the only thing you need
...     to specify here is which storage class(es) to use.
...     """
...     def __init__(self, **kwargs):
...         super().__init__(**kwargs)
...         self.redirects = CustomStorage(**kwargs)
...         self.responses = CustomStorage(**kwargs)
>>>
>>> class CustomStorage(BaseStorage):
...     """Dict-like interface for lower-level backend storage operations"""
...     def __init__(self, **kwargs):
...         super().__init__(**kwargs)
...
...     def __getitem__(self, key):
...         pass
...
...     def __setitem__(self, key, value):
...         pass
...
...     def __delitem__(self, key):
...         pass
...
...     def __iter__(self):
...         pass
...
...     def __len__(self):
...         pass
...
...     def clear(self):
...         pass

You can then use your custom backend in a CachedSession with the backend parameter:

>>> session = CachedSession(backend=CustomCache())

Usage with other requests features

Request Hooks

Requests has an Event Hook system that can be used to add custom behavior into different parts of the request process. It can be used, for example, for request throttling:

>>> import time
>>> import requests
>>> from requests_cache import CachedSession
>>>
>>> def make_throttle_hook(timeout=1.0):
>>>     """Make a request hook function that adds a custom delay for non-cached requests"""
>>>     def hook(response, *args, **kwargs):
>>>         if not getattr(response, 'from_cache', False):
>>>             print('sleeping')
>>>             time.sleep(timeout)
>>>         return response
>>>     return hook
>>>
>>> session = CachedSession()
>>> session.hooks['response'].append(make_throttle_hook(0.1))
>>> # The first (real) request will have an added delay
>>> session.get('http://httpbin.org/get')
>>> session.get('http://httpbin.org/get')

Streaming Requests

If you use streaming requests, you can use the same code to iterate over both cached and non-cached requests. A cached request will, of course, have already been read, but will use a file-like object containing the content. Example:

>>> from requests_cache import CachedSession
>>>
>>> session = CachedSession()
>>> for i in range(2):
...     r = session.get('https://httpbin.org/stream/20', stream=True)
...     for chunk in r.iter_lines():
...         print(chunk.decode('utf-8'))

Usage with other requests-based libraries

This library works by patching and/or extending requests.Session. Many other libraries out there do the same thing, making it potentially difficult to combine them. For that scenario, a mixin class is provided, so you can create a custom class with behavior from multiple Session-modifying libraries:

>>> from requests import Session
>>> from requests_cache import CacheMixin
>>> from some_other_lib import SomeOtherMixin
>>>
>>> class CustomSession(CacheMixin, SomeOtherMixin ClientSession):
...     """Session class with features from both requests-html and requests-cache"""

Requests-HTML

Example with requests-html:

>>> import requests
>>> from requests_cache import CacheMixin, install_cache
>>> from requests_html import HTMLSession
>>>
>>> class CachedHTMLSession(CacheMixin, HTMLSession):
...     """Session with features from both CachedSession and HTMLSession"""
>>>
>>> session = CachedHTMLSession()
>>> r = session.get('https://github.com/')
>>> print(r.from_cache, r.html.links)

Or, using the monkey-patch method:

>>> install_cache(session_factory=CachedHTMLSession)
>>> r = requests.get('https://github.com/')
>>> print(r.from_cache, r.html.links)

The same approach can be used with other libraries that subclass requests.Session.

Requests-futures

Example with requests-futures:

Some libraries, including requests-futures, support wrapping an existing session object:

>>> session = FutureSession(session=CachedSession())

In this case, FutureSession must wrap CachedSession rather than the other way around, since FutureSession returns (as you might expect) futures rather than response objects. See issue #135 for more notes on this.

Requests-mock

Example with requests-mock:

Requests-mock works a bit differently. It has multiple methods of mocking requests, and the method most compatible with requests-cache is attaching its adapter to a CachedSession:

>>> import requests
>>> from requests_mock import Adapter
>>> from requests_cache import CachedSession
>>>
>>> # Set up a CachedSession that will make mock requests where it would normally make real requests
>>> adapter = Adapter()
>>> adapter.register_uri(
...     'GET',
...     'mock://some_test_url',
...     headers={'Content-Type': 'text/plain'},
...     text='mock response',
...     status_code=200,
... )
>>> session = CachedSession()
>>> session.mount('mock://', adapter)
>>>
>>> session.get('mock://some_test_url', text='mock_response')
>>> response = session.get('mock://some_test_url')
>>> print(response.text)

Internet Archive

Example with internetarchive:

Usage is the same as other libraries that subclass requests.Session:

>>> from requests_cache import CacheMixin
>>> from internetarchive.session import ArchiveSession
>>>
>>> class CachedArchiveSession(CacheMixin, ArchiveSession):
...     """Session with features from both CachedSession and ArchiveSession"""