
Storing response as JSON in Python requests-cache

I'm using requests-cache to cache HTTP responses in a human-readable format. I've patched requests using the filesystem backend, and set the serializer to json, like so:

import requests_cache
requests_cache.install_cache('example_cache', backend='filesystem', serializer='json')

The responses do get cached as JSON, but the response body is encoded (I guess using the cattrs library, as described here).

Is there a way to make requests-cache save responses as-is?

What you want to do makes sense, but it's a bit more complicated than it appears. The response files you see are representations of requests.Response objects. Response._content contains the original bytes received from the server. The wrapper methods and properties like Response.json() and Response.text will then attempt to decode that content. For a Response object to work correctly, it needs to have the original binary response body.
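To make the relationship between the raw bytes and the decoding helpers concrete, here is a minimal sketch. It builds a Response by hand instead of making a real request, so the body and the field values are made up for illustration; setting the private _content attribute directly is only done here to mimic what a server response would contain.

```python
from requests import Response

# Build a Response by hand; the body below is a made-up example
response = Response()
response.status_code = 200
response.encoding = 'utf-8'
response._content = b'{"message": "hello"}'  # raw bytes, as if received from a server

raw_bytes = response.content  # the original bytes, unchanged
as_text = response.text       # bytes decoded to str using response.encoding
as_json = response.json()     # text parsed as JSON

print(type(raw_bytes).__name__, type(as_text).__name__, type(as_json).__name__)
```

Both text and json() are derived from the bytes each time they are used, which is why the cache has to be able to restore those bytes exactly.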

When requests-cache serializes that response as JSON, the binary content is encoded in Base85. That's why you're seeing encoded bytes instead of JSON there. To have everything, including the response body, saved in JSON, there are a couple of options:
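As a rough illustration of why the cached body is unreadable, the sketch below runs a made-up JSON body through Base85 using the standard library. It assumes the b85encode variant from the base64 module; the exact call requests-cache makes internally is not confirmed here.

```python
import json
from base64 import b85decode, b85encode

# A made-up response body; any bytes would do
body = b'{"origin": "203.0.113.7"}'

# Base85 turns arbitrary bytes into ASCII text that is safe to embed in a JSON string
encoded = b85encode(body).decode('ascii')
print(encoded)  # readable characters, but no longer recognizable as JSON

# Decoding restores the exact original bytes
restored = b85decode(encoded)
print(json.loads(restored))
```

The encoding is lossless, so nothing is missing from the cache file; the body just isn't stored in a form you can read directly.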

Option 1

Make a custom serializer. If you want to be able to modify response content and have those changes reflected in responses returned by requests-cache, this would probably be the best way to do it.

This may become a bit convoluted, because you would have to:

  1. Handle response content that isn't valid JSON, and save it as encoded bytes instead
  2. During deserialization, if the content was saved as JSON, convert it back into bytes to recreate the original Response object

It's doable, though. I could try to come up with an example later, if needed.
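In the meantime, the two steps above can be sketched independently of requests-cache. This is not a complete serializer and is not wired into the library; encode_body, decode_body, and the 'format'/'body' keys are hypothetical names chosen for illustration. Note that a body stored as JSON is re-encoded on the way back, so its whitespace may differ from the original bytes even though it parses to the same data.

```python
import json
from base64 import b85decode, b85encode


def encode_body(content: bytes) -> dict:
    """Step 1: store the body as parsed JSON when possible, otherwise as Base85 text."""
    try:
        return {'format': 'json', 'body': json.loads(content.decode('utf-8'))}
    except (UnicodeDecodeError, json.JSONDecodeError):
        # Not valid UTF-8 or not valid JSON: fall back to encoded bytes
        return {'format': 'base85', 'body': b85encode(content).decode('ascii')}


def decode_body(stored: dict) -> bytes:
    """Step 2: rebuild the bytes that would go back into Response._content."""
    if stored['format'] == 'json':
        return json.dumps(stored['body']).encode('utf-8')
    return b85decode(stored['body'])


print(encode_body(b'{"a": 1}'))
print(encode_body(b'\xff\xfe\x00binary')['format'])
```

A real serializer would also need to apply this to the right field of the serialized response and leave the remaining fields (headers, status code, and so on) alone.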

Option 2

Make a custom backend. It could extend FileCache and FileDict, and copy valid JSON content to a separate file. Here is a working example:

import json
from os.path import splitext

from requests import Response
from requests_cache import CachedSession, FileCache, FileDict


class JSONFileCache(FileCache):
    """Filesystem backend that copies JSON-formatted response content into a separate file
    alongside the main response file
    """
    def __init__(self, cache_name, **kwargs):
        super().__init__(cache_name, **kwargs)
        self.responses = JSONFileDict(cache_name, **kwargs)



class JSONFileDict(FileDict):
    def __setitem__(self, key: str, value: Response):
        super().__setitem__(key, value)
        response_path = splitext(self._path(key))[0]
        json_path = f'{response_path}_content.json'

        # Will handle errors and skip writing if content can't be decoded as JSON
        with self._try_io(ignore_errors=True):
            content = json.dumps(value.json(), indent=2)
            with open(json_path, mode='w') as f:
                f.write(content)

Usage example:

custom_backend = JSONFileCache('example_cache', serializer='json')
session = CachedSession(backend=custom_backend)
session.get('https://httpbin.org/get')

After making a request, you will see a pair of files like:

example_cache/680f2a52944ee079.json
example_cache/680f2a52944ee079_content.json

That may not be exactly what you want, but it's the easiest option if you only need to read the response content and don't need to modify it.
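If reading is all you need, the copied files can be loaded with the standard library alone. The helper below is a hypothetical sketch (load_json_contents is not part of requests-cache); it assumes the `<key>_content.json` naming produced by the backend above.

```python
import json
from pathlib import Path


def load_json_contents(cache_dir: str) -> dict:
    """Map each cache key to the parsed content of its *_content.json file."""
    suffix = '_content.json'
    contents = {}
    for path in sorted(Path(cache_dir).glob(f'*{suffix}')):
        key = path.name[: -len(suffix)]  # strip the suffix to recover the cache key
        contents[key] = json.loads(path.read_text())
    return contents
```

For example, load_json_contents('example_cache') would return a dict keyed by values like 680f2a52944ee079. Edits made to these files are not picked up by requests-cache, since it still reads the main response file.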
