
Marshmallow: mapping string db column to dict field

I have the following database table in SQL Server:

COLUMN_NAME              ORDINAL_POSITION DATA_TYPE       
----------------------------------------------------
id                       1 varchar
collection_id            2 varchar
created_at               3 datetimeoffset  
mimetype                 4 varchar
size                     5 int
file_hash                6 varchar
storageKey               7 varchar
extra_data               8 varchar

This is mapped to a Marshmallow Schema in a Flask app like so:

from marshmallow import Schema, fields


class ImageDisplaySchema(Schema):
    id = fields.Str()
    created_at = fields.Str()
    collection_id = fields.Str()
    mimetype = fields.Str()
    size = fields.Int()
    extra_data = fields.Dict()

I'm struggling to find the right pre_dump / pre_load helpers to serialize/deserialize the extra_data column, which is saved in the database as a JSON string. I've tried a few variations but always end up with a serialization exception. Here's my current version of the pre_dump helper:

    @pre_dump
    def serialize_extra_data(self, data, many):
        """This will alter the data passed to ``dump()`` before Marshmallow
        attempts serialization.
        """
        print(type(data), type(data.extra_data), data.extra_data)
        extra_data = data.extra_data
        data.extra_data = json.loads(extra_data)
        return data

But this attempts to push a dictionary rather than a string into the DB:

cartridgeocr-annotations-1           |   File "src/pymssql/_mssql.pyx", line 1976, in pymssql._mssql._quote_or_flatten
cartridgeocr-annotations-1           | ValueError: expected a simple type, a tuple or a list

On the other hand, if I change the loads to a dumps, some other serialization point complains that I don't have a dictionary in the field:

cartridgeocr-annotations-1           | ValueError: dictionary update sequence element #0 has length 1; 2 is required

The latter exception occurs when the Flask app attempts to dump the object in a query response:

 return schemas.ImageDisplaySchema().dump(image_in_db), 201

I'm looking for a working example of how to seamlessly convert the DB string to a dictionary and back via Marshmallow.

When using SQLAlchemy, I usually return a model object as the output of the Marshmallow schema's load by using post_load. So the code could look like this:

import json
from datetime import datetime
from dataclasses import dataclass

from marshmallow import Schema, fields, pre_dump, post_load


@dataclass
class DbModel:
    id: str
    collection_id: str
    created_at: datetime
    mimetype: str
    size: int
    file_hash: str
    storage_key: str
    extra_data: str


class ImageDisplaySchema(Schema):
    id = fields.Str()
    created_at = fields.Str()
    collection_id = fields.Str()
    mimetype = fields.Str()
    size = fields.Int()
    extra_data = fields.Dict()

    @pre_dump
    def serialize_extra_data(self, data, many):
        """This will alter the data passed to ``dump()`` before Marshmallow
        attempts serialization.
        """
        print(type(data), type(data.extra_data), data.extra_data)
        extra_data = data.extra_data
        data.extra_data = json.loads(extra_data)
        return data
    
    @post_load
    def deserialize_to_model(self, data, **kwargs):
        # dumps : dict => str
        extra_data = json.dumps(data.pop("extra_data"))
        return DbModel(
            **data,
            extra_data=extra_data,
            file_hash=None,
            storage_key=None
        )


db = DbModel("id", "collection_id", datetime.now(), "mimetype", 1, "hash", "storage", '{"extra": "data", "key": 1}')

# dump : object => dict
dumped = ImageDisplaySchema().dump(db)
assert isinstance(dumped["extra_data"], dict)

# load : dict => object
to_db = ImageDisplaySchema().load(dumped)
assert isinstance(to_db.extra_data, str)

Side note:

On the other hand if I change the loads to a dumps, some other serialization point complains that I don't have a dictionary in the field

As per the documentation of the json stdlib module:

dumps
Serialize obj to a JSON formatted str using this conversion table. The arguments have the same meaning as in dump().

loads
Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using this conversion table.

So calling json.dumps on a str does not give you a dict: it returns another str (the original text wrapped as a JSON string literal), and fields.Dict then fails when it tries to build a dictionary from that string.
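
The direction of the two json calls can be checked with the standard library alone. The sketch below uses the sample value from the example above; the final dict(...) call is an assumption about what the Dict field effectively does with a non-dict value, chosen because it raises the same kind of error as the one reported in the question:

```python
import json

raw = '{"extra": "data", "key": 1}'   # value as stored in the varchar column

as_dict = json.loads(raw)             # str -> dict
round_trip = json.dumps(as_dict)      # dict -> str

# dumps on a str succeeds, but yields a quoted JSON string, not a dict
double_encoded = json.dumps(raw)

# Building a dict from that string iterates over single characters and fails
try:
    dict(double_encoded)
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

So loads is the call that turns the column value into a dictionary, and dumps is the one to use on the way back to the database.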
