Most elegant approach for writing JSON data to a relational database using Django models?

Question

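I have a typical relational database model in Django, where a typical model contains some ForeignKeys, some ManyToManyFields, and some fields that extend Django's DateTimeField.

I want to save data that I receive from an external API in JSON format (not flat). I want the data saved to its respective tables (not the whole JSON string in one field). What is the cleanest and simplest approach for doing this? Is there a library available that makes this task simpler?

Here is an example to clarify my question:

Models -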

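from django.db import models

# MyDateTimeField is the asker's custom DateTimeField subclass (definition not shown).

class NinjaData(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    birthdatetime = MyDateTimeField(null=True)
    deathdatetime = MyDateTimeField(null=True)
    # Related models are referenced by name because they are defined further down.
    skills = models.ManyToManyField('Skills', null=True)
    weapons = models.ManyToManyField('Weapons', null=True)
    master = models.ForeignKey('Master', null=True, on_delete=models.CASCADE)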

class Skills(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    difficulty = models.IntegerField(null=True)

class Weapons(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    weight = models.FloatField(null=True)

class Master(models.Model):
    id = models.IntegerField(primary_key=True, unique=True)
    name = models.CharField(max_length=60)
    is_awesome = models.NullBooleanField()

Now, I typically have to save JSON string data obtained from an external API (the secret ninja API) into these models. The JSON looks like this:

JSON -

{
"id":"1234",
"name":"Hitori",
"birthdatetime":"11/05/1999 20:30:00",
"skills":[
    {
    "id":"3456",
    "name":"stealth",
    "difficulty":"2"
    },
    {
    "id":"678",
    "name":"karate",
    "difficulty":"1"
    }
],
"weapons":[
    {
    "id":"878",
    "name":"shuriken",
    "weight":"0.2"
    },
    {
    "id":"574",
    "name":"katana",
    "weight":"0.5"
    }
],
"master":{
    "id":"4",
    "name":"Schi fu",
    "is_awesome":"true"
    }
}

The logic for handling a typical ManyToManyField is fairly simple:

Logic code -

import json

data = json.loads(ninja_json)
ninja = NinjaData.objects.create(id=data['id'], name=data['name'])

if 'weapons' in data:
    weapons = data['weapons']
    for weapon in weapons:
        # create a new weapon in the Weapons table if needed;
        # get_or_create returns an (object, created) tuple
        w, created = Weapons.objects.get_or_create(**weapon)
        ninja.weapons.add(w)

if 'skills' in data:
    ...
    # (rest of the code skipped for brevity)
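
One caveat with the loop above, offered as a hedged sketch rather than as part of the original question: get_or_create(**weapon) uses every key in the payload as a lookup, so a row whose id already exists but whose weight differs would not match, and the follow-up insert would likely fail on the primary key. Django's get_or_create and update_or_create accept a defaults dict for the non-lookup fields. The helper below (split_lookup_and_defaults is a hypothetical name, and "id" as the key name is an assumption taken from the sample JSON) only builds those keyword arguments, so it runs without Django.

```python
def split_lookup_and_defaults(payload, pk_name="id"):
    """Split one JSON object into a lookup key and a defaults dict.

    The result is shaped for Model.objects.get_or_create(**kwargs) or
    Model.objects.update_or_create(**kwargs).
    """
    if pk_name not in payload:
        raise KeyError("payload has no %r key" % pk_name)
    # Build a new dict so the caller's payload is left untouched.
    defaults = {k: v for k, v in payload.items() if k != pk_name}
    return {pk_name: payload[pk_name], "defaults": defaults}


weapon = {"id": "878", "name": "shuriken", "weight": "0.2"}
kwargs = split_lookup_and_defaults(weapon)
# With Django available, this could be used as:
#     w, created = Weapons.objects.update_or_create(**kwargs)
```

Looking up by primary key only and passing the remaining values as defaults means repeated imports of the same object update the existing row instead of colliding with it.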

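There are many approaches I could use:

Put the logic above in the view function, which does all the work of converting the JSON to model instances
Put the logic above in an overridden __init__ method of the model
Put the logic above in an overridden save() method of the model
Create a Manager for each model and write this logic inside each of its methods, such as create, get_or_create, filter, etc.
Extend ManyToManyField and put it there
An external library?

I would like to know whether there is a single most obvious way to save data in this JSON form to the database without writing the logic above multiple times. What is the most elegant approach you would suggest?

Thanks, everyone, for reading this long post.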

Answer 1

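In my opinion, the cleanest place for the code you need is a new method (e.g. from_json_string) on a custom Manager for the NinjaData model.

I don't think you should override the standard create, get_or_create, etc. methods, since what you are doing is a bit different from what they normally do, and it is good to keep them working normally.

Update: I realized I would probably want this myself at some point, so I have coded up and lightly tested a generic function. Since it recursively walks through and affects other models, I am no longer sure it belongs as a Manager method; it should probably be a standalone helper function.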

from django.core.exceptions import FieldDoesNotExist
from django.db import models


def create_or_update_and_get(model_class, data):
    data = dict(data)  # shallow copy, so pop() does not modify the caller's dict
    pk_name = model_class._meta.pk.name
    get_or_create_kwargs = {pk_name: data.pop(pk_name)}
    try:
        # get
        instance = model_class.objects.get(**get_or_create_kwargs)
    except model_class.DoesNotExist:
        # create
        instance = model_class(**get_or_create_kwargs)
    # update (or finish creating)
    for key, value in data.items():
        try:
            field = model_class._meta.get_field(key)
        except FieldDoesNotExist:
            # get_field() raises for unknown names; skip keys that are not model fields
            continue
        if isinstance(field, models.ManyToManyField):
            # can't add m2m until parent is saved
            continue
        elif isinstance(field, models.ForeignKey) and hasattr(value, 'items'):
            # field.related_model is the target model (older Django: field.rel.to)
            rel_instance = create_or_update_and_get(field.related_model, value)
            setattr(instance, key, rel_instance)
        else:
            setattr(instance, key, value)
    instance.save()
    # now add the m2m relations
    for field in model_class._meta.many_to_many:
        if field.name in data and hasattr(data[field.name], 'append'):
            for obj in data[field.name]:
                rel_instance = create_or_update_and_get(field.related_model, obj)
                getattr(instance, field.name).add(rel_instance)
    return instance

# for example:
import json

data = json.loads(ninja_json)
ninja = create_or_update_and_get(NinjaData, data)
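
The helper above depends on an ordering: nested objects (foreign keys) are resolved before the parent is saved, and lists (many-to-many relations) only afterwards. As a rough, Django-free illustration of that classification, the sketch below groups keys by the shape of their values. The name partition_payload is hypothetical, and unlike the helper above it looks only at value shapes and does not consult model metadata.

```python
def partition_payload(data):
    """Group keys by value shape: scalar, nested object, or list of objects."""
    scalars, nested, collections = {}, {}, {}
    for key, value in data.items():
        if isinstance(value, dict):
            nested[key] = value        # candidate ForeignKey payload
        elif isinstance(value, list):
            collections[key] = value   # candidate ManyToManyField payload
        else:
            scalars[key] = value       # plain column value
    return scalars, nested, collections


sample = {
    "id": "1234",
    "name": "Hitori",
    "weapons": [{"id": "878", "name": "shuriken", "weight": "0.2"}],
    "master": {"id": "4", "name": "Schi fu", "is_awesome": "true"},
}
scalars, nested, collections = partition_payload(sample)
# Processing order used by the helper above:
#   1. scalars and nested objects are set on the instance,
#   2. the instance is saved,
#   3. collections are attached once the parent row exists.
```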

Answer 2

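I don't know whether you are familiar with the terminology, but what you are basically trying to do is deserialize from a serialized/string format (in this case, JSON) into Python model objects.

I am not familiar with Python libraries for doing this with JSON, so I can't recommend or endorse any. However, a search using terms like "python", "deserialization", "json", "object", and "graph" seems to turn up some Django documentation on serialization and the library jsonpickle on GitHub.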
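
To make the term concrete, here is a minimal sketch of deserialization using only the standard library. It turns a JSON string into a small graph of plain Python objects. The Skill and Ninja dataclasses and the two from_* functions are hypothetical stand-ins for the Django models, and the int() conversions assume the API sends numbers as strings, as in the sample JSON.

```python
import json
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Skill:
    id: int
    name: str
    difficulty: Optional[int] = None


@dataclass
class Ninja:
    id: int
    name: str
    skills: List[Skill] = field(default_factory=list)


def skill_from_dict(raw):
    """Build a Skill from one decoded JSON object."""
    difficulty = raw.get("difficulty")
    return Skill(
        id=int(raw["id"]),
        name=raw["name"],
        difficulty=int(difficulty) if difficulty is not None else None,
    )


def ninja_from_json(text):
    """Deserialize a JSON string into a Ninja with nested Skill objects."""
    raw = json.loads(text)
    return Ninja(
        id=int(raw["id"]),
        name=raw["name"],
        skills=[skill_from_dict(s) for s in raw.get("skills", [])],
    )


ninja = ninja_from_json(
    '{"id": "1234", "name": "Hitori",'
    ' "skills": [{"id": "678", "name": "karate", "difficulty": "1"}]}'
)
```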

Answer 3

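I actually had this same need, and I wrote a custom database field to handle it. Just save the following in a Python module in your project (say, a fields.py file in the appropriate app), then import and use it: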

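import json

from django.db import models

# Written for Python 2 (it uses `unicode`) and for Django versions that still
# provide models.SubfieldBase (removed in Django 1.10).
# JSONFormField is a custom form field that is not shown in this answer.

class JSONField(models.TextField):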
    """Specialized text field that holds JSON in the database, which is
    represented within Python as (usually) a dictionary."""

    __metaclass__ = models.SubfieldBase

    def __init__(self, blank=True, default='{}', help_text='Specialized text field that holds JSON in the database, which is represented within Python as (usually) a dictionary.', *args, **kwargs):
        super(JSONField, self).__init__(*args, blank=blank, default=default, help_text=help_text, **kwargs)

    def get_prep_value(self, value):
        if type(value) in (str, unicode) and len(value) == 0:
            value = None
        return json.dumps(value)

    def formfield(self, form_class=JSONFormField, **kwargs):
        return super(JSONField, self).formfield(form_class=form_class, **kwargs)

    def bound_data(self, data, initial):
        return json.dumps(data)

    def to_python(self, value):
        # lists, dicts, ints, and booleans are clearly fine as is
        if type(value) not in (str, unicode):
            return value

        # empty strings were intended to be null
        if len(value) == 0:
            return None

        # NaN should become null; Python doesn't have a NaN value
        if value == 'NaN':
            return None

        # try to tell the difference between a "normal" string
        # and serialized JSON
        if value not in ('true', 'false', 'null') and (value[0] not in ('{', '[', '"') or value[-1] not in ('}', ']', '"')):
            return value

        # okay, this is a JSON-serialized string
        return json.loads(value)

A couple of things. First, if you are using South, you will need to explain to it how your custom field works:

from south.modelsinspector import add_introspection_rules
add_introspection_rules([], [r'^feedmagnet\.tools\.fields\.models\.JSONField'])

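Second, I have done a lot of work to make sure this custom field plays nicely everywhere, such as converting cleanly back and forth between the serialized format and Python. There is one place where it does not quite work right: when used in conjunction with manage.py dumpdata, it coalesces the Python value to a string rather than dumping it as JSON, which is not what you want. I have found this to be a minor problem in actual practice.

See the Django documentation on writing custom model fields for more.

I assert that this is the best and most obvious way to accomplish this. Note that I am also assuming you don't need to do lookups on this data, i.e. you will retrieve records based on other criteria and this data will come along with them. If you need to do lookups based on something in your JSON, make sure it is a true SQL field (and make sure it is indexed!).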
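
The last point can be illustrated outside Django with the standard-library sqlite3 module. This is only a sketch under assumptions: the ninja table, its columns, and the index name are made up for the example. The value used for lookups (name) is promoted to a real, indexed column, while the rest of the document travels along as an opaque JSON string.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE ninja ("
    " id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"      # real SQL column, used for lookups
    " payload TEXT NOT NULL)"   # whole document stored as JSON text
)
conn.execute("CREATE INDEX ninja_name_idx ON ninja (name)")

raw = {"id": "1234", "name": "Hitori", "master": {"id": "4", "name": "Schi fu"}}
conn.execute(
    "INSERT INTO ninja (id, name, payload) VALUES (?, ?, ?)",
    (int(raw["id"]), raw["name"], json.dumps(raw)),
)

# Look up by the indexed column; the JSON comes along with the row.
row = conn.execute(
    "SELECT payload FROM ninja WHERE name = ?", ("Hitori",)
).fetchone()
restored = json.loads(row[0])
```

Anything stored only inside payload (the master's name here) cannot be filtered on efficiently, which is the trade-off this answer describes.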


3 answers

Answer 1
10, accepted, 2011-12-04 18:01:44

Answer 2
2, 2011-12-04 01:37:32

Answer 3
1, 2011-12-03 13:47:22
