
Importing JSON to MongoDB with Python

I am currently trying to import a lot of JSON files into MongoDB. Some of the files are simple, containing just flat key/value objects, and I can query those uploads just fine from Python. Example:

[
    {
        "platform_id": 28,
        "mhz": 2400,
        "version": "1.1.1l" 
    }
]
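For the bulk import itself, a minimal sketch might look like the following. It assumes `collection` is a PyMongo `Collection` and that every file holds either one JSON object or an array of objects; the helper names `load_json_documents` and `import_directory` are made up for illustration and are not part of the original post.

```python
import json
from pathlib import Path


def load_json_documents(path):
    """Read one JSON file and return its contents as a list of documents."""
    with open(path, encoding="utf-8") as fh:
        data = json.load(fh)
    # A file may hold a single object or an array of objects.
    return data if isinstance(data, list) else [data]


def import_directory(collection, directory):
    """Insert every *.json file found in `directory`.

    Returns the number of documents handed to MongoDB.
    """
    inserted = 0
    for path in sorted(Path(directory).glob("*.json")):
        docs = load_json_documents(path)
        if docs:  # insert_many() rejects an empty list
            collection.insert_many(docs)
            inserted += len(docs)
    return inserted


# Hypothetical usage, with placeholder database and collection names:
# import_directory(myclient["db_name"]["simple_collection"], "json_exports/")
```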

MongoDB Compass shows a document like that as follows (screenshot omitted).

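The problem is with one of the tools, which creates a document in Mongo that I cannot figure out how to query. The tool produces a JSON file with system information and pushes it to the database. Example: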

{
    "systeminfo": [
        {
            "component": "system board",
            "description": "sys board123"
        },
        {
            "component": "bios",
            "version": "xyz",
            "date": "06/28/2021"
        },
        {
            "component": "processors",
            "htt": true,
            "turbo": false
        },

...and so on, for a total of 23 objects.

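If I push it straight into MongoDB, Compass shows the whole thing as a single document with one systeminfo array (screenshot omitted).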

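So the question is: is there a way to collapse the hardware JSON by one level, or a way to query the database as it is? I found a way to collapse the JSON, but it moves each key/value pair into a new dictionary for upload, and each parameter is handled individually. That is not sustainable, because the tool keeps adding new fields and my application would have to be changed every time.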
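One way to collapse the array without naming each field is to key the entries by their `component` value and copy whatever other fields are present. The sketch below is only an illustration: the function name `collapse_systeminfo` is invented here, and it assumes each component name occurs at most once per document (it raises an error otherwise).

```python
def collapse_systeminfo(doc, array_field="systeminfo", key_field="component"):
    """Return a copy of `doc` whose array is replaced by a dict keyed by component."""
    collapsed = dict(doc)  # shallow copy; the input document is left untouched
    by_name = {}
    for entry in doc.get(array_field, []):
        name = entry[key_field]
        if name in by_name:
            raise ValueError(f"duplicate {key_field!r} value: {name!r}")
        # Copy every remaining key/value pair, whatever fields the tool emits.
        by_name[name] = {k: v for k, v in entry.items() if k != key_field}
    collapsed[array_field] = by_name
    return collapsed


sample = {
    "systeminfo": [
        {"component": "system board", "description": "sys board123"},
        {"component": "bios", "version": "xyz", "date": "06/28/2021"},
        {"component": "processors", "htt": True, "turbo": False},
    ]
}

flat = collapse_systeminfo(sample)
print(flat["systeminfo"]["processors"])  # {'htt': True, 'turbo': False}
```

A document stored in this shape could then be queried with a dotted path such as `{"systeminfo.processors.htt": True}`, and fields the tool adds later are carried along without code changes.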

Here is an example of the hardware query; the same pattern works for the other collections:

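# myclient is an existing pymongo.MongoClient instance
db = myclient['db_name']
# HW_collection is taken to be a variable holding the collection name
col = db[HW_collection]
myquery = {"component": "processors"}
mydoc = col.find(myquery)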
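The query above looks for a top-level `component` field, but in the hardware documents that field sits inside the elements of the `systeminfo` array, so dot notation is needed to reach it. The following is a minimal sketch, assuming `collection` is a PyMongo `Collection`; the two function names are hypothetical. The second variant adds an `$elemMatch` projection, which returns only the first matching array element instead of the whole array.

```python
def find_by_component(collection, component):
    """Match documents whose systeminfo array has an entry for `component`."""
    query = {"systeminfo.component": component}
    return collection.find(query)


def find_component_entry(collection, component):
    """Same match, but project only the first matching array element."""
    query = {"systeminfo.component": component}
    projection = {"systeminfo": {"$elemMatch": {"component": component}}}
    return collection.find(query, projection)


# Hypothetical usage with the names from the question:
# for doc in find_component_entry(col, "processors"):
#     print(doc["systeminfo"][0])
```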

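The follow-on question that almost always arises from {"systeminfo.component":"processors"} is that the whole document is returned for any array containing at least one processors entry. Matching does not mean filtering. Here is a slightly more comprehensive solution that includes "collapsing" the info into the top-level document. Assume the input is something like this: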

{
    "doc": 1, "systeminfo": [
        {"component": "system board", "description": "sys board123"},
        {"component": "bios", "version": "xyz", "date": "06/28/2021"},
        {"component": "processors", "htt": true, "turbo": false}
    ]
},{
    "doc": 2, "systeminfo": [
        {"component": "RAM", "description": "64G DIMM"},
        {"component": "processors", "htt": false, "turbo": false},
        {"component": "bios", "version": "abc", "date": "06/28/2018"}
    ]
},{
    "doc": 3, "systeminfo": [
        {"component": "RAM", "description": "32G DIMM"},
        {"component": "SCSI", "version": "X", "date": "01/01/2000"}
    ]
}

Then

db.foo.aggregate([
    {$project: {
        doc: true,  // carry doc num along for ride
        // Walk the $systeminfo array and filter for component = processors and
        // assign to field P (temporary field, any name is fine):

        P: {$filter: {input: "$systeminfo", as: "z",
                      cond: {$eq:["$$z.component","processors"]} }}
    }}

    // Remove docs that had no processors:
    ,{$match: {P: {$ne:[]}}}

    // A little complex but read it "backwards" to better understand.  The P
    // array will be left with 1 entry for processors.  "Lift" that doc out of
    // the array with $arrayElemAt[0] and merge it with the info in the containing
    // top level doc which is $$CURRENT, and then make that merged entity the
    // new root (essentially the new $$CURRENT)
    ,{$replaceRoot: {newRoot: {$mergeObjects: [ {$arrayElemAt:["$P",0]}, "$$CURRENT" ]}} }

    // Get rid of the tmp field:
    ,{$unset: "P"}
]);

yields

{
    "component" : "processors",
    "htt" : true,
    "turbo" : false,
    "_id" : ObjectId("61eab547ba7d8bb5090611ee"),
    "doc" : 1
}
{
    "component" : "processors",
    "htt" : false,
    "turbo" : false,
    "_id" : ObjectId("61eab547ba7d8bb5090611ef"),
    "doc" : 2
}
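Since the question uses Python, the same pipeline can be written as a list of dictionaries and passed to PyMongo's `aggregate()`. This is a direct transcription of the shell pipeline above, not a different method; the names `PROCESSORS_PIPELINE` and `processors_collapsed` are made up for this sketch, and `collection` is assumed to be a PyMongo `Collection`. Note that the `$unset` stage requires MongoDB 4.2 or later.

```python
PROCESSORS_PIPELINE = [
    # Keep doc and build temporary field P holding only the processors entries.
    {"$project": {
        "doc": True,
        "P": {"$filter": {
            "input": "$systeminfo",
            "as": "z",
            "cond": {"$eq": ["$$z.component", "processors"]},
        }},
    }},
    # Drop documents that had no processors entry.
    {"$match": {"P": {"$ne": []}}},
    # Lift the first P element out and merge it with the top-level document.
    {"$replaceRoot": {"newRoot": {"$mergeObjects": [
        {"$arrayElemAt": ["$P", 0]},
        "$$CURRENT",
    ]}}},
    # Remove the temporary field.
    {"$unset": "P"},
]


def processors_collapsed(collection):
    """Run the pipeline and return the collapsed documents as a list of dicts."""
    return list(collection.aggregate(PROCESSORS_PIPELINE))


# Hypothetical usage with the names from the question:
# for doc in processors_collapsed(col):
#     print(doc["doc"], doc["htt"], doc["turbo"])
```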
