
Importing JSON to MongoDB with Python

I am currently trying to import a lot of JSON files into MongoDB. Some of the files are simple, containing only objects with key:value pairs, and I can query those uploads just fine from Python. Example:

[
    {
        "platform_id": 28,
        "mhz": 2400,
        "version": "1.1.1l" 
    }
]

MongoDB Compass displays it as follows: [screenshot not included]

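The problem is with one of the tools, which creates a document in Mongo that I don't know how to query. The tool generates a JSON file with system information and pushes it to the database. Example: ...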

{
    "systeminfo": [
        {
            "component": "system board",
            "description": "sys board123"
        },
        {
            "component": "bios",
            "version": "xyz",
            "date": "06/28/2021"
        },
        {
            "component": "processors",
            "htt": true,
            "turbo": false
        },

... and so on, for a total of 23 objects.

If I push it straight into MongoDB, it looks like this in Compass: [screenshot not included]

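So the question is: is there a way to collapse the hardware JSON by one level, or a way to query the database as it is? I found a way to collapse the JSON, but it moves each value pair into a new dictionary for upload, and every parameter is handled individually. That is not sustainable, because the tool keeps adding new fields and my application would have to be changed each time.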

Here is the query I use for the hardware collection; the same pattern works for the other collections:

# myclient is an already-connected pymongo.MongoClient
db = myclient["db_name"]
col = db["HW_collection"]
myquery = {"component": "processors"}
mydoc = col.find(myquery)
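The query above looks for a top-level `component` field, but in the hardware documents that field sits inside the `systeminfo` array, so dot notation is needed to reach it. A minimal sketch, assuming `col` is a pymongo `Collection`; the helper name `find_by_component` is hypothetical:

```python
def find_by_component(col, component):
    """Find documents whose systeminfo array has an entry for `component`.

    `col` is assumed to be a pymongo Collection, for example
    myclient["db_name"]["HW_collection"] from the question.
    """
    # Dot notation reaches into each element of the systeminfo array.
    query = {"systeminfo.component": component}
    # Each result is the whole document, with all of its systeminfo entries.
    return col.find(query)


# Hypothetical usage against the collection from the question:
# for doc in find_by_component(col, "processors"):
#     print(doc["_id"], len(doc["systeminfo"]))
```

This only selects documents; it does not trim the `systeminfo` array down to the matching entry.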

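The follow-up question that almost always arises from querying with {"systeminfo.component":"processors"} is that the whole document is returned whenever its systeminfo array contains at least one processors entry. Matching does not mean filtering. Below is a slightly more comprehensive solution that also "collapses" the matching information into the top-level document. Assume the input is something like this: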

{
    "doc": 1, "systeminfo": [
        {"component": "system board", "description": "sys board123"},
        {"component": "bios", "version": "xyz", "date": "06/28/2021"},
        {"component": "processors", "htt": true, "turbo": false}
    ]
},{
    "doc": 2, "systeminfo": [
        {"component": "RAM", "description": "64G DIMM"},
        {"component": "processors", "htt": false, "turbo": false},
        {"component": "bios", "version": "abc", "date": "06/28/2018"}
    ]
},{
    "doc": 3, "systeminfo": [
        {"component": "RAM", "description": "32G DIMM"},
        {"component": "SCSI", "version": "X", "date": "01/01/2000"}
    ]
}

Then:

db.foo.aggregate([
    {$project: {
        doc: true,  // carry doc num along for ride
        // Walk the $systeminfo array and filter for component = processors and
        // assign to field P (temporary field, any name is fine):

        P: {$filter: {input: "$systeminfo", as: "z",
                      cond: {$eq:["$$z.component","processors"]} }}
    }}

    // Remove docs that had no processors:
    ,{$match: {P: {$ne:[]}}}

    // A little complex but read it "backwards" to better understand.  The P
    // array will be left with 1 entry for processors.  "Lift" that doc out of
    // the array with $arrayElemAt[0] and merge it with the info in the containing
    // top level doc which is $$CURRENT, and then make that merged entity the
    // new root (essentially the new $$CURRENT)
    ,{$replaceRoot: {newRoot: {$mergeObjects: [ {$arrayElemAt:["$P",0]}, "$$CURRENT" ]}} }

    // Get rid of the tmp field:
    ,{$unset: "P"}
]);
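Since the question is about Python, the same pipeline can be written as plain dicts and passed to pymongo's `Collection.aggregate`. This is a sketch rather than a tested drop-in: the function name `component_pipeline` is hypothetical, and the `$unset` stage assumes MongoDB 4.2 or newer.

```python
def component_pipeline(component):
    """Build the stages that lift one systeminfo entry to the top level."""
    return [
        # Keep doc and put the matching systeminfo entries in temporary field P.
        {"$project": {
            "doc": True,
            "P": {"$filter": {
                "input": "$systeminfo",
                "as": "z",
                "cond": {"$eq": ["$$z.component", component]},
            }},
        }},
        # Drop documents that had no matching entry.
        {"$match": {"P": {"$ne": []}}},
        # Merge the first matching entry with the enclosing document.
        {"$replaceRoot": {"newRoot": {"$mergeObjects": [
            {"$arrayElemAt": ["$P", 0]}, "$$CURRENT",
        ]}}},
        # Remove the temporary field.
        {"$unset": "P"},
    ]


# Hypothetical usage, with db being a pymongo Database:
# for doc in db.foo.aggregate(component_pipeline("processors")):
#     print(doc)
```

Taking the component name as a parameter means the same stages can be reused for bios, RAM, or any component the tool adds later.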

The pipeline yields:

{
    "component" : "processors",
    "htt" : true,
    "turbo" : false,
    "_id" : ObjectId("61eab547ba7d8bb5090611ee"),
    "doc" : 1
}
{
    "component" : "processors",
    "htt" : false,
    "turbo" : false,
    "_id" : ObjectId("61eab547ba7d8bb5090611ef"),
    "doc" : 2
}
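For the other half of the question, collapsing the JSON before upload without handling each field individually, one option is to re-key the systeminfo list by component name on the client side. This is a sketch under assumptions, not part of the answer above: the helper `collapse_systeminfo` is hypothetical, and it assumes every entry has a component field, that component names are unique within a document, and that they contain no "." and do not start with "$".

```python
def collapse_systeminfo(doc, key="component"):
    """Return a copy of doc whose systeminfo list is re-keyed by component name."""
    collapsed = {}
    for entry in doc.get("systeminfo", []):
        name = entry[key]
        if name in collapsed:
            # Assumption: one entry per component; fail loudly otherwise.
            raise ValueError(f"duplicate component: {name!r}")
        # Copy every other field as-is, so fields the tool adds later
        # need no code change here.
        collapsed[name] = {k: v for k, v in entry.items() if k != key}
    result = dict(doc)
    result["systeminfo"] = collapsed
    return result


sample = {
    "doc": 1,
    "systeminfo": [
        {"component": "system board", "description": "sys board123"},
        {"component": "bios", "version": "xyz", "date": "06/28/2021"},
        {"component": "processors", "htt": True, "turbo": False},
    ],
}
flat = collapse_systeminfo(sample)
print(flat["systeminfo"]["processors"])

# Hypothetical follow-up with pymongo, after inserting `flat`:
# col.find({"systeminfo.processors.htt": True})
```

With this shape, each component becomes an embedded document that can be addressed directly by name.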

