Importing JSON to MongoDB with Python
I am currently trying to import a lot of JSON files into MongoDB. Some of the files are simple, just a flat object of key: value pairs, and I can query those uploads just fine from Python. Example:
[
{
"platform_id": 28,
"mhz": 2400,
"version": "1.1.1l"
}
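]
In MongoDB Compass it is displayed as follows (screenshot not included).
The problem is with one of the tools, which creates a document in Mongo that I don't know how to query. The tool creates a JSON file with system information and pushes it to the database. Example: ...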
{
"systeminfo": [
{
"component": "system board",
"description": "sys board123"
},
{
"component": "bios",
"version": "xyz",
"date": "06/28/2021"
},
{
"component": "processors",
"htt": true,
"turbo": false
},
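... and so on, 23 objects in total.
If I push it straight into MongoDB, it looks like this in Compass (screenshot not included).
So the question is: is there a way to collapse the hardware JSON by one level, or a way to query the database as it is? I found a way to collapse the JSON, but it moves each value pair into a new dictionary for upload, and every parameter is handled individually. That is not sustainable, because the tool keeps adding new fields and my application would have to handle each change.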
Here is a sample of the hardware query, using the same pattern that works for the other collections:
db = myclient['db_name']
col = db['HW_collection']
myquery = {"component": "processors"}  # finds nothing: "component" is nested inside "systeminfo"
mydoc = col.find(myquery)
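On the "collapse one level" idea from the question, here is a minimal sketch in plain Python that does not list fields by name, so new fields added by the tool pass through unchanged. The function name `collapse_systeminfo` is hypothetical, and the sketch assumes that every entry has a "component" key and that each component appears at most once per document (a repeated component would overwrite the earlier one).

```python
import json


def collapse_systeminfo(doc, array_key="systeminfo", name_key="component"):
    """Return a copy of doc where the array becomes a dict keyed by component."""
    collapsed = {k: v for k, v in doc.items() if k != array_key}
    by_component = {}
    for entry in doc.get(array_key, []):
        # Copy every remaining field as-is; no field names are hard-coded.
        by_component[entry[name_key]] = {
            k: v for k, v in entry.items() if k != name_key
        }
    collapsed[array_key] = by_component
    return collapsed


raw = json.loads("""
{
  "systeminfo": [
    {"component": "system board", "description": "sys board123"},
    {"component": "bios", "version": "xyz", "date": "06/28/2021"},
    {"component": "processors", "htt": true, "turbo": false}
  ]
}
""")

flat = collapse_systeminfo(raw)
print(flat["systeminfo"]["processors"])
```

A document stored in this shape could then be queried with a dotted path such as {"systeminfo.processors.htt": True}. Whether that trade-off is worth it depends on how the data is used; the array form can also be queried directly.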
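The dotted path in {"systeminfo.component":"processors"} reaches into the systeminfo array, so that query does match. The follow-on question that almost always arises from it is that the whole document is returned for any array that contains at least one processors entry. Matching does not mean filtering. Below is a slightly more comprehensive solution that includes "collapsing" the info into the top-level document. Assume the input looks like this: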
{
"doc":1, "systeminfo": [
{"component": "system board","description": "sys board123"},
{"component": "bios","version": "xyz","date": "06/28/2021"},
{"component": "processors","htt": true,"turbo": false}
]
},{
"doc":2, "systeminfo": [
{"component": "RAM","description": "64G DIMM"},
{"component": "processors","htt": false,"turbo": false},
{"component": "bios","version": "abc","date": "06/28/2018"}
]
},{
"doc":3, "systeminfo": [
{"component": "RAM","description": "32G DIMM"},
{"component": "SCSI","version": "X","date": "01/01/2000"}
]
}
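To make the "matching does not mean filtering" point concrete, here is a small plain-Python imitation of the two behaviours on the sample input. It does not talk to MongoDB; the list comprehensions only mimic what a match and a filter would each return, and the helper name `is_processor` is made up for the illustration.

```python
docs = [
    {"doc": 1, "systeminfo": [
        {"component": "system board", "description": "sys board123"},
        {"component": "bios", "version": "xyz", "date": "06/28/2021"},
        {"component": "processors", "htt": True, "turbo": False},
    ]},
    {"doc": 2, "systeminfo": [
        {"component": "RAM", "description": "64G DIMM"},
        {"component": "processors", "htt": False, "turbo": False},
        {"component": "bios", "version": "abc", "date": "06/28/2018"},
    ]},
    {"doc": 3, "systeminfo": [
        {"component": "RAM", "description": "32G DIMM"},
        {"component": "SCSI", "version": "X", "date": "01/01/2000"},
    ]},
]


def is_processor(entry):
    return entry.get("component") == "processors"


# Matching: keep whole documents that contain at least one processors entry.
matched = [d for d in docs if any(is_processor(e) for e in d["systeminfo"])]

# Filtering: additionally keep only the processors entries inside each array.
filtered = [
    {"doc": d["doc"], "P": [e for e in d["systeminfo"] if is_processor(e)]}
    for d in matched
]

print([len(d["systeminfo"]) for d in matched])  # [3, 3]: arrays come back whole
print([len(d["P"]) for d in filtered])          # [1, 1]: only processors remain
```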
Then run this aggregation:
db.foo.aggregate([
{$project: {
doc: true, // carry doc num along for ride
// Walk the $systeminfo array and filter for component = processors and
// assign to field P (temporary field, any name is fine):
P: {$filter: {input: "$systeminfo", as: "z",
cond: {$eq:["$$z.component","processors"]} }}
}}
// Remove docs that had no processors:
,{$match: {P: {$ne:[]}}}
// A little complex but read it "backwards" to better understand. The P
// array will be left with 1 entry for processors. "Lift" that doc out of
// the array with $arrayElemAt[0] and merge it with the info in the containing
// top level doc which is $$CURRENT, and then make that merged entity the
// new root (essentially the new $$CURRENT)
,{$replaceRoot: {newRoot: {$mergeObjects: [ {$arrayElemAt:["$P",0]}, "$$CURRENT" ]}} }
// Get rid of the tmp field:
,{$unset: "P"}
]);
which yields:
{
"component" : "processors",
"htt" : true,
"turbo" : false,
"_id" : ObjectId("61eab547ba7d8bb5090611ee"),
"doc" : 1
}
{
"component" : "processors",
"htt" : false,
"turbo" : false,
"_id" : ObjectId("61eab547ba7d8bb5090611ef"),
"doc" : 2
}
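Since the question uses Python, the same pipeline can be written as a list of dicts and passed to PyMongo's `Collection.aggregate`. This is a sketch under assumptions: the function name `component_pipeline` is hypothetical, the connection string and database/collection names in the commented usage are placeholders, and the `$unset` stage needs MongoDB 4.2 or later.

```python
def component_pipeline(component):
    """Build the aggregation pipeline shown above for a given component name."""
    return [
        # Keep "doc" and put the matching array entries into temporary field P.
        {"$project": {
            "doc": True,
            "P": {"$filter": {
                "input": "$systeminfo",
                "as": "z",
                "cond": {"$eq": ["$$z.component", component]},
            }},
        }},
        # Drop documents that had no matching entry.
        {"$match": {"P": {"$ne": []}}},
        # Lift the first entry of P out and merge it with the top-level document.
        {"$replaceRoot": {"newRoot": {
            "$mergeObjects": [{"$arrayElemAt": ["$P", 0]}, "$$CURRENT"],
        }}},
        # Remove the temporary field.
        {"$unset": "P"},
    ]


pipeline = component_pipeline("processors")

# With a live connection (all names below are placeholders):
# from pymongo import MongoClient
# col = MongoClient("mongodb://localhost:27017")["db_name"]["HW_collection"]
# for row in col.aggregate(pipeline):
#     print(row)
```

Passing the component name as a parameter means the same pipeline can be reused for "bios", "RAM", and any component the tool adds later.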