Python dynamic nested dictionary to CSV
The output below comes from a query result.
{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'pineapple': '2', 'grape': '0', 'apple': 'unknown'},'day': 'Tues', 'month': 'July', 'address': 'long', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}
{'_id': ObjectId('651f3e6e5723b7c1'), 'fruits': {'lemon': '2', 'grape': '0', 'apple': 'unknown', 'strawberry': '1'},'day': 'Mon', 'month': 'January', 'address': 'longer', 'buyer': 'B1001', 'seller': 'S1301', 'date': {'date': 210324}}
# worked, but not with fruits and a dynamic header
for q in db.fruits.aggregate(query):
    date = json.dumps(q['date'])  # convert it to a string
    date = re.split(r"(:|\}| )", date)[4]  # and split to get the value
    print('"' + q['day'] + '","' + q['month'] + '","' + date + '","' + q['time'] + '","' + q['buyer'] + '","' + q['seller'] + '"')
# below is close to what I want, but has issues with nested fields and repeated rows
ffile = open("fruits.csv", "w")
w = csv.DictWriter(ffile, q.keys())
w.writeheader()
w.writerow(q)
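I want to create a CSV from this.
I was able to get everything exactly as in the table below, except the fruits. I am stuck on the nested dictionary field and the dynamic table header.
Mongoexport does not work for me at the moment.
The fruits field can contain more, and different, nested keys and values each time.
I am still trying out csv.writer and attempting to add a condition for when I find a nested dictionary. [I will update with an answer if I manage to create the CSV]
Hints for creating this CSV would be appreciated. If anyone can share a link to a similar problem, thank you.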
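Not a problem!
We need to flatten the deep structure so that we can form a CSV from all the possible keys in it. That requires a recursive function (flatten_dict here) that takes an input dict and turns it into an output dict that contains no more dicts; here, the keys are tuples, e.g. ('foo', 'bar', 'baz').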
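To make the tuple-key shape concrete, here is a minimal sketch. The helper name iter_paths and the sample data are hypothetical and only for illustration; it is a generator-based variant of the same idea, not the flatten_dict used in this answer. Note that an empty dict or list yields no entries at all.

```python
from typing import Any, Iterator, Tuple


def iter_paths(value: Any, prefix: Tuple = ()) -> Iterator[Tuple[Tuple, Any]]:
    """Yield (path_tuple, leaf_value) pairs for a nested dict/list structure."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from iter_paths(child, prefix + (key,))
    elif isinstance(value, list):
        # List positions become path components, e.g. ('tags', 0).
        for index, child in enumerate(value):
            yield from iter_paths(child, prefix + (index,))
    else:
        # Anything that is not a dict or list is treated as a leaf value.
        yield prefix, value


# Hypothetical sample input, not taken from the question's data.
sample = {"foo": {"bar": {"baz": 1}}, "tags": ["a", "b"], "day": "Mon"}
flat = dict(iter_paths(sample))
print(flat)
# {('foo', 'bar', 'baz'): 1, ('tags', 0): 'a', ('tags', 1): 'b', ('day',): 'Mon'}
```

Every key in the result is a tuple, even for top-level fields such as ('day',), so the keys can later be joined with dots in a uniform way.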
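We run that function over all the input rows, gathering the keys we encounter into the known_keys set.
That set is sorted (since we assume the original dicts have no real intrinsic order either) and each key is dot-joined to form the CSV header row.
Then it is a simple matter of iterating over the flattened rows and writing them out (taking care to write an empty string for values that do not exist).
The output is, for example: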
_id,address,buyer,date.date,day,fruits.apple,fruits.grape,fruits.lemon,fruits.pineapple,fruits.strawberry,month,seller
651f3e6e5723b7c1,long,B1001,210324,Tues,unknown,0,,2,,July,S1301
651f3e6e5723b7c2,longer,B1001,210324,Mon,unknown,0,2,,1,January,S1301
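The empty cells in the sample output above are the keys a given row does not have. Since the question was experimenting with csv.DictWriter, here is a small alternative sketch of that step: DictWriter fills missing fields with its restval argument, which already defaults to an empty string and is passed explicitly here only for clarity. The flat_rows data with dot-joined string keys is a made-up, shortened stand-in, and io.StringIO stands in for a real file.

```python
import csv
import io

# Hypothetical, already-flattened rows with dot-joined string keys.
flat_rows = [
    {"day": "Tues", "fruits.pineapple": "2", "fruits.grape": "0"},
    {"day": "Mon", "fruits.lemon": "2", "fruits.grape": "0"},
]

# Union of every key seen in any row, sorted for a stable column order.
fieldnames = sorted(set().union(*flat_rows))

buffer = io.StringIO()  # stands in for a real file object
writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
writer.writeheader()  # written once, before any data rows
writer.writerows(flat_rows)

csv_text = buffer.getvalue()
print(csv_text)
```

Creating the writer and calling writeheader() once, outside any per-row loop, is what avoids the repeated header rows mentioned in the question.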
import csv
import sys

rows = [
    {
        "_id": "651f3e6e5723b7c1",
        "fruits": {"pineapple": "2", "grape": "0", "apple": "unknown"},
        "day": "Tues",
        "month": "July",
        "address": "long",
        "buyer": "B1001",
        "seller": "S1301",
        "date": {"date": 210324},
    },
    {
        "_id": "651f3e6e5723b7c2",
        "fruits": {
            "lemon": "2",
            "grape": "0",
            "apple": "unknown",
            "strawberry": "1",
        },
        "day": "Mon",
        "month": "January",
        "address": "longer",
        "buyer": "B1001",
        "seller": "S1301",
        "date": {"date": 210324},
    },
]
def flatten_dict(d: dict) -> dict:
    """
    Flatten hierarchical dicts into a dict of path tuples -> deep values.
    """
    out = {}

    def _flatten_into(into, pairs, prefix=()):
        for key, value in pairs:
            p_key = prefix + (key,)
            if isinstance(value, list):
                # List items are keyed by their index.
                _flatten_into(into, enumerate(value), p_key)
            elif isinstance(value, dict):
                _flatten_into(into, value.items(), p_key)
            else:
                into[p_key] = value

    _flatten_into(out, d.items())
    return out
known_keys = set()
flat_rows = []
for row in rows:
    flat_row = flatten_dict(row)
    known_keys |= set(flat_row.keys())
    flat_rows.append(flat_row)

ordered_keys = sorted(known_keys)
writer = csv.writer(sys.stdout)
writer.writerow([".".join(map(str, key)) for key in ordered_keys])
for flat_row in flat_rows:
    writer.writerow([str(flat_row.get(key, "")) for key in ordered_keys])
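The code above prints to sys.stdout, while the question wanted a file named fruits.csv. A minimal sketch of that last step follows, assuming the header and the data rows have already been built as plain lists. The function name write_csv, the shortened sample data, and the temporary-directory path are all hypothetical. The csv module documentation recommends opening the file with newline="" so the writer controls the line endings itself.

```python
import csv
import os
import tempfile

# Hypothetical, shortened header and rows in the shape the loop above produces.
header = ["_id", "day", "fruits.apple"]
data_rows = [
    ["651f3e6e5723b7c1", "Tues", "unknown"],
    ["651f3e6e5723b7c2", "Mon", ""],  # empty string for a missing value
]


def write_csv(path, header, data_rows):
    """Write one header row followed by the data rows to the file at path."""
    # newline="" prevents extra blank lines on platforms that translate "\n".
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        writer.writerows(data_rows)


# The temp directory is used here only so the sketch does not clutter the cwd.
out_path = os.path.join(tempfile.gettempdir(), "fruits.csv")
write_csv(out_path, header, data_rows)
```

Using a with block also closes the file reliably, which the open() call in the question's attempt never did.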