How to convert a MongoDB document into Python?
I have a MongoDB document that I want to convert into a pandas DataFrame.
db.dataset2.insert(
{
"user_id" : "user_3",
"order_id" : "order_3",
"order_lat" : -73.9557413, ## Order location
"order_long" : 40.7720266,
"order_time" : datetime.utcnow(),
"dish" : [
{
"dish_id" : "005" ,
"dish_name" : "Sandwich",
"dish_substitute" : "Yes",
"substitute_name" : "Null",
"dish_type" : "Veg", ## Binary response (Veg or Non-Veg)
"dish_price" : 50,
"dish_quantity" : 1,
"ratings" : 3,
"reviews" : "blah blah blah",
"home_chef_name" : "ghyty",
"expert_chef_name" : "abc" ,
"coupon_applied" : "Yes", ## Binary response (Yes or No)
"coupon_type" : "Rs 20 off"
},
{
"dish_id" : "006" ,
"dish_name" : "Chicken Hundi",
"dish_substitute" : "No",
"substitute_name" : "Null",
"dish_type" : "Non-Veg",
"dish_price" : 125,
"dish_quantity" : 1,
"ratings" : 3,
"reviews" : "blah blah blah",
"home_chef_name" : "rtyu",
"expert_chef_name" : "vbghy" ,
"coupon_applied" : "No",
"coupon_type" : "Null"
}
],
})
When I run the following:
df = pd.DataFrame(list(db.dataset2.find()))
it gives me this output:
_id \
0 566148e3691db01e0cac9d82
1 56615926691db01e0cac9d83
2 56615c64691db01e0cac9d84
dish order_id order_lat
0 [{u'dish_substitute': u'Yes', u'home_chef_name... order_1 -73.955741
1 [{u'dish_substitute': u'Yes', u'home_chef_name... order_2 -73.955741
2 [{u'dish_substitute': u'Yes', u'home_chef_name... order_3 -73.955741
order_long order_time user_id
0 40.772027 2015-12-04 08:03:47.658 user_1
1 40.772027 2015-12-04 09:13:10.642 user_2
2 40.772027 2015-12-04 09:27:00.497 user_3
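dish is a JSON array. When I convert the documents to a DataFrame, pandas adds a single dish column and puts the whole array under it. I want a flat DataFrame for data exploration, with one row per dish and the order fields repeated on each row. How can I do that? I would like the result in the following format: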
_id order_id order_lat order_long
0 566148e3691db01e0cac9d82 order_1 -73.955741 40.772027
1 566148e3691db01e0cac9d82 order_1 -73.955741 40.772027
order_time user_id coupon_applied coupon_type dish_id
0 2015-12-04 08:03:47.658 user_1 Yes Rs 20 off 001
1 2015-12-04 08:03:47.658 user_1 No Null 001
dish_name dish_price dish_quantity dish_substitute dish_type
0 Chicken Biryani 120 1 Yes Non-Veg
1 Paneer Biryani 100 1 Yes Veg
expert_chef_name home_chef_name ratings reviews substitute_name
0 abc xyx 4 blah blah blah Rice
1 abc abc 3 blah blah blah Paratha
Please help. Thanks in advance :)
You can create a temporary DataFrame from the records in df.dish and join it back onto the original df, like this:
df = pd.DataFrame(list(db.dataset2.find()))
tf = pd.DataFrame.from_records(df.dish)
df = df.join(tf)
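Because each dish value is a list of sub-documents, another way to get one row per dish, as in the desired format above, is pd.json_normalize (available in pandas 1.0 and later). The following is only a minimal sketch under some assumptions: the docs list is an in-memory stand-in for list(db.dataset2.find()), its values are abbreviated from the question, and only a few of the order fields are carried over via meta.

```python
import pandas as pd

# Assumption: stand-in for list(db.dataset2.find()), so the sketch runs
# without a MongoDB server. Values are abbreviated from the question.
docs = [
    {
        "_id": "566148e3691db01e0cac9d82",
        "user_id": "user_1",
        "order_id": "order_1",
        "dish": [
            {"dish_id": "001", "dish_name": "Chicken Biryani", "dish_price": 120},
            {"dish_id": "001", "dish_name": "Paneer Biryani", "dish_price": 100},
        ],
    },
    {
        "_id": "56615c64691db01e0cac9d84",
        "user_id": "user_3",
        "order_id": "order_3",
        "dish": [
            {"dish_id": "005", "dish_name": "Sandwich", "dish_price": 50},
            {"dish_id": "006", "dish_name": "Chicken Hundi", "dish_price": 125},
        ],
    },
]

# record_path names the nested list to unroll into rows;
# meta lists the order-level fields to repeat on every dish row.
flat = pd.json_normalize(
    docs,
    record_path="dish",
    meta=["_id", "user_id", "order_id"],
)

print(flat)
```

With the real collection, docs would be replaced by list(db.dataset2.find()) and meta extended with the remaining order fields (order_lat, order_long, order_time).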