Dataframe from a dict of lists of dicts?
I have a dict of lists of dicts. What is the most efficient way to convert it into a pandas DataFrame?
data = {
"0a2":[{"a":1,"b":1},{"a":1,"b":1,"c":1},{"a":1,"b":1}],
"279":[{"a":1,"b":1,"c":1},{"a":1,"b":1,"d":1}],
"ae2":[{"a":1,"b":1},{"a":1,"d":1},{"a":1,"b":1},{"a":1,"d":1}],
#...
}
import pandas as pd
pd.DataFrame(data, columns=["a","b","c","d"])
What I've tried:
One solution is to denormalize the data by duplicating the "id" key into every record, like this:
bad_data = [
{"a":1,"b":1,"id":"0a2"},{"a":1,"b":1,"c":1,"id":"0a2"},{"a":1,"b":1,"id":"0a2"},
{"a":1,"b":1,"c":1,"id":"279"},{"a":1,"b":1,"d":1,"id":"279"},
{"a":1,"b":1,"id":"ae2"},{"a":1,"d":1,"id":"ae2"},{"a":1,"b":1,"id":"ae2"},{"a":1,"d":1,"id":"ae2"}
]
pd.DataFrame(bad_data, columns=["a","b","c","d","id"])
But my data is very large, so I would prefer some other solution based on a hierarchical index.
IIUC, you can do (recommended):
new_df = pd.concat((pd.DataFrame(d) for d in data.values()), keys=data.keys())
Output:
       a    b    c    d
0a2 0  1  1.0  NaN  NaN
    1  1  1.0  1.0  NaN
    2  1  1.0  NaN  NaN
279 0  1  1.0  1.0  NaN
    1  1  1.0  NaN  1.0
ae2 0  1  1.0  NaN  NaN
    1  1  NaN  NaN  1.0
    2  1  1.0  NaN  NaN
    3  1  NaN  NaN  1.0
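Once the outer keys sit in the first index level, the per-key groups can be pulled back out without a separate "id" column. The following is a minimal sketch of that, not part of the original answer; the level names "id" and "row" are labels chosen here for illustration, and `list(data)` is used for `keys` only to pass a plain list.

```python
import pandas as pd

data = {
    "0a2": [{"a": 1, "b": 1}, {"a": 1, "b": 1, "c": 1}, {"a": 1, "b": 1}],
    "279": [{"a": 1, "b": 1, "c": 1}, {"a": 1, "b": 1, "d": 1}],
    "ae2": [{"a": 1, "b": 1}, {"a": 1, "d": 1}, {"a": 1, "b": 1}, {"a": 1, "d": 1}],
}

# Same concat as above; names= labels the two index levels.
# "id" and "row" are illustrative names, not from the original post.
new_df = pd.concat(
    (pd.DataFrame(records) for records in data.values()),
    keys=list(data),
    names=["id", "row"],
)

# All rows that came from one outer key; the result is indexed by "row" only.
rows_279 = new_df.loc["279"]

# Number of records per outer key.
sizes = new_df.groupby(level="id").size()
```

Selecting by the outer level with `.loc` or grouping with `groupby(level=...)` avoids storing the key once per row as a regular column.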
Or:
pd.concat(pd.DataFrame(v).assign(ID=k) for k,v in data.items())
Output:
   a    b    c   ID    d
0  1  1.0  NaN  0a2  NaN
1  1  1.0  1.0  0a2  NaN
2  1  1.0  NaN  0a2  NaN
0  1  1.0  1.0  279  NaN
1  1  1.0  NaN  279  1.0
0  1  1.0  NaN  ae2  NaN
1  1  NaN  NaN  ae2  1.0
2  1  1.0  NaN  ae2  NaN
3  1  NaN  NaN  ae2  1.0
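Note that in this second result the row labels repeat (0, 1, 2, 0, 1, ...), because each small frame keeps its own positions. If the hierarchical layout is wanted after all, one possible way to get there from the flat frame is sketched below; the column name "row" is an assumption made for this example.

```python
import pandas as pd

data = {
    "0a2": [{"a": 1, "b": 1}, {"a": 1, "b": 1, "c": 1}, {"a": 1, "b": 1}],
    "279": [{"a": 1, "b": 1, "c": 1}, {"a": 1, "b": 1, "d": 1}],
    "ae2": [{"a": 1, "b": 1}, {"a": 1, "d": 1}, {"a": 1, "b": 1}, {"a": 1, "d": 1}],
}

# Flat variant from the answer: one "ID" column, repeating row labels.
flat = pd.concat(pd.DataFrame(v).assign(ID=k) for k, v in data.items())

# Turn the repeating labels into a column called "row" (illustrative name),
# then index by (ID, row) to obtain a two-level index.
hier = (
    flat.rename_axis("row")
        .reset_index()
        .set_index(["ID", "row"])
)
```

If the per-list position does not matter, passing `ignore_index=True` to `pd.concat` instead gives a plain 0..n-1 index.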