Pandas and Dictionary: Convert Dict to DataFrame and use inner keys in values as DataFrame column headers
I have the following dictionary:
{
0: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}],
1: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}],
2: [{1: 0.21150571615476177}, {2: 0.20021993193784904}, {3: 0.24673408701244148}, {4: 0.26073319330403394}, {5: 0.0}, {6: 0.27012912297379343}, {7: 0.0}, {8: 0.0}],
3: [{1: 0.2786416467397351}, {2: 0.2006495239101905}, {3: 0.21600480247194567}, {4: 0.25724906204967557}, {5: 0.0}, {6: 0.26817162148227375}, {7: 0.0}, {8: 0.0}],
4: [{1: 0.2755030949011681}, {2: 0.20315735111595443}, {3: 0.21705903867972787}, {4: 0.2564000954604151}, {5: 0.0}, {6: 0.26903863724054405}, {7: 0.0}, {8: 0.0}],
5: [{1: 0.27334751895045045}, {2: 0.2012256178641117}, {3: 0.22266330432504813}, {4: 0.25925509529304697}, {6: 0.27562843736621906}],
6: [{1: 0.27739942084587565}, {2: 0.198682325880847}, {3: 0.2169017627591854}, {4: 0.25843774856843105}, {6: 0.26996683786070946}],
7: [{1: 0.2726461255684456}, {2: 0.19778567408338052}, {3: 0.2197858176643358}, {4: 0.26053721842016453}, {6: 0.26812789513005875}]
}
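How can I convert this dictionary to a Pandas DataFrame so that the inner keys in each value become the column headers for the corresponding row values?
Note that rows 5, 6 and 7 have no values for inner keys 5, 7 and 8, so I would like the resulting DataFrame to look like this: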
1 2 3 4 5 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.000000 0.000000 0.0 0.0
2 0.211651 0.202256 0.244509 0.256969 0.000000 0.275521 0.0 0.0
3 0.273670 0.199995 0.222494 0.256303 0.000000 0.275037 0.0 0.0
4 0.280948 0.200235 0.218654 0.256737 0.000000 0.276424 0.0 0.0
5 0.281718 0.197531 0.217461 0.256043 NaN 0.271181 NaN NaN
6 0.279024 0.200089 0.218020 0.261419 NaN 0.272113 NaN NaN
7 0.278222 0.203448 0.219254 0.261846 NaN 0.269600 NaN NaN
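(The values are arbitrary; it does not matter what they are.)
Beyond knowing that I can write the result to a CSV file with to_csv(), I have no starting point.
Any help is appreciated. Thanks in advance.
(Using an Ubuntu 14.04 32-bit VM and Python 2.7)
P.S. A similar question went unanswered because its wording was unclear and confused other users. It has since been deleted.
I hope this question is clear and precise.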
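Use concat with a dict comprehension, then apply a small trick: sum over the second level of the columns, which merges the columns that share an inner key and joins all their non-NaN values together: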
df = pd.concat({k: pd.DataFrame(v) for k,v in d.items()}, 1).stack().T.sum(level=1, axis=1)
print (df)
1 2 3 4 5 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.0 0.000000 0.0 0.0
2 0.211506 0.200220 0.246734 0.260733 0.0 0.270129 0.0 0.0
3 0.278642 0.200650 0.216005 0.257249 0.0 0.268172 0.0 0.0
4 0.275503 0.203157 0.217059 0.256400 0.0 0.269039 0.0 0.0
5 0.273348 0.201226 0.222663 0.259255 NaN 0.275628 NaN NaN
6 0.277399 0.198682 0.216902 0.258438 NaN 0.269967 NaN NaN
7 0.272646 0.197786 0.219786 0.260537 NaN 0.268128 NaN NaN
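On recent pandas releases the one-liner above may not run as written, since (to my knowledge) the level argument of sum and the positional axis argument of concat were removed in pandas 2.0. The sketch below is an alternative, not the answer's method: it merges each list of single-key dicts into one dict per row and passes the result to DataFrame.from_dict. It assumes a shortened two-row stand-in for the dictionary d, and merge_row is a hypothetical helper name, not part of pandas.

```python
import pandas as pd

# Shortened stand-in for the dictionary `d` from the question (assumed data).
d = {
    4: [{1: 0.27}, {2: 0.20}, {5: 0.0}, {6: 0.26}],
    5: [{1: 0.27}, {2: 0.20}, {6: 0.27}],
}


def merge_row(items):
    """Merge a list of single-key dicts into one dict (hypothetical helper)."""
    merged = {}
    for item in items:
        merged.update(item)
    return merged


# One merged dict per outer key; orient="index" turns outer keys into row labels
# and the union of inner keys into columns, leaving NaN where a key is missing.
df = pd.DataFrame.from_dict(
    {k: merge_row(v) for k, v in d.items()}, orient="index"
)

# Sort rows and columns so the layout does not depend on insertion order.
df = df.sort_index().sort_index(axis=1)
print(df)
```

Row 5 has no entry for inner key 5, so that cell comes out as NaN, which is the behaviour the question asks for.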
Details:
print (pd.concat({k: pd.DataFrame(v) for k,v in d.items()}, 1).stack().T)
0 1 2 3 4 5 6 7
1 2 3 4 5 6 6 7 8
0 0.000000 0.000000 0.000000 0.000000 0.0 NaN 0.000000 0.0 0.0
1 0.000000 0.000000 0.000000 0.000000 0.0 NaN 0.000000 0.0 0.0
2 0.211506 0.200220 0.246734 0.260733 0.0 NaN 0.270129 0.0 0.0
3 0.278642 0.200650 0.216005 0.257249 0.0 NaN 0.268172 0.0 0.0
4 0.275503 0.203157 0.217059 0.256400 0.0 NaN 0.269039 0.0 0.0
5 0.273348 0.201226 0.222663 0.259255 NaN 0.275628 NaN NaN NaN
6 0.277399 0.198682 0.216902 0.258438 NaN 0.269967 NaN NaN NaN
7 0.272646 0.197786 0.219786 0.260537 NaN 0.268128 NaN NaN NaN
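The intermediate frame shows why the sum is needed: inner key 6 appears twice in the second column level, once under list position 4 and once under position 5. The toy example below sketches that collapsing step in a form that should also work on recent pandas, using groupby on the transposed frame instead of sum with a level argument. The frame wide and its values are made-up stand-ins, and the use of min_count=1 to keep NaN for empty groups is my assumption about the desired result.

```python
import numpy as np
import pandas as pd

# Toy frame shaped like the intermediate result: the column MultiIndex is
# (list position, inner key), and inner key 6 appears under two positions.
cols = pd.MultiIndex.from_tuples([(4, 5), (4, 6), (5, 6)])
wide = pd.DataFrame(
    [[0.0, np.nan, 0.27], [np.nan, 0.28, np.nan]],
    index=[4, 5],
    columns=cols,
)

# Group the transposed frame by the inner-key level, sum, and transpose back.
# min_count=1 keeps NaN where a group has no values at all; the default
# (min_count=0) would report 0.0 for such groups instead.
collapsed = wide.T.groupby(level=1).sum(min_count=1).T
print(collapsed)
```

After this step each inner key has a single column, and a cell is NaN only when no list entry supplied a value for that key in that row.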