Pandas and Dictionary: Convert Dict to DataFrame and use inner keys in values as DataFrame column headers

I have the following dictionary:

{
 0: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}], 
 1: [{1: 0.0}, {2: 0.0}, {3: 0.0}, {4: 0.0}, {5: 0.0}, {6: 0.0}, {7: 0.0}, {8: 0.0}], 
 2: [{1: 0.21150571615476177}, {2: 0.20021993193784904}, {3: 0.24673408701244148}, {4: 0.26073319330403394}, {5: 0.0}, {6: 0.27012912297379343}, {7: 0.0}, {8: 0.0}], 
 3: [{1: 0.2786416467397351}, {2: 0.2006495239101905}, {3: 0.21600480247194567}, {4: 0.25724906204967557}, {5: 0.0}, {6: 0.26817162148227375}, {7: 0.0}, {8: 0.0}], 
 4: [{1: 0.2755030949011681}, {2: 0.20315735111595443}, {3: 0.21705903867972787}, {4: 0.2564000954604151}, {5: 0.0}, {6: 0.26903863724054405}, {7: 0.0}, {8: 0.0}], 
 5: [{1: 0.27334751895045045}, {2: 0.2012256178641117}, {3: 0.22266330432504813}, {4: 0.25925509529304697}, {6: 0.27562843736621906}], 
 6: [{1: 0.27739942084587565}, {2: 0.198682325880847}, {3: 0.2169017627591854}, {4: 0.25843774856843105}, {6: 0.26996683786070946}], 
 7: [{1: 0.2726461255684456}, {2: 0.19778567408338052}, {3: 0.2197858176643358}, {4: 0.26053721842016453}, {6: 0.26812789513005875}]
}  

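How can I convert this dictionary to a Pandas DataFrame so that the inner keys in each value become the column headers for the corresponding row values?
Note that in rows 5, 6 and 7 the values for inner keys 5, 7 and 8 are missing, which means I would like the DataFrame to look like this: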

          1         2         3         4         5         6    7    8
0  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.0  0.0
1  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.0  0.0
2  0.211651  0.202256  0.244509  0.256969  0.000000  0.275521  0.0  0.0
3  0.273670  0.199995  0.222494  0.256303  0.000000  0.275037  0.0  0.0
4  0.280948  0.200235  0.218654  0.256737  0.000000  0.276424  0.0  0.0
5  0.281718  0.197531  0.217461  0.256043       NaN  0.271181  NaN  NaN
6  0.279024  0.200089  0.218020  0.261419       NaN  0.272113  NaN  NaN
7  0.278222  0.203448  0.219254  0.261846       NaN  0.269600  NaN  NaN  

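(The values are arbitrary; it doesn't matter what they are.)
Other than knowing that I can write the result to a CSV file with DataFrame.to_csv(), I have no starting point.
Any help is appreciated. Thanks in advance.
(Using an Ubuntu 14.04 32-bit VM and Python 2.7)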

P.S. A similar question went unanswered because its wording confused other users; it has since been deleted.
I hope this question is clear and precise.

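Use concat with a dict comprehension, then apply a small trick: sum over the second level of the column MultiIndex, so that the non-NaN values for each inner key end up in a single column: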

import pandas as pd
# d is the dictionary from the question
df = pd.concat({k: pd.DataFrame(v) for k, v in d.items()}, axis=1).stack().T.sum(level=1, axis=1)
print(df)
          1         2         3         4    5         6    7    8
0  0.000000  0.000000  0.000000  0.000000  0.0  0.000000  0.0  0.0
1  0.000000  0.000000  0.000000  0.000000  0.0  0.000000  0.0  0.0
2  0.211506  0.200220  0.246734  0.260733  0.0  0.270129  0.0  0.0
3  0.278642  0.200650  0.216005  0.257249  0.0  0.268172  0.0  0.0
4  0.275503  0.203157  0.217059  0.256400  0.0  0.269039  0.0  0.0
5  0.273348  0.201226  0.222663  0.259255  NaN  0.275628  NaN  NaN
6  0.277399  0.198682  0.216902  0.258438  NaN  0.269967  NaN  NaN
7  0.272646  0.197786  0.219786  0.260537  NaN  0.268128  NaN  NaN
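
Recent pandas releases (2.0 and later, as far as I know) no longer accept `sum(level=...)` or a positional `axis` argument to `concat`, so the one-liner above may fail there. What follows is a minimal sketch of the same idea for newer versions, not the original answer's code. It assumes each inner key appears at most once per row, and the small dictionary `d` is made-up sample data rather than the data from the question.

```python
import pandas as pd

# Made-up sample in the same shape as the question's dictionary.
d = {
    0: [{1: 0.0}, {2: 0.5}, {3: 0.25}],
    1: [{1: 0.75}, {3: 1.0}],  # inner key 2 is missing in this row
}

# One frame per outer key, placed side by side; the columns become a
# MultiIndex of (outer key, inner key).
wide = pd.concat({k: pd.DataFrame(v) for k, v in d.items()}, axis=1)

# Each column holds a single non-NaN value, so summing it returns that value.
# min_count=1 keeps an all-NaN column as NaN instead of 0.
values = wide.sum(min_count=1)

# Move the inner key into the columns; missing combinations become NaN.
df = values.unstack()
print(df)
```

The result has the outer keys as the index and the inner keys as the columns, with NaN wherever a row had no entry for a given inner key.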

Details:

print(pd.concat({k: pd.DataFrame(v) for k, v in d.items()}, axis=1).stack().T)
          0         1         2         3    4                   5    6    7
          1         2         3         4    5         6         6    7    8
0  0.000000  0.000000  0.000000  0.000000  0.0       NaN  0.000000  0.0  0.0
1  0.000000  0.000000  0.000000  0.000000  0.0       NaN  0.000000  0.0  0.0
2  0.211506  0.200220  0.246734  0.260733  0.0       NaN  0.270129  0.0  0.0
3  0.278642  0.200650  0.216005  0.257249  0.0       NaN  0.268172  0.0  0.0
4  0.275503  0.203157  0.217059  0.256400  0.0       NaN  0.269039  0.0  0.0
5  0.273348  0.201226  0.222663  0.259255  NaN  0.275628       NaN  NaN  NaN
6  0.277399  0.198682  0.216902  0.258438  NaN  0.269967       NaN  NaN  NaN
7  0.272646  0.197786  0.219786  0.260537  NaN  0.268128       NaN  NaN  NaN
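
A different route that avoids the MultiIndex entirely, offered as a sketch rather than as part of the answer above: merge each row's list of single-key dicts into one dict, then let `DataFrame.from_dict` with `orient="index"` line the keys up as columns. It assumes no inner key repeats within a row (a later duplicate would overwrite an earlier one). The sample `d` and the name `rows` are made up for illustration.

```python
import pandas as pd

# Made-up sample in the same shape as the question's dictionary.
d = {
    0: [{1: 0.0}, {2: 0.0}, {3: 0.5}],
    1: [{1: 0.25}, {3: 0.75}],  # inner key 2 is missing in this row
}

# Flatten each list of single-key dicts into one dict per row:
# {row label: {column label: value}}
rows = {
    k: {col: val for item in v for col, val in item.items()}
    for k, v in d.items()
}

# orient="index" treats the outer keys as row labels; inner keys that are
# missing from a row are filled with NaN. Sorting makes the order explicit.
df = pd.DataFrame.from_dict(rows, orient="index").sort_index(axis=0).sort_index(axis=1)
print(df)
```

From here, `df.to_csv(...)` writes the frame to a CSV file, which is the step the question mentions.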
