Make Pandas dataframe from dict of dict that contain index mapped to value
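I have a dict of dicts that I am trying to turn into a Pandas DataFrame. The dict maps each row index to a dict that maps column indexes to their values, and I want every other cell in the DataFrame to be 0. For example: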
d = {0: {0: 2, 2: 5},
     1: {1: 1, 3: 2},
     2: {2: 5}}
So I would expect the DataFrame to look like
index c0 c1 c2 c3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
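I am currently planning to write a function that yields a tuple for each item in d and to use it as the iterable for creating the DataFrame, but I am interested in whether anyone else has done something similar.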
Simply call DataFrame.from_dict:
pd.DataFrame.from_dict(d, orient='index').sort_index(axis=1)
0 1 2 3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
Well, why not build it the usual way and transpose it:
>>> pd.DataFrame(d).T
0 1 2 3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
>>>
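After timing the other suggestions, I found that my original approach is much faster. I am using the following function to build the iterator that is passed to pd.DataFrame: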
import pandas as pd

def row_factory(index_data, row_len):
    """
    Make a generator that yields one row per item in index_data.

    Parameters:
        index_data (dict): a dict mapping a key to a dict of column indexes mapped to values.
            All indexes not in the second dict are assumed to be None.
        row_len (int): length of each row, including the leading key

    Example:
        index_data = {0: {0: 2, 2: 1}, 1: {1: 1}} with row_len=4 would yield
        [0, 2, None, 1] then [1, None, 1, None]
    """
    for key, data in index_data.items():
        # Initialize the row with the key first, then None for each value
        row = [key] + [None] * (row_len - 1)
        for index, value in data.items():
            # Only replace indexes that have a value; offset by one because slot 0 holds the key
            row[index + 1] = value
        yield row

# row_len is 5: one slot for the key plus column indexes 0 through 3
df = pd.DataFrame(row_factory(d, 5))