I have a dict of dicts that I'm trying to make into a Pandas DataFrame. The dict is structured so that each index maps to a dict that maps column indexes to their values, and I want everything else in the DataFrame to be 0. For example:
d = {0: {0:2, 2:5},
     1: {1:1, 3:2},
     2: {2:5}}
So then I want the DataFrame to look like
index c0 c1 c2 c3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
I am currently planning on writing a function that will yield a tuple from each item in d and using that as an iterable for creating the DataFrame, but I am interested in whether anyone else has done anything similar.
Simply call DataFrame.from_dict:
pd.DataFrame.from_dict(d,'index').sort_index(axis=1)
0 1 2 3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
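The question asks for 0 in the empty cells and for labels like c0, whereas the output above keeps NaN and integer labels. A minimal sketch of one way to get there, assuming all values are integers and that the "c" prefix is only illustrative. Sorting both axes explicitly is a precaution, since the order of rows and columns may depend on the pandas version:

```python
import pandas as pd

d = {0: {0: 2, 2: 5},
     1: {1: 1, 3: 2},
     2: {2: 5}}

df = (
    pd.DataFrame.from_dict(d, orient="index")
    .sort_index(axis=0)   # rows in key order
    .sort_index(axis=1)   # columns in numeric order
    .fillna(0)            # missing cells become 0 instead of NaN
    .astype(int)          # assumption: every value is an integer
    .add_prefix("c")      # illustrative labels c0..c3, as in the question
)
print(df)
```

Dropping the astype(int) step keeps the float dtype that the NaN values forced on the columns.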
Why not build it the regular way and transpose the result:
>>> pd.DataFrame(d).T
0 1 2 3
0 2.0 NaN 5.0 NaN
1 NaN 1.0 NaN 2.0
2 NaN NaN 5.0 NaN
>>>
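A quick way to check that the transpose and from_dict give the same frame is sketched below. It assumes the same d as in the question and sorts both axes first, because the label order may follow the order in which keys are first seen, depending on the pandas version:

```python
import pandas as pd

d = {0: {0: 2, 2: 5},
     1: {1: 1, 3: 2},
     2: {2: 5}}

by_index = pd.DataFrame.from_dict(d, orient="index").sort_index(axis=0).sort_index(axis=1)
transposed = pd.DataFrame(d).T.sort_index(axis=0).sort_index(axis=1)

# NaN never compares equal to NaN, so fill it before comparing the values
same = (
    by_index.index.tolist() == transposed.index.tolist()
    and by_index.columns.tolist() == transposed.columns.tolist()
    and by_index.fillna(0).values.tolist() == transposed.fillna(0).values.tolist()
)
print(same)
```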
After timing the other suggestions, I found my original method was a lot faster. I am using the following function to make an iterator that I pass into pd.DataFrame:
import pandas as pd

def row_factory(index_data, row_len):
    """
    Make a generator that yields one row per item of index_data

    Parameters:
        index_data (dict): a dict mapping each row key to a dict of column index mapped to value.
            All column indexes not in the second dict are left as None.
        row_len (int): length of each row, including the leading key

    Example:
        index_data = {0: {0:2, 2:1}, 1: {1:1}} with row_len=4 would yield
        [0, 2, None, 1] then [1, None, 1, None]
    """
    for key, data in index_data.items():
        # Initialize row with the key first, then None for each value
        row = [key] + [None] * (row_len - 1)
        for index, value in data.items():
            # Only replace indexes that have a value; offset by 1 because slot 0 holds the key
            row[index + 1] = value
        yield row

# row_len is 5: one slot for the key plus four columns (0 through 3)
df = pd.DataFrame(row_factory(d, 5))
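The row length does not have to be hard-coded. A sketch of a variant that derives the width from the data, fills gaps with 0 as the question asked, and moves the key into the index. The helper name iter_rows and the "index" column label are made up for illustration, and the code assumes the column keys are non-negative integers:

```python
import pandas as pd

d = {0: {0: 2, 2: 5},
     1: {1: 1, 3: 2},
     2: {2: 5}}

def iter_rows(index_data, n_cols, fill=None):
    """Yield [key, v0, v1, ...] for each outer key (hypothetical helper)."""
    for key, data in index_data.items():
        row = [key] + [fill] * n_cols
        for col, value in data.items():
            row[col + 1] = value  # offset by 1: slot 0 holds the key
        yield row

# Assumption: column keys are non-negative integers, so width = largest key + 1
n_cols = max(col for data in d.values() for col in data) + 1

df = pd.DataFrame(
    list(iter_rows(d, n_cols, fill=0)),
    columns=["index"] + list(range(n_cols)),
).set_index("index")
print(df)
```

Since every cell gets a value up front, the columns stay integer-typed and no fillna step is needed afterwards.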