
Create MultiIndex pandas DataFrame from dictionary with tuple keys

I'd like to efficiently create a pandas DataFrame from a Python collections.Counter dictionary, but there's an additional requirement.

The Counter dictionary looks like this:

(a, b) : 5
(c, d) : 7
(a, d) : 2

The dictionary keys are tuples: the first element should become the row label and the second the column label of the DataFrame.

The resulting DataFrame should look like this:

   b  d
a  5  2
c  0  7

For larger data I don't want to build the DataFrame incrementally with assignments like df[a][b] = 5, because that is very inefficient: as I understand it, each such extension creates a copy of the DataFrame.

Perhaps the right answer is to go via a NumPy array?

I would create a Series using MultiIndex.from_tuples and then unstack it.

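import pandas as pd

keys, values = zip(*counter.items())   # tuple keys and their counts
idx = pd.MultiIndex.from_tuples(keys)  # (row, column) pairs as a two-level index

# Move the inner level into columns; pairs that never occurred become 0.
pd.Series(values, index=idx).unstack(-1, fill_value=0)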

   b  d
a  5  2
c  0  7
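One caveat with the snippet above: unpacking zip(*counter.items()) raises a ValueError when the Counter is empty. A minimal sketch of the same approach wrapped in a helper that guards against that case follows; the name counter_to_frame and the sample pairs are made up for illustration and are not part of pandas.

```python
from collections import Counter

import pandas as pd


def counter_to_frame(counter):
    """Turn a Counter keyed by (row, column) tuples into a 2-D table.

    counter_to_frame is a hypothetical helper name, not a pandas function.
    """
    if not counter:
        # zip(*counter.items()) cannot be unpacked into two names when empty.
        return pd.DataFrame()
    keys, values = zip(*counter.items())
    idx = pd.MultiIndex.from_tuples(keys)
    # The last index level becomes the columns; missing pairs are filled with 0.
    return pd.Series(values, index=idx).unstack(-1, fill_value=0)


# Illustrative input: raw (row, column) pairs counted with Counter.
pairs = [("a", "b")] * 5 + [("c", "d")] * 7 + [("a", "d")] * 2
table = counter_to_frame(Counter(pairs))
print(table)
```

Returning an empty DataFrame for an empty Counter is only one possible choice; raising an error may suit some callers better.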

Using the DataFrame constructor with stack:

pd.DataFrame(counter, index=[0]).stack().loc[0].T

     b    d
a  5.0  2.0
c  NaN  7.0
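The output above has NaN for the pair that was never counted, which also turns the counts into floats. If zeros and integer counts are wanted, as in the question, one option is to fill and cast afterwards. This is a rough sketch that assumes counter holds the three pairs from the question:

```python
from collections import Counter

import pandas as pd

# Assumed input: the three pairs from the question.
counter = Counter({("a", "b"): 5, ("c", "d"): 7, ("a", "d"): 2})

# Tuple keys become two-level columns on a single row labelled 0.
wide = pd.DataFrame(counter, index=[0])

# Same chain as above: stack the inner column level, select row 0, transpose.
table = wide.stack().loc[0].T

# Uncounted pairs are NaN, which forces a float dtype;
# replace them with 0 and cast back to integers.
table = table.fillna(0).astype(int)
print(table)
```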

Using a Series with unstack (a dict with tuple keys already yields a MultiIndex):

pd.Series(d).unstack(fill_value=0)

   b  d
a  5  2
c  0  7

Input data

d = {('a', 'b'): 5,
     ('c', 'd'): 7,
     ('a', 'd'): 2}
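For comparison, the NumPy route that the question wonders about could look roughly like this. It is a sketch under the assumption that sorted row and column labels are acceptable; the variable names are illustrative. It allocates the zero-filled grid once and writes each count into place, so the DataFrame is built in a single step rather than grown cell by cell.

```python
from collections import Counter

import numpy as np
import pandas as pd

# Assumed input: the three pairs from the question.
counter = Counter({("a", "b"): 5, ("c", "d"): 7, ("a", "d"): 2})

# Collect the distinct row and column labels (sorted here for a stable order).
rows = sorted({row for row, _ in counter})
cols = sorted({col for _, col in counter})

# Map each label to its position in the grid.
row_pos = {label: i for i, label in enumerate(rows)}
col_pos = {label: j for j, label in enumerate(cols)}

# Allocate the whole grid once, filled with zeros, then write each count.
grid = np.zeros((len(rows), len(cols)), dtype=int)
for (row, col), count in counter.items():
    grid[row_pos[row], col_pos[col]] = count

df = pd.DataFrame(grid, index=rows, columns=cols)
print(df)
```

Whether this beats the unstack versions above would need to be measured on the actual data.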


 