Create MultiIndex pandas DataFrame from dictionary with tuple keys

Question

I'd like to efficiently create a pandas DataFrame from a Python collections.Counter dictionary, but there's an additional requirement.

The Counter dictionary looks like this:

(a, b) : 5
(c, d) : 7
(a, d) : 2

The dictionary keys are tuples: the first element should become the row label and the second the column label of the DataFrame.

The resulting DataFrame should look like this:

   b  d
a  5  2
c  0  7

For larger data I don't want to build the DataFrame incrementally with assignments such as df[a][b] = 5, because that is very inefficient: I'm led to believe each such extension creates a copy of the DataFrame.

Perhaps the right answer is to go via a NumPy array?

Answer 1

I would create a Series using MultiIndex.from_tuples and then unstack it.

import pandas as pd

# counter is the collections.Counter from the question
keys, values = zip(*counter.items())
idx = pd.MultiIndex.from_tuples(keys)

pd.Series(values, index=idx).unstack(-1, fill_value=0)

   b  d
a  5  2
c  0  7
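For a self-contained run of the snippet above, a minimal sketch follows. The sample values come from the question; building counter from a literal dict and the name result are assumptions made for illustration.

```python
from collections import Counter

import pandas as pd

# Sample data from the question; constructing it from a literal is an assumption.
counter = Counter({("a", "b"): 5, ("c", "d"): 7, ("a", "d"): 2})

# Separate the (row, column) tuple keys from their counts.
keys, values = zip(*counter.items())
idx = pd.MultiIndex.from_tuples(keys)

# Move the innermost index level into the columns; absent pairs become 0.
result = pd.Series(values, index=idx).unstack(-1, fill_value=0)
print(result)
```

Because fill_value=0 is applied during the unstack, the counts stay integers and no separate fillna step is needed.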

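Alternatively, use the DataFrame constructor with stack. Note that this leaves the missing combination as NaN and turns the values into floats: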

pd.DataFrame(counter, index=[0]).stack().loc[0].T

     b    d
a  5.0  2.0
c  NaN  7.0
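If the stack route is preferred, the NaN can be filled afterwards to match the output the question asks for. This is only a sketch: the intermediate names wide, stacked and result are illustrative, and the fillna(0).astype(int) step is an addition, not part of the original one-liner.

```python
import pandas as pd

# Sample data from the question.
d = {("a", "b"): 5, ("c", "d"): 7, ("a", "d"): 2}

# Tuple keys become a two-level column index on a single-row frame.
wide = pd.DataFrame(d, index=[0])

# stack() moves the inner column level into the rows;
# .loc[0] drops the dummy row label, and .T puts a/c on the rows.
stacked = wide.stack().loc[0].T

# Fill the missing (c, b) cell and restore an integer dtype.
result = stacked.fillna(0).astype(int)
print(result)
```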

Answer 2

Using a Series with unstack (d is defined under "Input data" below):

pd.Series(d).unstack(fill_value=0)

   b  d
a  5  2
c  0  7

Input data

d = {('a', 'b'): 5,
     ('c', 'd'): 7,
     ('a', 'd'): 2}
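The question also asks whether going via a NumPy array would work. As a rough sketch of that route (not taken from either answer), the labels can be mapped to integer positions with np.unique and a preallocated array filled in one assignment. The names row_labels, row_pos, col_labels, col_pos and grid are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
d = {("a", "b"): 5, ("c", "d"): 7, ("a", "d"): 2}

# Split the tuple keys into row labels and column labels.
rows, cols = zip(*d.keys())

# np.unique returns the sorted distinct labels plus, for each key,
# its position among those labels.
row_labels, row_pos = np.unique(rows, return_inverse=True)
col_labels, col_pos = np.unique(cols, return_inverse=True)

# Allocate the full grid once and fill every known cell in one assignment.
grid = np.zeros((len(row_labels), len(col_labels)), dtype=int)
grid[row_pos, col_pos] = list(d.values())

result = pd.DataFrame(grid, index=row_labels, columns=col_labels)
print(result)
```

This allocates a dense array, so it only suits data where the full rows-by-columns grid fits in memory; the unstack approaches above are shorter and produce the same table.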

2 answers

solution1
6 2019-01-18 17:41:17

solution2
6 ACCEPTED 2019-01-18 17:47:58
