
Creating Pivot DataFrame using Multiple Columns in Pandas

I have a pandas dataframe following the form in the example below:

import pandas as pd
data = {'id': [1,1,1,1,2,2,2,2,3,3,3], 'a': [-1,1,1,0,0,0,-1,1,-1,0,0], 'b': [1,0,0,-1,0,1,1,-1,-1,1,0]}
df = pd.DataFrame(data)

What I want is a pivot table in which every column other than id is expanded into three new columns, one per value. For column a, that means a_neg, a_zero and a_pos; for b, it means b_neg, b_zero and b_pos. Each new column holds the number of times the corresponding value (-1, 0 or 1) appears in the original column for that id. The final dataframe should look like this:

result = {'id': [1,2,3], 'a_neg': [1, 1, 1],
      'a_zero': [1, 2, 2], 'a_pos': [2, 1, 0],
      'b_neg': [1, 1, 1], 'b_zero': [2,1,1], 'b_pos': [1,2,1]}
df_result = pd.DataFrame(result)

Now, to do this, I can do the following steps and arrive at my final answer:

by_a = df.groupby(['id', 'a']).count().reset_index().pivot(index='id', columns='a', values='b').fillna(0).astype(int)
by_a.columns = ['a_neg', 'a_zero', 'a_pos']

by_b = df.groupby(['id', 'b']).count().reset_index().pivot(index='id', columns='b', values='a').fillna(0).astype(int)
by_b.columns = ['b_neg', 'b_zero', 'b_pos']

df_result = by_a.join(by_b).reset_index()

However, I believe this method is not optimal, especially if I have many original columns besides a and b. Is there a shorter and/or more efficient way to get this result? Thanks.

A shorter solution, though still quite inefficient:

In [11]: df1 = df.set_index("id")

In [12]: g = df1.groupby(level=0)

In [13]: g.apply(lambda x: x.apply(lambda x: x.value_counts())).fillna(0).astype(int).unstack(1)
Out[13]:
    a        b
   -1  0  1 -1  0  1
id
1   1  1  2  1  2  1
2   1  2  1  1  1  2
3   1  2  0  1  1  1

Note: I think you should be aiming for the multi-index columns.
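If the flat names from the question are still needed, the two-level columns can be collapsed afterwards. A minimal sketch, assuming the counts shown in the output above and a hypothetical suffix mapping (-1 to neg, 0 to zero, 1 to pos) taken from the column names in the question; the names counts, suffix and flat are illustrative:

```python
import pandas as pd

# Counts with two-level columns, copied from the output above.
counts = pd.DataFrame(
    [[1, 1, 2, 1, 2, 1], [1, 2, 1, 1, 1, 2], [1, 2, 0, 1, 1, 1]],
    index=pd.Index([1, 2, 3], name='id'),
    columns=pd.MultiIndex.from_product([['a', 'b'], [-1, 0, 1]]),
)

# Hypothetical mapping from value to suffix, based on the question's naming.
suffix = {-1: 'neg', 0: 'zero', 1: 'pos'}

# Join each (column, value) pair into a single flat label such as 'a_neg'.
flat = counts.copy()
flat.columns = [f'{col}_{suffix[val]}' for col, val in counts.columns]

# Turn the id index back into an ordinary column.
flat = flat.reset_index()
```

Flattening loses the grouping by original column, so it is probably best done as a last step, after any per-column selection such as counts['a'].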


I'm reasonably sure I've seen a trick that replaces the apply/value_counts/fillna chain with something cleaner and more efficient, but at the moment it eludes me...
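One candidate for a cleaner route (not necessarily the trick alluded to above) is to reshape the frame to long form with melt and count with pd.crosstab, which fills absent combinations with 0 on its own. A minimal sketch, assuming the sample data from the question; the names long_df, col, val and counts are illustrative:

```python
import pandas as pd

data = {'id': [1, 1, 1, 1, 2, 2, 2, 2, 3, 3, 3],
        'a': [-1, 1, 1, 0, 0, 0, -1, 1, -1, 0, 0],
        'b': [1, 0, 0, -1, 0, 1, 1, -1, -1, 1, 0]}
df = pd.DataFrame(data)

# Long form: one row per (id, original column name, value).
long_df = df.melt(id_vars='id', var_name='col', value_name='val')

# Count each (col, val) pair per id. The result has two-level columns,
# and pairs that never occur for an id are reported as 0.
counts = pd.crosstab(long_df['id'], [long_df['col'], long_df['val']])
```

Because melt picks up every column other than id, this should extend to additional value columns without per-column code. Whether it is actually faster than the nested apply depends on the data and is worth timing.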
