Converting columns to rows in Python pandas

I have a data frame like this

 id  a  b  c
101  0  3  0
102  2  0  5
103  0  1  4

and I want something like this

 id  letter  num
101     a     0
101     b     3
101     c     0
102     a     2
102     b     0
102     c     5
103     a     0
103     b     1
103     c     4

I want each column name to become a value in a row, paired with its corresponding id and with the value it held in the original df.

I tried building this in a loop, inserting each element according to its id, but the result is horrible. Is there an easy way to do this?

You could melt and then sort:

>>> pd.melt(df, id_vars='id', value_vars=['a','b','c'], 
            var_name='letter', value_name='num').sort_values('id')
    id letter  num
0  101      a    0
3  101      b    3
6  101      c    0
1  102      a    2
4  102      b    0
7  102      c    5
2  103      a    0
5  103      b    1
8  103      c    4

If you want to reset the index, you could always use .reset_index(drop=True) on the returned DataFrame.
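
As a self-contained sketch of the melt approach above (the sample frame is rebuilt from the question; the name long_df is only illustrative), sorting on both id and letter makes the row order within each id explicit instead of relying on how ties happen to be ordered:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({'id': [101, 102, 103],
                   'a': [0, 2, 0],
                   'b': [3, 0, 1],
                   'c': [0, 5, 4]})

# Wide -> long, then sort by id and letter and renumber the index from 0.
long_df = (
    pd.melt(df, id_vars='id', value_vars=['a', 'b', 'c'],
            var_name='letter', value_name='num')
      .sort_values(['id', 'letter'])
      .reset_index(drop=True)
)

print(long_df)
```

With this data the rows should come out in the order requested in the question, indexed 0 through 8.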

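You probably want to do something along the lines of the following. This assumes id is the index of df; if it is a regular column, call df.set_index('id') first.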

df2 = df.stack().reset_index(1)

then rename the columns

df2.columns = ['letter', 'num']

df.stack() produces a series with a multi-level index:

In [5]: df.stack()
Out[5]: 
id    
101  a    0
     b    3
     c    0
102  a    2
     b    0
     c    5
103  a    0
     b    1
     c    4
dtype: int64

You then want to break this multi-level index up into columns of their own. Using reset_index(1) moves only the inner level into a column and leaves id as the index. Note, however, that the resulting columns get default names (level_1 for the former index level and 0 for the values).

In [6]: df.stack().reset_index(1)
Out[6]: 
    level_1  0
id            
101       a  0
101       b  3
101       c  0
102       a  2
102       b  0
102       c  5
103       a  0
103       b  1
103       c  4

So rename them:

In [8]: df2.columns = ['letter', 'num']

In [9]: df2
Out[9]: 
    letter  num
id             
101      a    0
101      b    3
101      c    0
102      a    2
102      b    0
102      c    5
103      a    0
103      b    1
103      c    4
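
If id starts out as a regular column, as in the question, the same idea might look like the sketch below. The names stacked and stacked_df are only illustrative, and naming the index levels before resetting them is one possible way to avoid the separate renaming step, not the only one:

```python
import pandas as pd

# Sample data reconstructed from the question, with id as a regular column.
df = pd.DataFrame({'id': [101, 102, 103],
                   'a': [0, 2, 0],
                   'b': [3, 0, 1],
                   'c': [0, 5, 4]})

# Move id into the index so that only a, b and c are stacked.
stacked = df.set_index('id').stack()

# Name both index levels, then turn them into columns;
# name='num' labels the column holding the stacked values.
stacked_df = stacked.rename_axis(['id', 'letter']).reset_index(name='num')

print(stacked_df)
```

Here id ends up as an ordinary column again rather than as the index, which matches the layout asked for in the question.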

You can use ravel to flatten the values into a single column. Note that the result shown here contains only letter and num; it does not carry the id column.

df = pd.DataFrame({'id': [101, 102, 103], 
                   'a': [0, 2, 0], 
                   'b': [3, 0, 1], 
                   'c': [0, 5, 4]})[['id', 'a', 'b', 'c']]

>>> pd.DataFrame({'letter': df.columns[1:].tolist() * len(df), 
                  'num': df.iloc[:, 1:].values.ravel()})
  letter  num
0      a    0
1      b    3
2      c    0
3      a    2
4      b    0
5      c    5
6      a    0
7      b    1
8      c    4
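
To keep the id alongside each value, one option is to repeat every id once per value column. This is only a sketch: it assumes id is the first column and every other column should be flattened, and the names value_cols and result are illustrative.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame({'id': [101, 102, 103],
                   'a': [0, 2, 0],
                   'b': [3, 0, 1],
                   'c': [0, 5, 4]})

# Assumption: every column after the first one is a value column.
value_cols = df.columns[1:]

result = pd.DataFrame({
    # Each id is repeated once per value column: 101, 101, 101, 102, ...
    'id': np.repeat(df['id'].to_numpy(), len(value_cols)),
    # The column names cycle once per row: a, b, c, a, b, c, ...
    'letter': np.tile(value_cols.to_numpy(), len(df)),
    # Row-major flattening matches the id/letter order built above.
    'num': df[value_cols].to_numpy().ravel(),
})

print(result)
```

This works because ravel flattens row by row by default, so the values line up with the repeated ids and the cycled column names.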
