Generate new dataframe using keys of existing data frame as column names

I have a data frame with outer keys generated by the pandas concat function. It looks like this:

               ID    ratio    log_q
L-D  0      A5A614  2.51803  2.09644
     1      P00370  3.76811  5.92205
     2      P00393  1.74254  3.74875
     3    P00452-2  3.37144  6.13225
     4      P00547  3.06521  5.55512
     5      P00561  3.02943  5.58718
                ID    ratio    log_q
M-D  0      A5A614  2.51803  2.09644
     1      P00370  3.76811  5.92205
     2      P00393  1.74254  3.74875
     3    P00452-2  3.37144  6.13225
     4      P00547  3.06521  5.55512
     5      P00561  3.02943  5.58718
                ID    ratio    log_q
M3-D  0      A5A614  2.51803  2.09644
     1      P00370  3.76811  5.92205
     2      P00393  1.74254  3.74875
     3    P00452-2  3.37144  6.13225
     4      P00547  3.06521  5.55512
     5      P00561  3.02943  5.58718

I would like to use concat again to generate a new dataframe, which takes the ratio column for all keys ('L-D', 'M-D', 'M3-D') and uses these keys as names for the new columns. In addition, the new dataframe should be aligned for matching 'ID's in the following way:

          L-D    M-D      M3-D
A5A614    2.51803  1.13223  2.64402
P00393    3.76811  1.97461  3.34965
P00547    1.74254  2.70024   2.3655
...

When I use

pd.concat([df.ix['L-D']['ratio'], df.ix['M-D']['ratio'], df.ix['M3-D']['ratio']], 
axis=1, levels=("L-D","M-D","M3-D"))

or

pd.concat([df.ix['L-D']['ratio'], df.ix['M-D']['ratio'], df.ix['M3-D']['ratio']], 
axis=1, names=("L-D","M-D","M3-D"))

I can create a data frame but the result looks like this:

       ratio    ratio    ratio
0    2.51803  1.13223  2.64402
1    3.76811  1.97461  3.34965
2    1.74254  2.70024   2.3655

Apparently the names/levels arguments are not used as column labels, and the result keeps the numerical index instead of aligning on 'ID'.

I think you need to pass the keys parameter to concat, not levels:

# drop the numeric inner index level and append column ID to the index:
df = df.reset_index(level=1, drop=True).set_index('ID', append=True)

# .loc replaces the removed .ix indexer; keys supplies the new column names
print(pd.concat([df.loc['L-D']['ratio'], df.loc['M-D']['ratio'], df.loc['M3-D']['ratio']],
                axis=1,
                keys=["L-D","M-D","M3-D"]))

              L-D      M-D     M3-D
ID                                 
A5A614    2.51803  2.51803  2.51803
P00370    3.76811  3.76811  3.76811
P00393    1.74254  1.74254  1.74254
P00452-2  3.37144  3.37144  3.37144
P00547    3.06521  3.06521  3.06521
P00561    3.02943  3.02943  3.02943
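
For anyone who wants to try the keys approach end to end, here is a minimal sketch. It assumes three small input frames (the names ld, md, m3d, by_id and wide are made up for illustration), filled with ratio values copied from the question. The second frame is shuffled on purpose to show that rows are aligned on ID and not on position.

```python
import pandas as pd

# Sample frames; the ratio values are copied from the question, and the
# row order of the second frame is shuffled on purpose to show alignment.
ld = pd.DataFrame({"ID": ["A5A614", "P00370", "P00393"],
                   "ratio": [2.51803, 3.76811, 1.74254]})
md = pd.DataFrame({"ID": ["P00393", "A5A614", "P00370"],
                   "ratio": [2.70024, 1.13223, 1.97461]})
m3d = pd.DataFrame({"ID": ["A5A614", "P00370", "P00393"],
                    "ratio": [2.64402, 3.34965, 2.3655]})

labels = ["L-D", "M-D", "M3-D"]

# Reproduce the question's layout: outer keys plus a numeric inner level.
df = pd.concat([ld, md, m3d], keys=labels)

# Swap the numeric inner level for ID so that concat can align rows on ID.
by_id = df.reset_index(level=1, drop=True).set_index("ID", append=True)

# One ratio Series per key, glued side by side; keys become the column names.
ratio = by_id["ratio"]
wide = pd.concat([ratio.loc[k] for k in labels], axis=1, keys=labels)
print(wide)
```

The list comprehension only avoids repeating the three lookups by hand; the concat call itself is the same as above.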

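But I think pivot is the better choice. Older pandas versions accepted arrays such as df.index.get_level_values(0) in pd.pivot; current versions take column names, so move the outer level into a column first:

# run on the original frame, where ID is still a column; the unnamed outer level becomes 'level_0'
wide = df.reset_index(level=0).pivot(index='ID', columns='level_0', values='ratio')
print(wide.rename_axis(None, axis=1))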
              L-D      M-D     M3-D
ID                                 
A5A614    2.51803  2.51803  2.51803
P00370    3.76811  3.76811  3.76811
P00393    1.74254  1.74254  1.74254
P00452-2  3.37144  3.37144  3.37144
P00547    3.06521  3.06521  3.06521
P00561    3.02943  3.02943  3.02943
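
As a rough sketch of the same reshape on a recent pandas, the outer level can be named when the frames are concatenated, which avoids relying on an auto-generated column name. The level names "key" and "row" and the variable names are assumptions for illustration, and the sample values are ratio values taken from the question. The unstack variant at the end is an alternative technique, not part of the answer above.

```python
import pandas as pd

# Hypothetical sample frames using ratio values from the question.
frames = {
    "L-D": pd.DataFrame({"ID": ["A5A614", "P00370"], "ratio": [2.51803, 3.76811]}),
    "M-D": pd.DataFrame({"ID": ["A5A614", "P00370"], "ratio": [1.13223, 1.97461]}),
    "M3-D": pd.DataFrame({"ID": ["A5A614", "P00370"], "ratio": [2.64402, 3.34965]}),
}

# Dict keys become the outer index level; names labels both index levels.
df = pd.concat(frames, names=["key", "row"])

# Move the outer level into a regular column, then pivot on column names.
long_form = df.reset_index(level="key")
wide_pivot = long_form.pivot(index="ID", columns="key", values="ratio")
print(wide_pivot)

# Alternative without pivot: index by (key, ID), then unstack the key level.
wide_unstack = (df.set_index("ID", append=True)["ratio"]
                  .droplevel("row")
                  .unstack(level="key"))
print(wide_unstack)
```

Here names in concat labels the index levels, which is what that argument is for; the column labels still come from the keys.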
