Reshaping and rearranging a pandas table

I have the following dataframe (pandas version 0.13.1):

>>> import pandas as pd
>>> DF = pd.DataFrame({'Group':['G1','G1','G2','G2'],'Start':['10','10','12','13'],'End':['13','13','14','15'],'Sample':['S1','S2','S3','S3'],'Status':['yes','yes','no','yes'],'pValue':[0.13,0.12,0.96,0.76],'pValueString':['13/100','12/100','96/100','76/100'],'desc':['aaaaaa','bbbbbb','aaaaaa','cccccc']})
>>> DF
  End Group Sample Start Status  pValue pValueString desc
0  13    G1     S1    10    yes    0.13       13/100 aaaaaa
1  13    G1     S2    10    yes    0.12       12/100 bbbbbb
2  14    G2     S3    12     no    0.96       96/100 aaaaaa
3  15    G2     S3    13    yes    0.76       76/100 cccccc

[4 rows x 8 columns]

With the dataframe above:

  1. I would like to group by 'Group'.
  2. Then group by each Start-End pair.
  3. Pivot the Sample values for each group, aggregating by max(pValue).
  4. Take the Status and desc of the sample with the highest pValue, and show its pValueString in place of the pValue.
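Step 4 can be sketched on its own. This is a minimal sketch, not the asker's code, and it assumes a pandas version recent enough to provide `GroupBy.idxmax` and `to_numpy`. The names `keys`, `best_idx` and `best` are made up for illustration.

```python
import pandas as pd

DF = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2', 'G2'],
    'Start': ['10', '10', '12', '13'],
    'End': ['13', '13', '14', '15'],
    'Sample': ['S1', 'S2', 'S3', 'S3'],
    'Status': ['yes', 'yes', 'no', 'yes'],
    'pValue': [0.13, 0.12, 0.96, 0.76],
    'pValueString': ['13/100', '12/100', '96/100', '76/100'],
    'desc': ['aaaaaa', 'bbbbbb', 'aaaaaa', 'cccccc'],
})

# Hypothetical helper name: the columns that define one output row.
keys = ['Group', 'Start', 'End']

# Row label of the largest pValue within each (Group, Start, End) group.
best_idx = DF.groupby(keys)['pValue'].idxmax()

# Status and desc taken from exactly those rows, indexed by the group keys.
best = DF.loc[best_idx.to_numpy(), keys + ['Status', 'desc']].set_index(keys)
print(best)
```

Selecting whole rows by label keeps Status and desc tied to the same sample, which aggregating each column separately would not guarantee.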

Ultimately I need to get this into the following format:

Group Start End Sample           Status  desc
                    S1   S2
G1    10    13    13/100 12/100  yes     aaaaaa
                    S3
G2    12    14    96/100         no      aaaaaa
      13    15    76/100         yes     cccccc

I have tried to use pivot_table and groupby, but to no avail. Any help would be much appreciated.

I have:

grouped = DF.groupby('Group')

# rows/cols are the pandas 0.13 argument names (later renamed index/columns)
for g, v in grouped:
    pd.pivot_table(data=v, values=['pValue', 'pValueString'], rows=['Group', 'Start', 'End'], cols=['Sample'])['pValueString']

How do I get the corresponding desc and Status?

For a pandas pivot table, you pass the rows you want as index and the columns you want as columns:

pvt = DF.pivot_table(index = ['Group','Start','End','Status'], columns = ['Sample'])
pvt
Out[209]: 
                       pValue            
Sample                     S1    S2    S3
Group Start End Status                   
G1    10    13  yes      0.13  0.12   NaN
G2    12    14  no        NaN   NaN  0.96
      13    15  yes       NaN   NaN  0.76
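One possible way to get closer to the requested layout is to pivot pValueString instead of pValue and then attach Status and desc from the highest-pValue row of each group. This is a sketch under assumptions, not the answer's own continuation: it assumes a recent pandas (with `sort_values` and string aggregation via `aggfunc='first'`), and `keys`, `ordered`, `wide`, `best` and `result` are illustrative names.

```python
import pandas as pd

DF = pd.DataFrame({
    'Group': ['G1', 'G1', 'G2', 'G2'],
    'Start': ['10', '10', '12', '13'],
    'End': ['13', '13', '14', '15'],
    'Sample': ['S1', 'S2', 'S3', 'S3'],
    'Status': ['yes', 'yes', 'no', 'yes'],
    'pValue': [0.13, 0.12, 0.96, 0.76],
    'pValueString': ['13/100', '12/100', '96/100', '76/100'],
    'desc': ['aaaaaa', 'bbbbbb', 'aaaaaa', 'cccccc'],
})

keys = ['Group', 'Start', 'End']

# Highest pValue first, so "first" below means "row with the max pValue".
ordered = DF.sort_values('pValue', ascending=False)

# One column per Sample holding pValueString; 'first' leaves the strings untouched.
wide = ordered.pivot_table(index=keys, columns='Sample',
                           values='pValueString', aggfunc='first')

# Status and desc from the top-pValue row of each (Group, Start, End) group.
best = ordered.drop_duplicates(subset=keys).set_index(keys)[['Status', 'desc']]

# Left join on the shared (Group, Start, End) index.
result = wide.join(best)
print(result)
```

The result has one flat header (S1, S2, S3, Status, desc) with NaN where a group has no such sample; the per-group sample sub-headers shown in the question are a display choice that this sketch does not reproduce.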

Then for your
