
How to combine multiple rows into a single row in pandas dataframe grouping using a subset of the row string

I would like to transform the format of this pandas dataframe from

  individual_id  Rec  Sig
0    C11 part 1  0.2  0.8
1    C11 part 2  0.1  0.9
2    C12 part 1  0.3  0.7
3    C12 part 2  0.5  0.5
4    C13 part 1  0.1  0.9
5    C13 part 2  0.7  0.3

to this format

  individual_id  Rec 1  Rec 2  Sig 1  Sig 2
0           C11    0.2    0.1    0.8    0.9
1           C12    0.3    0.5    0.7    0.5
2           C13    0.1    0.7    0.9    0.3

Here Rec 1 and Rec 2 hold the values for part 1 and part 2 of each individual_id, now in a single row. Note that some individual_id values may have 3 parts. I tried using df.groupby, but that gets difficult because the part number is embedded in the individual_id string. I hope someone can help. Thank you in advance!

It would be helpful if you added sample data with more than two parts.

For your current case, you can extract the required values from individual_id before reshaping with pivot:

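# Split each ID into its part number (last character) and its base ID
# (first three characters), then pivot so every part gets its own columns.
# index/columns are passed by keyword; recent pandas versions require this.
reshape = df.assign(
    num=df.individual_id.str[-1], individual_id=df.individual_id.str[:3]
).pivot(index="individual_id", columns="num")

# Flatten the (column, num) MultiIndex into single names such as "Rec_1";
# use " ".join instead to get "Rec 1"
reshape.columns = reshape.columns.map("_".join)
reshape.reset_index()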


    individual_id   Rec_1   Rec_2   Sig_1   Sig_2
0           C11     0.2     0.1     0.8     0.9
1           C12     0.3     0.5     0.7     0.5
2           C13     0.1     0.7     0.9     0.3
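
The slicing above relies on IDs that are exactly three characters long and on single-digit part numbers. As a rough sketch of how the same idea might cope with a varying number of parts, the ID and part number can be pulled out with a regular expression instead. The third part for C13 below is made-up sample data, and the pattern assumes every ID looks like "<id> part <number>"; neither comes from the question.

```python
import pandas as pd

# Data from the question, plus one hypothetical third part for C13.
df = pd.DataFrame({
    "individual_id": ["C11 part 1", "C11 part 2", "C12 part 1", "C12 part 2",
                      "C13 part 1", "C13 part 2", "C13 part 3"],
    "Rec": [0.2, 0.1, 0.3, 0.5, 0.1, 0.7, 0.4],
    "Sig": [0.8, 0.9, 0.7, 0.5, 0.9, 0.3, 0.6],
})

# Named groups become the column names of the extracted frame.
# Assumes every ID matches "<id> part <number>"; a non-matching row
# would yield NaN and make the integer conversion below fail.
parts = df["individual_id"].str.extract(
    r"^(?P<individual_id>.+?)\s+part\s+(?P<num>\d+)$"
)
parts["num"] = parts["num"].astype(int)  # numeric sort, so part 10 follows part 9

wide = (
    df.drop(columns="individual_id")
      .join(parts)  # aligns on the shared row index
      .pivot(index="individual_id", columns="num")
)

# Flatten the (column, num) MultiIndex into names like "Rec 1".
wide.columns = [f"{name} {num}" for name, num in wide.columns]
wide = wide.reset_index()
print(wide)
```

Individuals with fewer parts than the maximum end up with NaN in the columns for the parts they lack.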
