How to combine multiple rows into a single row in pandas dataframe grouping using a subset of the row string

Question

I would like to transform the format of this pandas dataframe from

  individual_id  Rec  Sig
0    C11 part 1  0.2  0.8
1    C11 part 2  0.1  0.9
2    C12 part 1  0.3  0.7
3    C12 part 2  0.5  0.5
4    C13 part 1  0.1  0.9
5    C13 part 2  0.7  0.3

to this format

  individual_id  Rec 1  Rec 2  Sig 1  Sig 2
0           C11    0.2    0.1    0.8    0.9
1           C12    0.3    0.5    0.7    0.5
2           C13    0.1    0.7    0.9    0.3

Here Rec 1 and Rec 2 hold the Rec values of part 1 and part 2 of each individual_id, now on a single row (and likewise for Sig 1 and Sig 2). Some individual_id values may have 3 parts. I tried df.groupby, but I could not get it to work because the part number is embedded in the individual_id string. Any help is appreciated. Thank you in advance!

Answer 1

It would help if you added sample data with more than two parts.

For your current case, you can split individual_id into the base ID and the part number, then reshape with pivot:

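# Assumes every id looks like "C11 part 1": a 3-character base id
# and a single-digit part number at the end of the string.
reshape = df.assign(
    num=df.individual_id.str[-1], individual_id=df.individual_id.str[:3]
).pivot(index="individual_id", columns="num")  # keyword arguments are required in pandas 2.x

# Flatten the (column, num) MultiIndex; use " ".join to get "Rec 1" instead of "Rec_1"
reshape.columns = reshape.columns.map("_".join)
reshape.reset_index()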


    individual_id   Rec_1   Rec_2   Sig_1   Sig_2
0           C11     0.2     0.1     0.8     0.9
1           C12     0.3     0.5     0.7     0.5
2           C13     0.1     0.7     0.9     0.3
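
If some IDs have three parts, or the base ID is not always three characters long, the fixed-position slicing above may be too brittle. The following is a minimal sketch of one alternative, assuming every individual_id follows the pattern "<id> part <number>". The extra "C13 part 3" row and its values are made up here purely for illustration, and the names parts and wide are not from the original post.

```python
import pandas as pd

# Sample data from the question, plus one hypothetical third part for C13
# (the 0.4 / 0.6 values are invented for this illustration).
df = pd.DataFrame({
    "individual_id": ["C11 part 1", "C11 part 2", "C12 part 1", "C12 part 2",
                      "C13 part 1", "C13 part 2", "C13 part 3"],
    "Rec": [0.2, 0.1, 0.3, 0.5, 0.1, 0.7, 0.4],
    "Sig": [0.8, 0.9, 0.7, 0.5, 0.9, 0.3, 0.6],
})

# Split "C11 part 1" into the base id ("C11") and the part number ("1").
# Named groups become the column names of the extracted frame.
parts = df["individual_id"].str.extract(
    r"^(?P<individual_id>\S+)\s+part\s+(?P<num>\d+)$"
)

# Replace the original id column with the extracted pieces, then pivot so
# that each part number becomes its own column for Rec and Sig.
wide = (
    df.drop(columns="individual_id")
      .join(parts)
      .pivot(index="individual_id", columns="num", values=["Rec", "Sig"])
)

# Flatten the (value, num) MultiIndex into names such as "Rec 1".
wide.columns = [f"{name} {num}" for name, num in wide.columns]
wide = wide.reset_index()
print(wide)
```

IDs that lack a given part get NaN in the corresponding columns. Note that the part numbers are kept as strings, so with ten or more parts the columns would sort as "1", "10", "2", and so on unless num is converted to an integer first.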

solution1
2 2021-03-08 05:13:12
