I have the following data frame structure:
   SC0    Shape   S1   S2   S3   C1   C2   C3   D1   D2   D3
2    1   Circle  NaN  NaN  NaN    1    1    1  NaN  NaN  NaN
3   13   Square    2    1    2  NaN  NaN  NaN  NaN  NaN  NaN
4   13  Diamond  NaN  NaN  NaN  NaN  NaN  NaN    2    1    2
5   16  Diamond  NaN  NaN  NaN  NaN  NaN  NaN    2    2    2
6   16   Square    2    2    2  NaN  NaN  NaN  NaN  NaN  NaN
How can I combine S1, S2, S3 with C1, C2, C3 and D1, D2, D3 so that S1, C1 and D1 are in the same column, S2, C2 and D2 in the next, and so on (all the way to S16, C16 and D16)?
When Shape = Circle the populated columns are C1-C16, when Shape = Square it's S1-S16, and for Shape = Diamond it's D1-D16.
I don't mind creating a new set of columns or copying two of the sets into an existing one, as long as I have all the #1 scores in the same column, all the #2 scores in the same column, etc.
Thank you!
IIUC, you have an equal number of columns for each category, and you want to compress them into numbered columns that are shape-agnostic. If so, this will work:
import pandas as pd

dfs = []
for var in ['S', 'D', 'C']:
    # select this prefix's score columns with a regex (e.g. S1, S2, ...)
    res = df.iloc[:, 2:].filter(regex=rf'^{var}\d{{1,2}}$').dropna()
    # rename columns to plain numbers so the pieces can be concatenated
    res.columns = range(res.shape[1])
    dfs.append(res)
# stack the pieces, then join them back to SC0 and Shape on the index
df = pd.concat([df.iloc[:, :2], pd.concat(dfs)], axis=1)
print(df)
Output:
   SC0    Shape    0    1    2
2    1   Circle  1.0  1.0  1.0
3   13   Square  2.0  1.0  2.0
4   13  Diamond  2.0  1.0  2.0
5   16  Diamond  2.0  2.0  2.0
6   16   Square  2.0  2.0  2.0
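To try this without the original data, here is a minimal, self-contained sketch. It assumes the frame looks like the sample in the question (the names `sample`, `pieces`, `stacked` and `combined` are mine, not from the post). It also shows why the approach works: each row has values for only one prefix, so after `dropna()` the three pieces have disjoint indexes, and the final `concat` along `axis=1` lines them back up by index label.

```python
import numpy as np
import pandas as pd

nan = np.nan
# Sample frame rebuilt from the question (index labels 2..6 as shown there).
sample = pd.DataFrame(
    {
        "SC0": [1, 13, 13, 16, 16],
        "Shape": ["Circle", "Square", "Diamond", "Diamond", "Square"],
        "S1": [nan, 2, nan, nan, 2],
        "S2": [nan, 1, nan, nan, 2],
        "S3": [nan, 2, nan, nan, 2],
        "C1": [1, nan, nan, nan, nan],
        "C2": [1, nan, nan, nan, nan],
        "C3": [1, nan, nan, nan, nan],
        "D1": [nan, nan, 2, 2, nan],
        "D2": [nan, nan, 1, 2, nan],
        "D3": [nan, nan, 2, 2, nan],
    },
    index=[2, 3, 4, 5, 6],
)

pieces = []
for prefix in ["S", "D", "C"]:
    # Keep only the rows where this prefix's columns are fully populated.
    part = sample.filter(regex=rf"^{prefix}\d+$").dropna()
    part.columns = range(part.shape[1])
    pieces.append(part)

# Row-wise stack; the index comes out in the order 3, 6, 4, 5, 2.
stacked = pd.concat(pieces)
# Column-wise join aligns on index labels, restoring the original row order.
combined = pd.concat([sample[["SC0", "Shape"]], stacked], axis=1)
print(combined)
```

One caveat: `dropna()` with its default `how="any"` drops a row if even one of its scores is missing, so with partially filled rows `dropna(how="all")` may be the safer choice.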
Try:
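n = 3  # number of score columns per shape (16 in the full data)
cols_prefixes = ["C", "S", "D"]
for i in range(n):
    cols = [f"{l}{i+1}" for l in cols_prefixes]
    # back-fill across the row so the first column holds the one non-null value
    df[f"res{i+1}"] = df[cols].bfill(axis=1).iloc[:, 0]
    df = df.drop(columns=cols)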
Outputs:
   SC0    Shape  res1  res2  res3
2    1   Circle   1.0   1.0   1.0
3   13   Square   2.0   1.0   2.0
4   13  Diamond   2.0   1.0   2.0
5   16  Diamond   2.0   2.0   2.0
6   16   Square   2.0   2.0   2.0
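If you would rather not hard-code `n`, the same `bfill` idea can be wrapped in a small function that discovers the score numbers from the column names. This is only a sketch: `coalesce_scores` is a hypothetical helper name, and it assumes every score column is a string of the form prefix plus number (e.g. `S1`, `C16`) with at most one populated prefix per row.

```python
import re

import numpy as np
import pandas as pd


def coalesce_scores(df, prefixes=("C", "S", "D")):
    """Hypothetical helper: merge <prefix><n> columns into a single res<n> column."""
    alternation = "|".join(re.escape(p) for p in prefixes)
    pattern = re.compile(rf"^(?:{alternation})(\d+)$")

    # Collect every score number that appears in the column names.
    numbers = set()
    for col in df.columns:
        match = pattern.match(str(col))
        if match:
            numbers.add(int(match.group(1)))

    out = df.copy()
    for n in sorted(numbers):
        cols = [f"{p}{n}" for p in prefixes if f"{p}{n}" in out.columns]
        # Back-fill across the row, then take the first column:
        # that is the first non-null value among the prefixes.
        out[f"res{n}"] = out[cols].bfill(axis=1).iloc[:, 0]
        out = out.drop(columns=cols)
    return out


# Small made-up frame just to exercise the helper.
demo = pd.DataFrame(
    {
        "Shape": ["Circle", "Square", "Diamond"],
        "S1": [np.nan, 2, np.nan],
        "S2": [np.nan, 1, np.nan],
        "C1": [1, np.nan, np.nan],
        "C2": [1, np.nan, np.nan],
        "D1": [np.nan, np.nan, 2],
        "D2": [np.nan, np.nan, 1],
    }
)
result = coalesce_scores(demo)
print(result)
```

Because it works on a copy, the original frame is left untouched, and any non-score columns such as `SC0` and `Shape` pass through unchanged.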