OK, so I wrote a function by hand to do this, but I am wondering whether there is a built-in Python/pandas/NumPy function for it. Essentially, what I want is:
data_col = data.loc[data['col3'] == 'a']
data_final = data_col['col2']
But I want it for all of the values of col3. So it goes from:
Note how the values from col1 are not present. If no existing function comes to mind, don't worry about writing one; I already finished mine and it suits my needs. I'm just curious: I haven't finished all of my courses yet, so I'm not familiar with all of the functions.
Code
import pandas as pd

# Collect col2 into one list per col3 value, then explode the lists into rows
df = df.pivot_table(
    'col2', columns='col3', aggfunc=(lambda x: x.to_list())
).apply(pd.Series.explode).rename_axis(None, axis=1).reset_index(drop=True)
Output
    a  b  c  d   e
0   6  7  8  9  10
1  10  9  8  7   6
Explanation
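The chain above can be read one step at a time. This is a minimal sketch, assuming a hypothetical sample frame chosen to reproduce the output shown (the original data is not included here); the intermediate names lists and exploded are only for illustration.

```python
import pandas as pd

# Hypothetical sample data (an assumption), chosen to match the output above.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "col2": [6, 7, 8, 9, 10, 6, 7, 8, 9, 10],
    "col3": list("abcdeedcba"),
})

# Step 1: a single row with one column per col3 value; each cell is a list
# of the col2 values that belong to that col3 value.
lists = df.pivot_table("col2", columns="col3", aggfunc=lambda x: x.to_list())

# Step 2: explode every column so each list element gets its own row.
# This relies on every col3 value occurring the same number of times.
exploded = lists.apply(pd.Series.explode)

# Step 3: drop the "col3" label on the columns and the repeated "col2" index.
result = exploded.rename_axis(None, axis=1).reset_index(drop=True)
print(result)
```

Because the cells pass through Python lists, the resulting columns are likely to have object dtype; calling result.astype(int) afterwards should restore integers if that matters.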
Try:
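# Number the rows within each col3 group (0, 1, ...) in a helper column
for v in df["col3"].unique():
    m = df["col3"] == v
    df.loc[m, "tmp"] = range(m.sum())

# Use the helper column as the row index and spread col3 into columns
df = df.pivot(index="tmp", columns="col3", values="col2").rename_axis("")
print(df)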
Prints:
col3   a  b  c  d   e
0.0    6  7  8  9  10
1.0   10  9  8  7   6
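The per-group counter can probably also be built without an explicit loop, using groupby(...).cumcount(). This is a swapped-in variation rather than what the answer above does, sketched on hypothetical sample data chosen to match the printed output; the name wide is only for illustration.

```python
import pandas as pd

# Hypothetical sample data (an assumption), chosen to match the output above.
df = pd.DataFrame({
    "col1": [1, 2, 3, 4, 5, 1, 2, 3, 4, 5],
    "col2": [6, 7, 8, 9, 10, 6, 7, 8, 9, 10],
    "col3": list("abcdeedcba"),
})

# Number each row within its col3 group: 0 for the first "a", 1 for the next.
df["tmp"] = df.groupby("col3").cumcount()

# Spread col3 into columns, using the counter as the row index.
wide = df.pivot(index="tmp", columns="col3", values="col2")

# Remove the leftover "tmp" and "col3" axis labels.
wide = wide.rename_axis(None).rename_axis(None, axis=1)
print(wide)
```

Since cumcount() returns integers, the row index here comes out as 0, 1 instead of 0.0, 1.0.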