I am trying to expand a two-column dataset in Python.
Basket        | Times
______________|_______
Bread         | 5
Orange, Bread | 3
For each basket, I would like to produce as many rows as the number in the Times column, with a running number appended. So for the example above, the result would be:
Newcolumn
_______
Bread1
Bread2
Bread3
Bread4
Bread5
Orange, Bread1
Orange, Bread2
Orange, Bread3
Use np.repeat to repeat each value the required number of times. Then use groupby and cumcount to add the required suffixes:
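import numpy as np

# Repeat each Basket value "Times" times; every copy keeps its original index label
srs = np.repeat(df["Basket"], df["Times"])
# Number the copies within each original row (starting at 1) and append the number as text
output = (srs + srs.groupby(level=0).cumcount().add(1).astype(str)).reset_index(drop=True)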
>>> output
0 Bread1
1 Bread2
2 Bread3
3 Bread4
4 Bread5
5 Orange, Bread1
6 Orange, Bread2
7 Orange, Bread3
dtype: object
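For anyone who wants to run this end to end, here is a minimal self-contained sketch. It assumes the input looks like the table in the question (the `df` built here is reconstructed from that table, not the asker's real data), and it swaps in `Series.repeat` for `np.repeat`, which should behave the same way here.

```python
import pandas as pd

# Sample data reconstructed from the question's table (an assumption).
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# Series.repeat keeps the original index label on every copy,
# so the repeated rows of "Bread" all carry label 0, and so on.
srs = df["Basket"].repeat(df["Times"])

# Grouping by that index label numbers the copies within each original row.
counter = srs.groupby(level=0).cumcount().add(1).astype(str)

# Append the counter as a suffix and discard the duplicated index labels.
output = (srs + counter).reset_index(drop=True)
print(output.tolist())
```

The key point is that the repeated index labels are what make `groupby(level=0)` restart the count for each basket.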
You can try apply on the rows to generate the desired list for each row, then explode the column:
df['Newcolumn'] = df.apply(lambda row: [f"{row['Basket']}_{i+1}" for i in range(row['Times'])], axis=1)
df = df.explode('Newcolumn', ignore_index=True)
print(df)
Basket Times Newcolumn
0 Bread 5 Bread_1
1 Bread 5 Bread_2
2 Bread 5 Bread_3
3 Bread 5 Bread_4
4 Bread 5 Bread_5
5 Orange, Bread 3 Orange, Bread_1
6 Orange, Bread 3 Orange, Bread_2
7 Orange, Bread 3 Orange, Bread_3
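Note that the f-string above puts an underscore before the number, while the question asks for Bread1 rather than Bread_1. If only the new column is needed, a plain list comprehension is another option. This is a rough sketch of that different approach, not the method used above; the `df` is rebuilt from the question's table and the names `new_values` and `result` are just illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question's table (an assumption).
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# For each row, emit the basket name followed by 1..Times, with no separator.
new_values = [
    f"{basket}{i}"
    for basket, times in zip(df["Basket"], df["Times"])
    for i in range(1, times + 1)
]

# Hypothetical result frame holding only the requested column.
result = pd.DataFrame({"Newcolumn": new_values})
print(result)
```

This avoids apply with axis=1, which tends to be slow on large frames, at the cost of dropping the original Basket and Times columns.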