Create dataframe with repeated rows based on column value
I am trying to expand a dataset with two columns in Python.
Basket        | Times
______________|_______
Bread         | 5
Orange, Bread | 3
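I want each basket to produce as many rows as the number in its Times column, with a running counter appended. So for the example above: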
Newcolumn
_______
Bread1
Bread2
Bread3
Bread4
Bread5
Orange, Bread1
Orange, Bread2
Orange, Bread3
Use np.repeat to repeat each value the required number of times. Then use groupby and cumcount to add the required suffix:
import numpy as np
srs = np.repeat(df["Basket"],df["Times"])
output = (srs+srs.groupby(level=0).cumcount().add(1).astype(str)).reset_index(drop=True)
>>> output
0 Bread1
1 Bread2
2 Bread3
3 Bread4
4 Bread5
5 Orange, Bread1
6 Orange, Bread2
7 Orange, Bread3
dtype: object
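If the original columns should be kept alongside the new one, the same repeat-and-count idea can be applied to the whole frame. The following is a minimal sketch, assuming the sample data from the question; the names expanded and counter are only illustrative:

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# Repeat each row label Times times, then select those rows
expanded = df.loc[df.index.repeat(df["Times"])]

# Number the copies of each original row, starting at 1
counter = expanded.groupby(level=0).cumcount().add(1).astype(str)

# Append the counter to the basket name and renumber the rows
expanded = expanded.assign(Newcolumn=expanded["Basket"] + counter).reset_index(drop=True)
print(expanded)
```

Grouping on the index level rather than on Basket keeps the counters separate even if the same basket name appears in more than one input row.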
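You can try using apply on the rows to generate the required list for each row, then explode the column. Note that this version joins the counter with an underscore; drop the "_" from the f-string to match the requested output exactly.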
df['Newcolumn'] = df.apply(lambda row: [f"{row['Basket']}_{i+1}" for i in range(row['Times'])], axis=1)
df = df.explode('Newcolumn', ignore_index=True)
print(df)
          Basket  Times        Newcolumn
0          Bread      5          Bread_1
1          Bread      5          Bread_2
2          Bread      5          Bread_3
3          Bread      5          Bread_4
4          Bread      5          Bread_5
5  Orange, Bread      3  Orange, Bread_1
6  Orange, Bread      3  Orange, Bread_2
7  Orange, Bread      3  Orange, Bread_3
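The per-row lists can also be built with a plain list comprehension instead of a row-wise apply, which may be easier to read. This is only a sketch under the same sample-data assumption, written without the underscore so the labels match the requested output; result is an illustrative name:

```python
import pandas as pd

# Sample data reconstructed from the table in the question
df = pd.DataFrame({"Basket": ["Bread", "Orange, Bread"], "Times": [5, 3]})

# One list of labels per row: Bread1..Bread5, "Orange, Bread1".."Orange, Bread3"
labels = pd.Series(
    [
        [f"{basket}{i}" for i in range(1, times + 1)]
        for basket, times in zip(df["Basket"], df["Times"])
    ],
    index=df.index,
)

# explode turns each list element into its own row
result = df.assign(Newcolumn=labels).explode("Newcolumn", ignore_index=True)
print(result)
```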