Pandas: replicate every row that has a quantity greater than one
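I have a pandas DataFrame.
I want to replicate every row according to its value in the "quantity" column, decrementing that value by 1 for each copy until it reaches one.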
    Item            Weight  Bags    Must  quantity  must quantity  bags column length  assigned bag
 0  planes bag        8.50  planes  v            1              1                   6  None
 1  Full Bandolera    3.76  planes  v            3              2                   6  None
 2  tail              0.30  planes  <NA>         3              2                   6  None
 3  central wing      1.08  planes  <NA>         3              2                   6  None
 4  engine            0.44  planes  <NA>         3              2                   6  None
 5  height steer      0.12  planes  <NA>         3              2                   6  None
 6  dihedral          0.40  planes  <NA>         3              2                   6  None
 7  pods bag          8.72  pods    v            1              1                   4  None
 8  Pod               1.74  pods    v            3              2                   4  None
 9  optic             0.86  pods    v            2              2                   4  None
10  thermal           1.20  pods    v            3              2                   4  None
So, for example, the quantity of the Full Bandolera row would become 1, and there would be two additional copies of that row.
    Item            Weight  Bags    Must  quantity  must quantity  bags column length  assigned bag
 0  planes bag        8.50  planes  v            1              1                   6  None
 1  Full Bandolera    3.76  planes  v            1              2                   6  None
 2  Full Bandolera    3.76  planes  v            1              2                   6  None
 3  Full Bandolera    3.76  planes  v            1              2                   6  None
 4  tail              0.30  planes  <NA>         3              2                   6  None
 5  central wing      1.08  planes  <NA>         3              2                   6  None
 6  engine            0.44  planes  <NA>         3              2                   6  None
 7  height steer      0.12  planes  <NA>         3              2                   6  None
 8  dihedral          0.40  planes  <NA>         3              2                   6  None
 9  pods bag          8.72  pods    v            1              1                   4  None
10  Pod               1.74  pods    v            3              2                   4  None
11  optic             0.86  pods    v            2              2                   4  None
12  thermal           1.20  pods    v            3              2                   4  None
So far I have this code:
def multiply_row(cls):
    print(cls.df.dtypes)
    for row in cls.df.iterrows():
        while row['quantity'] > 1:
            row_to_list = list(row)
            listed_row = row_to_list.copy()
            add_to_df = tuple(listed_row)
            cls.df.append(add_to_df)
            row['quantity'] = row['quantity'] - 1
    return cls.df
output:
while row['quantity'] > 1:
TypeError: tuple indices must be integers or slices, not str
Item                  string
Weight                float64
Bags                  string
Must                  object
quantity              int64
must quantity         int64
category              object
bags column length    int64
assigned bag          object
assigned_bag          object
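A note on the error: `DataFrame.iterrows()` yields `(index, Series)` pairs, so the loop variable `row` in the code above is a tuple, and indexing a tuple with the string `'quantity'` raises this TypeError. The sketch below shows the unpacking on a small frame; the two-row data is hypothetical and only stands in for the real DataFrame.

```python
import pandas as pd

# Hypothetical two-row frame, for illustration only.
df = pd.DataFrame({
    "Item": ["planes bag", "Full Bandolera"],
    "quantity": [1, 3],
})

# Each element produced by iterrows() is a tuple: (index label, row as a Series).
first = next(df.iterrows())

# Unpacking the tuple gives access to the row by column name.
quantities = []
for idx, row in df.iterrows():
    quantities.append(int(row["quantity"]))

print(type(first).__name__, quantities)
```

Unpacking only removes the TypeError. Assigning to `row` inside the loop is not a reliable way to change `df`, and `DataFrame.append` returned a new frame rather than modifying the original (it was removed in pandas 2.0), so the loop would still not produce the expanded frame.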
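I'm really unsure about the approach I wrote; I'm very new to pandas.
Update:
Using Quang Hoang's answer, I no longer get any error.
However, the DataFrame stays the same.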
def multiply_row():
    for idx, row in df.iterrows():
        while row['quantity'] > 1:
            (df.loc[df.index.repeat(df.quantity)]
               .assign(quantity=1))
    return df
It returns exactly the same DataFrame.
I think you want repeat:
(df.loc[df.index.repeat(df.quantity)]
.assign(quantity=1)
)
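A minimal, self-contained sketch of the same idea, using a hypothetical three-row frame with only the `Item` and `quantity` columns. Both `loc` and `assign` return a new DataFrame and leave `df` untouched, so the result has to be bound to a name; that is probably why the function in the update returned the original frame. No loop is needed. The function name `multiply_rows` and the `reset_index(drop=True)` call are additions here, the latter only to renumber the rows as in the desired output.

```python
import pandas as pd

# Hypothetical sample data: two columns taken from the question's frame.
df = pd.DataFrame({
    "Item": ["planes bag", "Full Bandolera", "optic"],
    "quantity": [1, 3, 2],
})


def multiply_rows(frame):
    """Return a new frame with each row repeated `quantity` times and quantity set to 1."""
    # Repeat each index label as many times as that row's quantity, then select by label.
    expanded = frame.loc[frame.index.repeat(frame["quantity"])]
    # assign() returns a copy with quantity overwritten; reset_index renumbers the rows.
    return expanded.assign(quantity=1).reset_index(drop=True)


result = multiply_rows(df)  # keep the returned frame; df itself is not modified
print(result)
```

This variant expands every row whose quantity is greater than one. If only some rows should be expanded, the repeat counts passed to `repeat` would need to be set to 1 for the others.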