I have a pandas DataFrame.
I want to replicate every row whose 'quantity' value is greater than one: a row with quantity n should become n rows, each with quantity 1.
    Item            Weight  Bags    Must  quantity  must quantity  bags column length  assigned bag
0   planes bag        8.50  planes  v            1              1                   6  None
1   Full Bandolera    3.76  planes  v            3              2                   6  None
2   tail              0.30  planes  <NA>         3              2                   6  None
3   central wing      1.08  planes  <NA>         3              2                   6  None
4   engine            0.44  planes  <NA>         3              2                   6  None
5   height steer      0.12  planes  <NA>         3              2                   6  None
6   dihedral          0.40  planes  <NA>         3              2                   6  None
7   pods bag          8.72  pods    v            1              1                   4  None
8   Pod               1.74  pods    v            3              2                   4  None
9   optic             0.86  pods    v            2              2                   4  None
10  thermal           1.20  pods    v            3              2                   4  None
So, for example, the Full Bandolera row (quantity 3) will have its quantity set to 1, and two duplicates of it will be added:
    Item            Weight  Bags    Must  quantity  must quantity  bags column length  assigned bag
0   planes bag        8.50  planes  v            1              1                   6  None
1   Full Bandolera    3.76  planes  v            1              2                   6  None
2   Full Bandolera    3.76  planes  v            1              2                   6  None
3   Full Bandolera    3.76  planes  v            1              2                   6  None
4   tail              0.30  planes  <NA>         3              2                   6  None
5   central wing      1.08  planes  <NA>         3              2                   6  None
6   engine            0.44  planes  <NA>         3              2                   6  None
7   height steer      0.12  planes  <NA>         3              2                   6  None
8   dihedral          0.40  planes  <NA>         3              2                   6  None
9   pods bag          8.72  pods    v            1              1                   4  None
10  Pod               1.74  pods    v            3              2                   4  None
11  optic             0.86  pods    v            2              2                   4  None
12  thermal           1.20  pods    v            3              2                   4  None
So far, I've got this code:
def multiply_row(cls):
    print(cls.df.dtypes)
    for row in cls.df.iterrows():
        while row['quantity'] > 1:
            row_to_list = list(row)
            listed_row = row_to_list.copy()
            add_to_df = tuple(listed_row)
            cls.df.append(add_to_df)
            row['quantity'] = row['quantity'] - 1
    return cls.df
Output:
    while row['quantity'] > 1:
TypeError: tuple indices must be integers or slices, not str
Item                   string
Weight                 float64
Bags                   string
Must                   object
quantity               int64
must quantity          int64
category               object
bags column length     int64
assigned bag           object
assigned_bag           object
I'm not at all sure about the method I wrote; I'm very new to pandas.
UPDATE:
Using Quang Hoang's answer, no errors are raised.
Yet the DataFrame remains the same.
def multiply_row():
    for idx, row in df.iterrows():
        while row['quantity'] > 1:
            (df.loc[df.index.repeat(df.quantity)]
             .assign(quantity=1))
            return df
This returns the exact same DataFrame.
I think repeat is what you need:
(df.loc[df.index.repeat(df.quantity)]
.assign(quantity=1)
)
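Note that this expression builds a new DataFrame and leaves df untouched, so the result has to be assigned to something; no iterrows loop is needed. Below is a minimal sketch of how that could look. The three-row sample and the function name multiply_rows are made up for illustration, and reset_index(drop=True) is an optional extra that renumbers the rows 0..n-1 as in the expected output.

```python
import pandas as pd

# Hypothetical sample with a few of the question's columns.
df = pd.DataFrame({
    "Item": ["planes bag", "Full Bandolera", "optic"],
    "Weight": [8.50, 3.76, 0.86],
    "quantity": [1, 3, 2],
})


def multiply_rows(frame):
    """Return a NEW frame in which each row appears `quantity` times."""
    # Repeat every index label as many times as that row's quantity,
    # then select those labels to materialise the duplicated rows.
    repeated = frame.loc[frame.index.repeat(frame["quantity"])]
    # Every copy now stands for a single item; renumber the rows.
    return repeated.assign(quantity=1).reset_index(drop=True)


# The result must be assigned back; the original frame is not modified.
df = multiply_rows(df)
print(df)
```

One thing to keep in mind with this approach: a row whose quantity is 0 is repeated zero times, so it is dropped from the result.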