I have a large dataframe and I want to divide values in the same row by each other if a certain condition is met, creating a new column for each condition.
I tried all kinds of loops, but I keep getting the error that the truth value of a Series is ambiguous. I think I'm close to the solution, but I can't quite figure out the quickest way.
import numpy as np
import pandas as pd
df = pd.DataFrame({'colA': np.random.randn(20), 'colB': np.random.randn(20), 'colC': np.random.randn(20)})
print(df)
x = 0
y = 0.5
for ix, r in df.iterrows():
    if (r['colA'] > x) & (r['colA'] < y):
        df.loc[ix, str(y)] = df.loc[ix, 'colA'] / df.loc[ix, 'colB']
        x += 0.5
        y += 0.5
This is as far as I have got. The problem is that x and y increase after each row for which the condition is met. I need the division to be carried out for ALL the rows where the condition is met, and only THEN should x and y increase.
You should not use iterrows if you want the division applied to all the rows meeting the condition at once. Select those rows with a boolean filter instead. Here is a fixed version of your initial code:
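x, y = 0, 0.5  # start from the first interval, as in the question
while x <= df['colA'].max():
    sub = df.loc[(df['colA'] > x) & (df['colA'] < y)]  # filter the dataframe on both conditions
    df.loc[sub.index, str(y)] = df['colA'] / df['colB']  # the assignment aligns on the index, so only the rows in sub are filled
    x += .5
    y += .5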
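For checking the idea on known numbers, the same interval loop can be sketched as a small self-contained function. This is only an illustration under stated assumptions: the function name `add_ratio_columns`, the `step` parameter, and the five-row `demo` frame are made up here and do not come from the original post. It also swaps the `.loc` assignment for `Series.where`, which leaves NaN in the rows outside each interval.

```python
import pandas as pd

def add_ratio_columns(df, step=0.5):
    """Return a copy of df with one colA/colB column per interval (lo, hi) of colA.

    Hypothetical helper for illustration; intervals start at 0 as in the question.
    """
    out = df.copy()
    ratio = out["colA"] / out["colB"]  # computed once for every row
    lo = 0.0
    while lo <= out["colA"].max():
        hi = lo + step
        in_bin = (out["colA"] > lo) & (out["colA"] < hi)  # element-wise AND of both conditions
        out[str(hi)] = ratio.where(in_bin)  # keep the ratio inside the interval, NaN elsewhere
        lo = hi  # move on only after ALL matching rows were handled
    return out

# Small made-up frame so the result can be checked by hand.
demo = pd.DataFrame({"colA": [0.1, 0.3, 0.7, 1.2, -0.4],
                     "colB": [2.0, 4.0, 2.0, 4.0, 1.0]})
result = add_ratio_columns(demo)
print(result)
```

With this input the new columns are named '0.5', '1.0' and '1.5'. The row with a negative colA value falls in no interval, because the intervals begin at 0 as in the question.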
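The two conditions in the while loop need to be wrapped in all() (or the corresponding pandas function) to explicitly check whether all values of the resulting boolean array are True.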
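The "truth value of a Series is ambiguous" error from the question can be reproduced in a few lines. The sketch below uses a made-up three-element Series, not the poster's data, and shows that a comparison on a Series yields one boolean per row, so it has to be reduced with `.all()` or `.any()` before it can drive an `if` or `while`.

```python
import pandas as pd

s = pd.Series([0.2, 0.7, 0.4])      # made-up values for illustration
mask = (s > 0) & (s < 0.5)          # element-wise result: True, False, True

try:
    if mask:                        # a whole Series has no single truth value
        pass
    raised = False
except ValueError:
    raised = True                   # pandas raises ValueError here

every_row = mask.all()              # True only if every row satisfies the condition
some_row = mask.any()               # True if at least one row satisfies it
print(raised, every_row, some_row)
```

When the goal is to act on the matching rows rather than to make a single yes/no decision, the mask is usually passed straight to `df.loc[...]` and no reduction is needed.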