Looping over each row of a pandas DataFrame

I have a large dataframe. When a certain condition is met, I want to divide values in the same row by each other, and I want a new column for each condition.

I tried all kinds of loops, but I keep getting the error that the truth value of a Series is ambiguous. I think I'm close to the solution, but I can't quite figure out the quickest way.

import numpy as np
import pandas as pd

df = pd.DataFrame({'colA': np.random.randn(20), 'colB': np.random.randn(20), 'colC': np.random.randn(20)})
print(df)
x = 0
y = 0.5
for ix, r in df.iterrows():
    if (r['colA'] > x) & (r['colA'] < y):    
        df.loc[ix,str(y)] = df.loc[ix,'colA']/df.loc[ix,'colB']
        x += 0.5
        y += 0.5

This is how far I have got. The problem is that x and y increase after each row for which the condition is met. I need the division to be carried out for ALL the rows where the condition is met, and only THEN should x and y increase.

You should not use iterrows if you want the division to be applied to all the rows meeting the condition. Here is a fixed version of your initial code (x and y start at 0 and 0.5, as in the question):

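while x <= df['colA'].max():
    # select every row whose colA lies in the open interval (x, y)
    sub = df.loc[(df['colA'] > x) & (df['colA'] < y)]
    # the assignment aligns on the index, so only the selected rows are filled
    df.loc[sub.index, str(y)] = df['colA'] / df['colB']
    # move on to the next interval only after all matching rows are handled
    x += .5
    y += .5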
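
For reference, the same idea can be written as a self-contained script. This is only a sketch: the helper name add_bin_ratio_columns and its start and step parameters are made up for illustration and are not part of the original answer. It uses Series.where instead of .loc assignment, so every row outside the current interval gets NaN in that interval's column.

```python
import numpy as np
import pandas as pd


def add_bin_ratio_columns(df, start=0.0, step=0.5):
    """Hypothetical helper: for each open interval (x, x + step), store
    colA / colB in a column named str(x + step) for the matching rows."""
    out = df.copy()                      # leave the caller's frame untouched
    ratio = out['colA'] / out['colB']    # computed once, reused for every interval
    x, y = start, start + step
    while x <= out['colA'].max():
        # element-wise AND of two comparisons gives a boolean Series
        mask = (out['colA'] > x) & (out['colA'] < y)
        # keep the ratio where the mask is True, NaN everywhere else
        out[str(y)] = ratio.where(mask)
        x += step
        y += step
    return out


rng = np.random.default_rng(0)
df = pd.DataFrame({'colA': rng.standard_normal(20),
                   'colB': rng.standard_normal(20),
                   'colC': rng.standard_normal(20)})
result = add_bin_ratio_columns(df)
print(result)
```

As in the question, the intervals start at 0, so rows with a negative colA never match and stay NaN in every new column.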

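The two conditions in the while loop need to be wrapped in all() (or the corresponding pandas method, such as Series.all()) so that the check explicitly tests whether every value of the resulting boolean array is True.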
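
To illustrate where the "truth value of a Series is ambiguous" error comes from, here is a small sketch with made-up values (the Series s and the variable names are assumptions, not taken from the question). Putting a boolean Series directly into an if or while raises ValueError, whereas .all() or .any() reduces it to a single boolean first.

```python
import pandas as pd

s = pd.Series([0.2, 0.7, 0.4])
mask = (s > 0) & (s < 0.5)   # element-wise comparison gives a boolean Series

try:
    if mask:                 # a Series has no single truth value
        pass
except ValueError as err:
    error_message = str(err)
    print(error_message)

all_in_range = bool(mask.all())  # True only if every element meets the condition
any_in_range = bool(mask.any())  # True if at least one element meets it
print(all_in_range, any_in_range)
```

Which reduction is appropriate depends on what the loop is meant to test: .all() requires every row to match, while .any() only requires one.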
