Alternative way of writing for loop and if in python when working with a dataframe to make it faster

Question

I have a data frame named 'plans_to_csv' looking like this:

I need to run the following analysis to determine the actual mode, but it takes very long to run. Is there an alternative way to write this code to make it faster? Thanks a lot in advance for your help.

for i in range(0, len(plans_to_csv) - 2):
    if (plans_to_csv['mode'][i+1] == 'walk' and plans_to_csv['type'][i+2] == 'car interaction' and
        plans_to_csv['person_id'][i] == plans_to_csv['person_id'][i+2]):

        # .loc avoids chained assignment, which may write to a temporary copy
        plans_to_csv.loc[i, 'actual_mode_car'] = 1

Answer 1

You can shift the columns and compare them directly. This makes use of vectorization and should be faster.

selection = (
    (plans_to_csv['mode'].shift(-1) == 'walk')
    & (plans_to_csv['type'].shift(-2) == 'car interaction')
    & (plans_to_csv['person_id'] == plans_to_csv['person_id'].shift(-2))
)
plans_to_csv['actual_mode_car'] = selection.astype(int)

Note that this sets every entry that doesn't match the comparison to 0. If that is not wanted, you can instead set only the matching rows with plans_to_csv.loc[selection, 'actual_mode_car'] = 1
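
As a rough check that the shifted comparison matches the original loop, here is a minimal sketch on made-up data. The sample rows and the helper names loop_version and shift_version are assumptions for illustration only; the real plans_to_csv will hold different values. The loop's label lookups such as [i+1] assume the default 0..n-1 index, while shift works by position.

```python
import pandas as pd

# Hypothetical sample data (not from the question); "" marks rows without a mode.
plans_to_csv = pd.DataFrame({
    "person_id": [1, 1, 1, 1, 2, 2, 2],
    "type": ["home", "leg", "car interaction", "leg",
             "leg", "car interaction", "leg"],
    "mode": ["", "walk", "", "car", "walk", "", "car"],
})


def loop_version(df):
    """Row-by-row version, mirroring the loop from the question."""
    out = df.copy()
    out["actual_mode_car"] = 0
    for i in range(len(out) - 2):
        if (out["mode"].iloc[i + 1] == "walk"
                and out["type"].iloc[i + 2] == "car interaction"
                and out["person_id"].iloc[i] == out["person_id"].iloc[i + 2]):
            out.loc[out.index[i], "actual_mode_car"] = 1
    return out["actual_mode_car"]


def shift_version(df):
    """Vectorized version: shift(-k) aligns row i with the value from row i+k."""
    selection = (
        (df["mode"].shift(-1) == "walk")
        & (df["type"].shift(-2) == "car interaction")
        & (df["person_id"] == df["person_id"].shift(-2))
    )
    # Rows near the end have no row i+1 or i+2; shift fills them with NaN,
    # and comparisons against NaN are False, so they end up as 0.
    return selection.astype(int)


loop_result = loop_version(plans_to_csv)
shift_result = shift_version(plans_to_csv)
print(shift_result.tolist())
print(loop_result.tolist() == shift_result.tolist())
```

In this sample, row 3 is followed by a walk leg and a car interaction, but those rows belong to a different person, so the person_id comparison keeps it at 0 in both versions.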

solution1
1 2022-01-11 12:41:12
