How to modify a Pandas dataframe while iterating over it

Question

I have a dataframe that I am iterating over, and about halfway through it I want to rename a column and then continue the iteration. My code looks like this:

for index, row in df.iterrows():
    # ... do something with row ...

    if condition_met(row):  # placeholder for the actual condition
        df.rename(columns={'old_name': 'new_name'}, inplace=True)

After the rename, the column name is changed in the 'df' variable for subsequent iterations, but 'row' still carries the old column name. How can I fix this? I have run into similar situations in pandas before. Maybe the iterator doesn't get updated even though the dataframe itself is modified?

Answer 1

Changing the source of something you're iterating over is not good practice; the pandas documentation for iterrows gives the same warning.

You could set a flag if the condition is met, and then after the iteration, make any necessary changes to the dataframe.
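The flag approach might look roughly like the sketch below. The sample data, the `rename_needed` flag, and the `>= 3` condition are made up for illustration; only the column names 'old_name' and 'new_name' come from the question.

```python
import pandas as pd

# Made-up sample data; only the column name 'old_name' comes from the question.
df = pd.DataFrame({"old_name": [1, 2, 3, 4], "other": ["a", "b", "c", "d"]})

rename_needed = False  # hypothetical flag, set inside the loop
seen = []

for index, row in df.iterrows():
    # The frame is untouched during the loop, so every row uses the same labels.
    seen.append(row["old_name"])

    if row["old_name"] >= 3:  # stand-in for the real condition
        rename_needed = True

# Apply the change once, after the iteration has finished.
if rename_needed:
    df = df.rename(columns={"old_name": "new_name"})

print(list(df.columns))
```

Because the rename happens only after the loop, 'row' and 'df' never disagree about column names while you are iterating.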

Edited to add: I have a large dataset that needs "line by line" parsing, but that instruction was given to me by a non-programmer. Here's what I did: I added a boolean condition column to the dataframe, split the dataframe into two separate dataframes based on that condition, stored one for later integration, and moved on with the other. At the end I used pd.concat to put everything back together. Be aware that if you rename a column in only one of the pieces, pd.concat will create extra columns in the result.
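A minimal sketch of that split-and-concat workflow, under assumed data: the columns 'value' and 'label', the 'needs_work' flag column, and the "multiply by 10" step are all hypothetical stand-ins for the real parsing. The last part shows the extra-column effect when only one piece is renamed.

```python
import pandas as pd

# Hypothetical data; column names are illustrative only.
df = pd.DataFrame({"value": [1, 2, 3, 4], "label": ["a", "b", "c", "d"]})

# Add a boolean condition column, then split on it.
df["needs_work"] = df["value"] <= 2
working = df[df["needs_work"]].copy()   # the piece to keep processing
held_back = df[~df["needs_work"]]       # stored for later integration

# Stand-in for the real processing step.
working["value"] = working["value"] * 10

# Put everything back together in the original row order.
combined = pd.concat([working, held_back]).sort_index()
print(combined)

# Renaming a column in only one piece leaves the names mismatched,
# so concat keeps both columns and fills the gaps with NaN.
renamed = working.rename(columns={"value": "new_value"})
mismatched = pd.concat([renamed, held_back])
print(mismatched)
```

If a rename is needed, applying it to both pieces before the concat (or once to the combined frame afterwards) avoids the extra columns.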

1 answers

solution1
0 2022-03-24 17:45:06