
Pandas df re-ordering columns seems to work within a loop, but doesn't. What the heck am I missing?

So I'm completely perplexed as to why this is happening:

I have 8 different Pandas dataframes, all with the same columns. I want to rearrange the columns the same way on all of them. So I created a list and tried this:

original_cols = [1, 48, 49, 50, 51, 52]
new_cols = [48, 49, 50, 51, 52, 1]

list_of_dfs = [df1, df2, df3...., df8]

for df in list_of_dfs:
    df = df[new_cols]

When I look at any of the dataframes afterwards, I still get the old column order. Why? I inserted print statements as below, and inside the loop it does what I want:

for df in list_of_dfs:
    print (df.columns.tolist())
    df = df[new_cols]
    print (df.columns.tolist())

Output (for df1):
[1, 48, 49, 50, 51, 52]
[48, 49, 50, 51, 52, 1]

I can just write it all out manually, but I thought a simple loop would be better and I can't get it to work. I must be missing some fundamental understanding of loops or something. Any help is greatly appreciated.

Current solution:

df1 = df1[new_cols]
df2 = df2[new_cols]
.
.

Assigning df = df[new_cols] only rebinds the loop variable df to a new DataFrame; it does not update the DataFrame stored in the list. Try this:

size_ = len(list_of_dfs)
for idx in range(size_):
    list_of_dfs[idx] = list_of_dfs[idx][new_cols]

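Here idx is an index into list_of_dfs, so each assignment replaces the list element at that position with the reordered DataFrame. Note that the names df1 through df8 still refer to the original frames, so use the entries of list_of_dfs from here on.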
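To make the difference concrete, here is a minimal sketch with two small made-up frames. The column labels "a" and "b" and the frame contents are illustrative assumptions, not the data from the question. It shows that rebinding the loop variable leaves the list untouched, that assigning by index replaces the element, and that the original names only change once they are rebound explicitly.

```python
import pandas as pd

# Hypothetical stand-ins for the question's DataFrames
df1 = pd.DataFrame({"a": [1], "b": [2]})
df2 = pd.DataFrame({"a": [3], "b": [4]})
new_cols = ["b", "a"]

list_of_dfs = [df1, df2]

# Rebinding the loop variable: the list still holds the original objects
for df in list_of_dfs:
    df = df[new_cols]
order_after_rebinding = list_of_dfs[0].columns.tolist()

# Assigning by index: the list element itself is replaced
for idx in range(len(list_of_dfs)):
    list_of_dfs[idx] = list_of_dfs[idx][new_cols]
order_after_index_assignment = list_of_dfs[0].columns.tolist()

# The name df1 is still bound to the original, unreordered frame
order_of_df1 = df1.columns.tolist()

# Unpacking the list rebinds the names to the reordered frames
df1, df2 = list_of_dfs
```

The final unpacking line is one way to get the reordered frames back under their original names without writing eight separate assignments.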

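Indexing with df[new_cols] returns a new DataFrame, and the assignment only rebinds the loop variable to it; the names df1, df2, and so on still point to the original objects. If you need to rebind the variable names in the global scope (not recommended), you can reach them by name through globals().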

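import re
# Snapshot the names first so globals() is not modified while it is iterated
for name in list(globals()):
    if re.fullmatch(r'df\d+', name):  # only names like df1, df2, ...
        globals()[name] = globals()[name][new_cols]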
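If the frames do not have to live in separate variables, one alternative that avoids globals() is to keep them in a dict keyed by name. This is a different technique from the answer above, sketched under assumptions: the dict name frames and the sample data are hypothetical.

```python
import pandas as pd

new_cols = ["b", "a"]

# Hypothetical sample data; the keys play the role of the variable names
frames = {
    "df1": pd.DataFrame({"a": [1], "b": [2]}),
    "df2": pd.DataFrame({"a": [3], "b": [4]}),
}

# Rebuild the dict with every frame reordered
frames = {name: frame[new_cols] for name, frame in frames.items()}

# Collect the resulting column order of each frame for inspection
reordered = {name: frame.columns.tolist() for name, frame in frames.items()}
```

Each frame is then accessed as frames["df1"], frames["df2"], and so on, so a single loop or comprehension can update all of them.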

