How can I use a for loop with melt and rename in pandas?

I have to melt 4 data files and then rename the value column. When I run the code on each df separately it works, but when I put it in a for loop it doesn't. Why?

The for loop:

data_files = [df1, df2, df3, df4]
names = ['child_mortality','income','life_expectancy','population']
i = 0

for df in data_files: 
    df = df.melt(['country'], var_name='year')
    df = df.rename(columns = {'value': names[i]}, inplace = False)
    i += 1 
df1

There is no change

The issue

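This is because the assignment inside the loop only rebinds the loop variable df to the new DataFrame returned by melt and rename. Neither the list nor df1 to df4 is updated.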

Example:

a, b, c, d = 1, 2, 3, 4
data_files = [a, b, c, d]
print(data_files)
# > [1, 2, 3, 4]

Rebinding the loop variable:

for x in data_files:
    x = 0
print(data_files)
# > [1, 2, 3, 4]  # No change

With indexing the list:

for i in range(len(data_files)):
    data_files[i] = 0
print(data_files)
# > [0, 0, 0, 0]  # Change
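
The same behaviour can be checked on a DataFrame directly. The sketch below uses a small made-up frame (its columns and values are assumptions, not the asker's data) and shows that assigning to the loop variable leaves both the list and df1 untouched:

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df1 = pd.DataFrame({'country': ['a', 'b'], '2000': [1, 2], '2001': [3, 4]})
data_files = [df1]

for df in data_files:
    original = df  # same object as data_files[0]
    # melt returns a NEW DataFrame; this line only rebinds the name df
    df = df.melt(id_vars='country', var_name='year')
    rebound_is_new = df is not original

unchanged = data_files[0] is df1          # the list still holds the original object
columns_after_loop = list(df1.columns)    # df1 keeps its wide layout
print(rebound_is_new, unchanged, columns_after_loop)
# > True True ['country', '2000', '2001']
```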

The solutions

Solution 1: Indexing the list

for i in range(len(data_files)): 
    data_files[i] = data_files[i].melt(['country'], var_name='year')
    data_files[i] = data_files[i].rename(columns = {'value': names[i]}, inplace = False)

Solution 2: Overwriting elements in the list

for i, df in enumerate(data_files): 
    df = df.melt(['country'], var_name='year')
    df = df.rename(columns = {'value': names[i]}, inplace = False)
    # Updating the list
    data_files[i] = df

Solution 3: Overwriting the whole list (or having a new list)

new_dfs = []
for i, df in enumerate(data_files):
    df = df.melt(['country'], var_name='year')
    df = df.rename(columns = {'value': names[i]}, inplace = False)
    new_dfs.append(df)
# If needed:
data_files = new_dfs
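
A more compact variant, offered as a sketch rather than the only way to do it: melt accepts a value_name argument, so the separate rename can be skipped, and zip pairs each frame with its name so no counter is needed. The make_wide helper and its toy data are hypothetical stand-ins for the four real tables.

```python
import pandas as pd

def make_wide():
    # Hypothetical stand-in for one wide table (one column per year).
    return pd.DataFrame({'country': ['a', 'b'], '2000': [1, 2], '2001': [3, 4]})

data_files = [make_wide() for _ in range(4)]
names = ['child_mortality', 'income', 'life_expectancy', 'population']

# value_name names the melted value column directly, replacing the rename step.
data_files = [
    df.melt(id_vars='country', var_name='year', value_name=name)
    for df, name in zip(data_files, names)
]

print([list(df.columns) for df in data_files])
```

Because the comprehension builds a new list, the original variables df1 to df4 (if they exist) are still the wide frames unless they are reassigned.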

You are creating a new object with every iteration, and if you don't assign it back, the list of data frames remains unchanged. Assigning the result back to the list inside the loop fixes that:

import numpy as np
import pandas as pd

# Four small random example frames
df1, df2, df3, df4 = [pd.DataFrame({'country': ['a', 'b', 'c'],
                                    'v1': np.random.randint(0, 10, 3),
                                    'v2': np.random.randint(0, 10, 3)}) for _ in range(4)]

data_files = [df1, df2, df3, df4]

i = 0 
names = ['child_mortality','income','life_expectancy','population']
for df in data_files: 
    df = df.melt(['country'], var_name='year')
    df = df.rename(columns = {'value': names[i]}, inplace = False)
    data_files[i] = df
    i += 1

Now the data frames in the list are changed, but not your original dataframes:

data_files[1]

  country year  income
0       a   v1       0
1       b   v1       7
2       c   v1       9
3       a   v2       8
4       b   v2       0
5       c   v2       0

If you also want df1, df2, etc. to refer to the melted frames, unpack the list back into them:

df1, df2, df3, df4 = data_files
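
If the numbered variables are not needed elsewhere, one possible alternative is to keep the frames in a dict keyed by name, so nothing has to be unpacked afterwards. This is only a sketch: the wide and long_frames dicts and the toy values are assumptions made for illustration.

```python
import pandas as pd

names = ['child_mortality', 'income', 'life_expectancy', 'population']

# Hypothetical wide inputs, one per name.
wide = {
    name: pd.DataFrame({'country': ['a', 'b', 'c'], 'v1': [1, 2, 3], 'v2': [4, 5, 6]})
    for name in names
}

long_frames = {}
for name, df in wide.items():
    df = df.melt(id_vars='country', var_name='year')
    # Store the result under its name instead of rebinding a loop variable only.
    long_frames[name] = df.rename(columns={'value': name})

print(long_frames['income'].columns.tolist())
# > ['country', 'year', 'income']
```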
