
Difference between df.loc[:, columns] and df.loc[:][columns]

I want to normalize some columns of a pandas DataFrame using MinMaxScaler, set up like this:

from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
numericals = ["TX_TIME_SECONDS", "TX_Amount"]

When I do it this way:

df.loc[:][numericals] = scaler.fit_transform(df.loc[:][numericals])

the assignment does not happen in place and df is unchanged.

Whereas when I do it this way:

df.loc[:, numericals] = scaler.fit_transform(df.loc[:][numericals])

the numerical columns of df are changed in place.

So, what is the difference between df.loc[:, ~] and df.loc[:][~]?

df.loc[:][numericals] is chained indexing: df.loc[:] first selects all rows and returns a new object, then [numericals] selects the columns "TX_TIME_SECONDS" and "TX_Amount" from that returned object, and the assignment is applied to the result. The problem is that the returned object might be a copy, so the assignment may not change the original DataFrame.

The correct way to make this assignment is df.loc[:, numericals]. A single .loc call with both a row indexer and a column indexer operates on df directly, so the original DataFrame is the object that gets modified.
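To see the two forms side by side, here is a minimal sketch. The toy data, the make_df helper, and the hand-rolled min_max function are assumptions made up for illustration (min_max only stands in for MinMaxScaler().fit_transform so the snippet needs nothing beyond pandas); they are not part of the original question.

```python
import warnings

import pandas as pd

numericals = ["TX_TIME_SECONDS", "TX_Amount"]


def make_df():
    # Toy data, invented for illustration only.
    return pd.DataFrame({
        "TX_TIME_SECONDS": [0.0, 50.0, 100.0],
        "TX_Amount": [10.0, 20.0, 30.0],
        "label": ["a", "b", "c"],
    })


def min_max(frame):
    # Hypothetical stand-in for MinMaxScaler().fit_transform:
    # rescales each column to the range [0, 1] and returns a NumPy array.
    return ((frame - frame.min()) / (frame.max() - frame.min())).to_numpy()


# Chained indexing: df.loc[:] returns a new object, [numericals] copies the
# selected columns out of it, and the assignment lands on that temporary.
df_chained = make_df()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # pandas may warn about chained assignment
    df_chained.loc[:][numericals] = min_max(df_chained[numericals])

# Single .loc call: rows and columns are passed together, so the
# assignment is applied to the DataFrame itself.
df_single = make_df()
df_single.loc[:, numericals] = min_max(df_single[numericals])

chained_changed = not df_chained.equals(make_df())
single_changed = not df_single.equals(make_df())
print(chained_changed, single_changed)
```

With the chained form the frame should compare equal to a fresh copy of the toy data, while the single .loc form should leave the two numeric columns rescaled to [0, 1] and the label column untouched.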
