I want to normalize some columns of a pandas data frame using MinMaxScaler
in this way:
from sklearn.preprocessing import MinMaxScaler
scaler = MinMaxScaler()
numericals = ["TX_TIME_SECONDS", "TX_Amount"]
When I do it this way:
df.loc[:][numericals] = scaler.fit_transform(df.loc[:][numericals])
the assignment is not done in place and df is not changed, whereas when I do it this way:
df.loc[:, numericals] = scaler.fit_transform(df.loc[:][numericals])
the numerical columns of df are changed in place.
So, what's the difference between df.loc[:, ~] and df.loc[:][~]?
df.loc[:][numericals] is chained indexing: df.loc[:] first selects all rows and returns an intermediate object, and the assignment then targets the columns "TX_TIME_SECONDS" and "TX_Amount" of that intermediate object. The problem is that the intermediate object might be a copy, so the assignment may not change the actual DataFrame.
The correct way to make this assignment is df.loc[:, numericals], because a single .loc call that takes both the row and the column indexer operates on the original DataFrame directly, so you are guaranteed to modify it.
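To make the difference concrete, here is a minimal sketch comparing the two forms. The sample data, the CUSTOMER_ID column, and the make_df and min_max helpers are made up for illustration; min_max is a plain-pandas stand-in for MinMaxScaler().fit_transform so the example depends only on pandas. Whether the chained form leaves df untouched can depend on the pandas version and the column dtypes, so the sketch only relies on the single .loc call.

```python
import warnings

import pandas as pd

numericals = ["TX_TIME_SECONDS", "TX_Amount"]


def make_df():
    # Hypothetical sample data; float columns so the scaled values fit the dtype.
    return pd.DataFrame({
        "TX_TIME_SECONDS": [0.0, 30.0, 60.0],
        "TX_Amount": [10.0, 20.0, 40.0],
        "CUSTOMER_ID": [1, 2, 3],
    })


def min_max(frame):
    # Stand-in for MinMaxScaler().fit_transform: scales each column to [0, 1]
    # and returns a NumPy array, as the scaler would.
    return ((frame - frame.min()) / (frame.max() - frame.min())).to_numpy()


# Chained indexing: df.loc[:] builds an intermediate object first, and the
# assignment targets that object. The original may or may not be updated,
# and pandas may warn about it, so the result is not relied on here.
chained = make_df()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    chained.loc[:][numericals] = min_max(chained[numericals])

# Single .loc call with row and column indexers: assigns into the original.
direct = make_df()
direct.loc[:, numericals] = min_max(direct[numericals])

print(direct)
```

After this runs, the two numeric columns of direct are scaled to the range [0, 1] and CUSTOMER_ID is left as it was.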
I suggest you read some documentation because this is pretty basic.
https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.loc.html