My task is to drop all rows containing NaNs and encode all the categorical variables in data.
I wrote a function that looks like this:
def preprocess_data(data):
    data = data.dropna()
    le = LabelEncoder()
    data['car name'] = le.fit_transform(data['car name'])
    return data
It takes a DataFrame and returns the processed data. Running this function gives me a warning that says:
SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead
I don't quite get which part of my code is causing this and how to fix it.
Make sure you tell pandas that data is its own DataFrame (and not a slice) by calling .copy() on the result of dropna():
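from sklearn.preprocessing import LabelEncoder

def preprocess_data(data):
    # .copy() makes the result an independent DataFrame rather than a possible slice
    data = data.dropna().copy()
    le = LabelEncoder()
    data['car name'] = le.fit_transform(data['car name'])
    return data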
A more detailed explanation here: https://github.com/pandas-dev/pandas/issues/17476
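As a rough illustration of the same idea, here is a minimal sketch that uses only pandas. It swaps LabelEncoder for pandas category codes (a stand-in I chose so the snippet has no scikit-learn dependency, not what the question uses), and the sample frame with its mpg column is made up for the example. The point is that the assignment happens on an explicit copy, so the caller's frame is left untouched.

```python
import pandas as pd

def preprocess_data(data: pd.DataFrame) -> pd.DataFrame:
    # Drop rows with any NaN, then take an explicit copy so that the
    # assignment below clearly targets our own DataFrame.
    cleaned = data.dropna().copy()
    # Stand-in for LabelEncoder: category codes number the sorted
    # unique labels 0..n-1.
    cleaned["car name"] = cleaned["car name"].astype("category").cat.codes
    return cleaned

# Hypothetical sample data; rows 2 and 3 each contain a missing value.
raw = pd.DataFrame({
    "car name": ["nissan", "dacia", None, "nissan"],
    "mpg": [30.0, 25.0, 20.0, None],
})
result = preprocess_data(raw)
print(result)
```

The original raw frame still has all four rows and its string labels afterwards; only the returned copy is modified.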
You may need to give more information; the problem may not be in this function. The following code does not produce the warning:
import pandas as pd
from sklearn import preprocessing

def preprocess_data(data):
    data = data.dropna()
    le = preprocessing.LabelEncoder()
    data['car name'] = le.fit_transform(data['car name'])
    return data

preprocess_data(pd.DataFrame({'car name': ['nissan', 'dacia'], 'car mode': ['juke', 'logan']}))
#   car mode  car name
# 0     juke         1
# 1    logan         0