
Make a KNN predictive model with string values?

I would like to build a model that predicts whether an expedition succeeds (target = success column in members), but some of my features are categorical strings rather than floats, and fitting the model raises an error. Is it possible to build a prediction model like this with such features?

import pandas as pd 

from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn import datasets
from sklearn.neighbors import KNeighborsClassifier
from sklearn.metrics import accuracy_score
from sklearn.metrics import confusion_matrix

members = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/members.csv")
expeditions = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/expeditions.csv", parse_dates=['basecamp_date','highpoint_date','termination_date'])
peaks = pd.read_csv("https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2020/2020-09-22/peaks.csv")



members_sk = members
members_sk2 = pd.merge(members_sk, expeditions[["expedition_id", "nbre_total_membres"]], on = "expedition_id", how="inner")
members_sk3 = pd.merge(members_sk2, peaks[["peak_id", "height_metres"]], on = "peak_id", how="inner")

members_bis = members_sk3[["peak_name","season", "sex", 'age',"citizenship","expedition_role","hired","solo", "oxygen_used", "success", "nbre_total_membres", "height_metres"]]
members_bis = members_bis.dropna()

x = members_bis.drop("success", axis=1)
y = members_bis["success"]
xtrain, xtest, ytrain, ytest = train_test_split(x,y,test_size=0.35, random_state=1)


model_KNN = KNeighborsClassifier(n_neighbors = 10)
model_KNN.fit(xtrain, ytrain)
ypredict_KNN = model_KNN.predict(xtest)
print(ypredict_KNN, type(ypredict_KNN))

To train the model, you must convert your categorical features to numeric values. There are many ways to do this; the two most common are one-hot encoding with OneHotEncoder and integer encoding with LabelEncoder.
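As a minimal sketch of the one-hot approach, the example below wraps OneHotEncoder in a ColumnTransformer and chains it with the KNN classifier in a pipeline. The toy DataFrame is made up for illustration and only mimics a few columns of members_bis; the column split into categorical and numeric lists and the choice of n_neighbors=3 are assumptions, not part of the original question. The numeric columns are scaled here because KNN is distance-based, which is an addition beyond what the answer states. Note also that the scikit-learn documentation describes LabelEncoder as intended for the target y; for integer-coding feature columns, OrdinalEncoder is the feature-side counterpart.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

# Toy stand-in for members_bis (made-up rows, not the real dataset).
toy = pd.DataFrame({
    "season": ["Spring", "Autumn", "Spring", "Winter",
               "Autumn", "Spring", "Winter", "Autumn"],
    "sex": ["M", "F", "M", "M", "F", "F", "M", "F"],
    "age": [30, 45, 28, 52, 36, 41, 33, 29],
    "height_metres": [8850, 8167, 8850, 7161, 8167, 8850, 7161, 8167],
    "success": [True, False, True, False, False, True, False, True],
})

x = toy.drop(columns="success")
y = toy["success"]

# Assumed split of the feature columns by type.
categorical = ["season", "sex"]
numeric = ["age", "height_metres"]

# One-hot encode the string columns and scale the numeric ones.
# handle_unknown="ignore" avoids an error if the test set contains
# a category that was not seen during fit.
preprocess = ColumnTransformer([
    ("cat", OneHotEncoder(handle_unknown="ignore"), categorical),
    ("num", StandardScaler(), numeric),
])

# The pipeline applies the preprocessing before every fit and predict call.
model = make_pipeline(preprocess, KNeighborsClassifier(n_neighbors=3))
model.fit(x, y)

predictions = model.predict(x)
print(predictions)
```

With the real data, the same pipeline could be fitted on xtrain and ytrain and evaluated on xtest, after listing the actual string columns (peak_name, citizenship, expedition_role, and so on) in categorical.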


 