
Fit or predict function for LinearDiscriminantAnalysis

I'm trying to assign labels to coordinates based on each label's known coordinates, using scikit-learn's Linear Discriminant Analysis. The training coordinates and labels are stored in one pandas DataFrame, the target coordinates in another. The two DataFrames don't have the same number of rows; the training set is larger. I want to attach the predicted label to the coordinates in the original DataFrame so I can use it as a key with pd.merge.

I know I could approach this problem with a point-in-polygon test in matplotlib or with Shapely, but I want to test it this way. Here's what I have, based on the docs:

from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
labels_fea = le.fit_transform(Spatial_index['Postcode']) 
trainingdata=df1[['xcoord','ycoord']].values
targetvalues=df2[['xcoord','ycoord']].values
clf = LinearDiscriminantAnalysis(solver='svd', shrinkage=None, priors=None,
                                 n_components=None, store_covariance=False, tol=0.0001)

Then executed as below,

clf.fit(trainingdata,targetvalues) 

This throws the following error,

ValueError: bad input shape (8860, 2)

I think you're confusing target and test data. The error occurs because the classifier expects a one-dimensional array of labels as the second argument to fit (in your case, the encoded postcodes), but targetvalues is a two-dimensional array of coordinates with shape (8860, 2). Without seeing your data I can't say for sure, but you probably want to do

clf.fit(trainingdata, labels_fea) 

and then, after renaming targetvalues to testdata, you would get your predictions for those coordinates with clf.predict(testdata).
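To make the fit/predict split concrete, here is a minimal sketch of the whole round trip. The data is made up (two synthetic, well-separated postcode clusters, with the hypothetical codes AB1 and CD2), and it assumes the postcode column lives in the same DataFrame as the training coordinates, which may not match your Spatial_index setup. The names X_train, y_train and X_test are only illustrative.

```python
import numpy as np
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.preprocessing import LabelEncoder

# Hypothetical stand-in data: two well-separated clusters, one per postcode.
rng = np.random.default_rng(0)
df1 = pd.DataFrame({
    "xcoord": np.concatenate([rng.normal(0, 1, 50), rng.normal(10, 1, 50)]),
    "ycoord": np.concatenate([rng.normal(0, 1, 50), rng.normal(10, 1, 50)]),
    "Postcode": ["AB1"] * 50 + ["CD2"] * 50,
})
# Unlabelled coordinates; fewer rows than the training set, which is fine.
df2 = pd.DataFrame({"xcoord": [0.5, 9.5, -0.2], "ycoord": [0.1, 10.3, 0.4]})

le = LabelEncoder()
y_train = le.fit_transform(df1["Postcode"])   # 1-D labels, one per training row
X_train = df1[["xcoord", "ycoord"]].values    # 2-D features, shape (100, 2)
X_test = df2[["xcoord", "ycoord"]].values     # 2-D features, shape (3, 2)

clf = LinearDiscriminantAnalysis(solver="svd")
clf.fit(X_train, y_train)                     # fit takes (features, labels)

# predict returns encoded labels; map them back to the original postcodes.
df2["Postcode"] = le.inverse_transform(clf.predict(X_test))
print(df2)
```

The only requirement on shapes is that the labels passed to fit have one entry per training row. The array passed to predict can have any number of rows, as long as it has the same two columns. With the predicted Postcode column attached to df2, it can then serve as the key for pd.merge.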
