
How do I apply an ML model after it has been trained?

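I apologize for the naive question. I have trained a model (Naive Bayes) in Python, and it does well (95% accuracy). It takes an input string (e.g. "Apple Inc." or "John Doe") and determines whether it is a business name or a customer name.

How do I actually apply this to another dataset? If I bring in another pandas DataFrame, how do I apply what the model learned from the training data to the new DataFrame?

The new DataFrame has an entirely new population and set of strings, and the model needs to predict whether each one is a business name or a customer name.

Ideally, I would like to insert a column with the model's predictions into the new DataFrame.

Any code snippets are appreciated.

Sample code for the current model: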

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test = train_test_split(df["CUST_NM_CLEAN"], 
                                                    df["LABEL"],test_size=0.20, 
                                                    random_state=1)

# Instantiate the CountVectorizer method
count_vector = CountVectorizer()

# Fit the training data and then return the matrix
training_data = count_vector.fit_transform(X_train)

# Transform testing data and return the matrix. 
testing_data = count_vector.transform(X_test)

# In this case we try the multinomial variant; scikit-learn also offers
# other Naive Bayes variants (e.g. GaussianNB, BernoulliNB)
from sklearn.naive_bayes import MultinomialNB
naive_bayes = MultinomialNB()
naive_bayes.fit(training_data,y_train)
#MultinomialNB(alpha=1.0, class_prior=None, fit_prior=True)

predictions = naive_bayes.predict(testing_data)


from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score
print('Accuracy score: {}'.format(accuracy_score(y_test, predictions)))
print('Precision score: {}'.format(precision_score(y_test, predictions, pos_label='Org')))
print('Recall score: {}'.format(recall_score(y_test, predictions, pos_label='Org')))
print('F1 score: {}'.format(f1_score(y_test, predictions, pos_label='Org')))

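Figured it out. Reuse the already-fitted count_vector on the new column (call transform, not fit_transform, so the vocabulary learned from the training data is kept), then pass the result to naive_bayes.predict: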

# Convert a collection of text documents to a vector of term/token counts. 
cnt_vect_for_new_data = count_vector.transform(df['new_data'])

# Run the prediction and store it as a new column
df['NEW_DATA_PREDICTION'] = naive_bayes.predict(cnt_vect_for_new_data)
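
The same idea can be packaged so the vectorizer and classifier always travel together. The following is a minimal sketch, not the original poster's code: it assumes a scikit-learn Pipeline and joblib for persistence, and the toy training rows, the 'Person' label, and the file name name_classifier.joblib are made up for illustration (the question only shows the 'Org' label).

```python
import os
import tempfile

import joblib
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy training data, invented for this sketch; 'Person' is an assumed label name.
train_df = pd.DataFrame({
    "CUST_NM_CLEAN": ["Apple Inc", "Acme Corp", "Globex Inc",
                      "John Doe", "Jane Smith", "Mary Jones"],
    "LABEL": ["Org", "Org", "Org", "Person", "Person", "Person"],
})

# Bundle the vectorizer and the classifier so they are fitted,
# saved, and applied as one object.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(train_df["CUST_NM_CLEAN"], train_df["LABEL"])

# Persist the fitted pipeline, then reload it as a separate script would.
# The path is a hypothetical temporary location.
model_path = os.path.join(tempfile.mkdtemp(), "name_classifier.joblib")
joblib.dump(model, model_path)
loaded_model = joblib.load(model_path)

# New, unseen data: the pipeline accepts raw strings and applies the
# fitted vectorizer internally before predicting.
new_df = pd.DataFrame({"new_data": ["Initech Inc", "John Smith"]})
new_df["NEW_DATA_PREDICTION"] = loaded_model.predict(new_df["new_data"])
print(new_df)
```

With a pipeline there is no separate count_vector to keep track of, which makes it harder to accidentally refit the vectorizer on the new data. Words the model never saw during training are simply ignored by CountVectorizer at prediction time.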

