Finding trends and patterns using SKLearn

Question

So, I have a dataframe in the following format:

   Gender  Customer Type      Age  Type of Travel   Class
0  Male    Loyal Customer     13   Personal Travel  Eco Plus
1  Male    disloyal Customer  25   Business travel  Business
2  Female  Loyal Customer     26   Business travel  Business
3  Female  Loyal Customer     25   Business travel  Business
4  Male    Loyal Customer     61   Business travel  Business
5  Male    disloyal Customer  20   Business travel  Eco
6  Female  disloyal Customer  24   Business travel  Eco

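I want to build a profile of the people who are more likely to become loyal customers, so that I can generate insights on how to address them properly. The library SKLearn was recommended to me for this, but I did some research and couldn't find a good approach, because it is a very large library. So, if anyone has some experience with this library, or would suggest another one, could you point me in the right direction and explain which functions are best suited to getting that result? By the way, it doesn't need to be a perfect method; I'm looking for some simple code that gives a general answer.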

Answer 1

The simplest model would be logistic regression. For example, here is code using simulated data in which only age has predictive value:

import numpy as np
import pandas as pd
from sklearn import linear_model

# Simulate ages for each group, then keep only ages from 18 to 95.
ages_loyal = np.random.normal(50, 100, 500)
ages_disloyal = np.random.normal(35, 10, 500)
ages_loyal = ages_loyal[(ages_loyal > 17) & (ages_loyal < 96)]
ages_disloyal = ages_disloyal[(ages_disloyal > 17) & (ages_disloyal < 96)]

def make_customers(ages, customer_type):
    # Every column except Age is drawn from the same distribution for both
    # groups, so Age is the only feature that carries any signal.
    n = ages.shape[0]
    return pd.DataFrame({
        'Gender': np.random.choice(['Male', 'Female'], size=n),
        'Customer Type': [customer_type] * n,
        'Age': ages.astype(np.int64),
        'Type of Travel': np.random.choice(['Personal Travel', 'Business travel'], size=n, p=[0.09, 0.91]),
        'Class': np.random.choice(['Eco Plus', 'Business', 'Eco'], size=n),
    })

df = pd.concat([make_customers(ages_loyal, 'Loyal Customer'),
                make_customers(ages_disloyal, 'disloyal Customer')], ignore_index=True)

# One-hot encode the categorical features; the target column is left out.
X = pd.get_dummies(df.drop(columns='Customer Type'))
y = df['Customer Type']

# A higher max_iter avoids convergence warnings on the unscaled Age column.
model = linear_model.LogisticRegression(max_iter=1000)
model.fit(X, y)

# Predictions are made on the training data, so this overstates real accuracy.
predicted = model.predict(X)

results = pd.DataFrame({'predicted': predicted, 'actual': y})

# Confusion matrix: rows are predicted labels, columns are actual labels.
print(pd.crosstab(results['predicted'], results['actual']))
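
To turn a fitted logistic regression into a rough profile of who tends to be loyal, one option is to read the model's coefficients. The sketch below is only an illustration under stated assumptions: the data is synthetic (loyal customers are deliberately made older), the seed and the names `rng`, `X`, `y` and `coefs` are hypothetical, and the target is recoded as 1 = loyal so that a positive coefficient means "more likely loyal".

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable
n = 500

# Synthetic data (an assumption for illustration): loyal customers skew older,
# while Gender and Class are pure noise.
df = pd.DataFrame({
    'Age': np.concatenate([rng.normal(50, 10, n), rng.normal(30, 10, n)]).round(),
    'Gender': rng.choice(['Male', 'Female'], size=2 * n),
    'Class': rng.choice(['Eco Plus', 'Business', 'Eco'], size=2 * n),
    'Customer Type': ['Loyal Customer'] * n + ['disloyal Customer'] * n,
})

# Encode the target as 1 = loyal, 0 = disloyal.
y = (df['Customer Type'] == 'Loyal Customer').astype(int)
X = pd.get_dummies(df.drop(columns='Customer Type'), dtype=float)

model = LogisticRegression(max_iter=1000).fit(X, y)

# One coefficient per encoded feature, sorted in descending order.
coefs = pd.Series(model.coef_[0], index=X.columns).sort_values(ascending=False)
print(coefs)
```

A positive coefficient pushes the prediction towards "loyal" and a negative one towards "disloyal". Note that the Age coefficient is per year while the dummy columns are per category, so the magnitudes are not directly comparable unless the features are scaled first.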
