
How to predict the values of a column in a new Python data frame using info from the old data frame

Let's assume I have two data frames, df1 and df2. df1 has several columns, such as userid, sexid, location, etc. df2 has all the same columns as df1 except sexid, which I need to fill in using some prediction algorithm. I am just a beginner and have only worked on other kinds of problems so far, so any advice or useful references that may help me crack this are welcome.

A minimal example:

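import pandas as pd
from sklearn.linear_model import LogisticRegression

# df1 is the labelled data: it contains the target column 'sexid'
df1 = pd.DataFrame({'sexid': list('MMFFMFFMMF'), 'x1': [0, 12, 2, 3, 4, 2, 0, 12, 12, 12], 'x2': [0, 1, 1, 1, 0, 1, 1, 0, 0, 1]})

# df2 has the same feature columns but no 'sexid'
df2 = pd.DataFrame({'x1': [0, 12, 2, 3, 4, 2, 0, 12, 12, 12], 'x2': [0, 1, 1, 1, 0, 1, 1, 0, 0, 1]})

# Features and target taken from the labelled frame
X = df1[['x1', 'x2']]
y = df1['sexid']

# Fit a classifier on df1
model = LogisticRegression()
model.fit(X, y)

# Predict 'sexid' for every row of df2 (same columns, same order as X)
model.predict(df2)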

Which returns:

array(['F', 'M', 'F', 'F', 'M', 'F', 'F', 'M', 'M', 'M'], dtype=object)
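To actually fill the missing column rather than just print the predictions, the array can be assigned back to df2. The following is a minimal sketch using the same toy data as above; the `features` list is just a helper name introduced here so that both frames are indexed with the same columns in the same order.

```python
import pandas as pd
from sklearn.linear_model import LogisticRegression

df1 = pd.DataFrame({'sexid': list('MMFFMFFMMF'),
                    'x1': [0, 12, 2, 3, 4, 2, 0, 12, 12, 12],
                    'x2': [0, 1, 1, 1, 0, 1, 1, 0, 0, 1]})
df2 = pd.DataFrame({'x1': [0, 12, 2, 3, 4, 2, 0, 12, 12, 12],
                    'x2': [0, 1, 1, 1, 0, 1, 1, 0, 0, 1]})

# Helper list (an assumption of this sketch): the columns used as predictors
features = ['x1', 'x2']

model = LogisticRegression()
model.fit(df1[features], df1['sexid'])

# Select the feature columns explicitly, then store the predictions
# as a new 'sexid' column so df2 ends up with the same columns as df1
df2['sexid'] = model.predict(df2[features])
```

Selecting `df2[features]` explicitly keeps the prediction step working even if df2 carries extra columns that the model was not trained on.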

I would highly recommend you read this .
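The question also mentions non-numeric columns such as location, which LogisticRegression cannot consume directly. One possible approach, sketched below, is to one-hot encode them inside a scikit-learn pipeline. The data, the location values, and the choice to leave userid out of the features (on the assumption that it is only an identifier) are all hypothetical and only there for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import OneHotEncoder

# Hypothetical labelled data (plays the role of df1)
train = pd.DataFrame({
    'userid': [1, 2, 3, 4, 5, 6],
    'location': ['NY', 'LA', 'NY', 'SF', 'LA', 'SF'],
    'x1': [0, 12, 2, 3, 4, 12],
    'sexid': ['M', 'M', 'F', 'F', 'M', 'F'],
})

# Hypothetical unlabelled data (plays the role of df2)
new = pd.DataFrame({
    'userid': [7, 8, 9],
    'location': ['LA', 'NY', 'Boston'],
    'x1': [12, 2, 0],
})

# Assumption: userid is an identifier, so it is not used as a predictor
features = ['location', 'x1']

# One-hot encode 'location'; pass the numeric column through unchanged.
# handle_unknown='ignore' encodes locations unseen during fit as all zeros.
preprocess = ColumnTransformer(
    [('location', OneHotEncoder(handle_unknown='ignore'), ['location'])],
    remainder='passthrough',
)

clf = make_pipeline(preprocess, LogisticRegression())
clf.fit(train[features], train['sexid'])

# Fill the missing column in the unlabelled frame
new['sexid'] = clf.predict(new[features])
```

Keeping the encoder inside the pipeline means the same encoding learned from the labelled frame is applied to the new frame, so the two never get out of sync.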
