logistic.fit() error in LogisticRegression

Question

I am trying to use logistic.fit() and I get the error below. How can I fix it?

 Input contains NaN, infinity or a value too large for dtype('float64').

Here is part of my code (Floor and Surname are strings):

    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    # df is assumed to be the DataFrame that holds these columns
    xtr = pd.get_dummies(df[['Age', 'Fee', 'Size', 'Floor', 'Class', 'Surname']])
    logistic = LogisticRegression()
    logistic.fit(xtr, ytr)

Answer 1

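The error means that the data you pass to fit() contains missing (NaN) or infinite values, which LogisticRegression does not accept. You have several options.

Option 1 :

    df_clean = df.dropna()

This drops every row that contains at least one NaN value. Not recommended if you have few observations.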
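To see why Option 1 can be costly on a small dataset, here is a minimal sketch. The DataFrame below is hypothetical toy data (the column names are borrowed from the question, the values are made up), so treat it as an illustration rather than the asker's actual data.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for the asker's DataFrame.
df = pd.DataFrame({
    "Age": [25.0, np.nan, 40.0, 31.0],
    "Fee": [100.0, 250.0, np.nan, 80.0],
    "Floor": ["first", "second", "first", None],
})

# dropna() removes every row that has at least one missing value.
df_clean = df.dropna()

# Check how much data was lost before deciding to go this way.
rows_lost = len(df) - len(df_clean)
print(f"kept {len(df_clean)} of {len(df)} rows, lost {rows_lost}")
```

Each missing value here sits in a different row, so a handful of NaNs removes most of the data. Comparing the row counts before and after is a cheap sanity check.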

Option 2 :

    df["Column_Name"] = df["Column_Name"].fillna(df["Column_Name"].mean())

This replaces the missing values in that column with the column mean; you can use the median instead. This only works for numerical columns.
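If you have several numerical columns, the same idea can be applied to all of them at once. This is only a sketch on hypothetical toy data; `numeric_cols` is a helper name introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: two numerical columns and one string column.
df = pd.DataFrame({
    "Age": [20.0, np.nan, 40.0],
    "Fee": [100.0, 200.0, np.nan],
    "Surname": ["Smith", None, "Jones"],
})

# Pick only the numerical columns; the mean is undefined for strings.
numeric_cols = df.select_dtypes(include="number").columns

# Fill each numerical column with its own mean (use .median() if preferred).
df[numeric_cols] = df[numeric_cols].fillna(df[numeric_cols].mean())

print(df)
```

The string column is left untouched, so its missing values still need one of the other options.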

Option 3 :

    df = df[pd.notnull(df['Column_Name'])]

This drops only the rows where the specified column is NaN. It works well in conjunction with Option 2 if some of your columns are categorical and others are numeric.
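A rough sketch of combining Option 3 with Option 2 could look like this, assuming a categorical column "Floor" and a numerical column "Age" (toy values, not the asker's data).

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one numerical and one categorical column.
df = pd.DataFrame({
    "Age": [20.0, np.nan, 40.0, 60.0],
    "Floor": ["first", "second", None, "first"],
})

# Option 3: drop the rows where the categorical column is missing.
# .copy() avoids modifying a view of the original frame later on.
df = df[pd.notnull(df["Floor"])].copy()

# Option 2: fill the numerical column with the mean of the remaining rows.
df["Age"] = df["Age"].fillna(df["Age"].mean())

print(df)
```

Note that the order matters: the mean is computed after the rows have been dropped, so it only reflects the rows you actually keep.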

Option 4 :

    df = df.fillna(0)

This fills all your NaN values with 0. You can do this instead of Option 2, your call. This should be enough to get you started on thinking about how to resolve your problem. Since you are familiar with the data, you know best how to handle this. If you have any specific questions about that, I would be more than happy to help.
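Before picking one of the options, it may help to check which columns actually trigger the error. The message also mentions infinity, which fillna() does not touch. The sketch below uses hypothetical toy data; `nan_counts` and `inf_counts` are names introduced here for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with one NaN and one infinite value.
df = pd.DataFrame({
    "Age": [25.0, np.nan, 40.0],
    "Fee": [100.0, np.inf, 80.0],
    "Size": [50.0, 60.0, 70.0],
})

numeric = df.select_dtypes(include="number")

# Count the missing and the infinite values per numerical column.
nan_counts = numeric.isna().sum()
inf_counts = np.isinf(numeric).sum()
print(nan_counts[nan_counts > 0])
print(inf_counts[inf_counts > 0])

# Turn +/-inf into NaN so that the options above handle them as well.
df = df.replace([np.inf, -np.inf], np.nan)
```

After the replace step, infinite values are treated like any other missing value, so dropna() or fillna() will cover them too.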

1 answer

solution1
0 ACCEPTED 2016-12-27 20:53:22
