
Linear regression using scikit-learn

Here is my scenario.

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

The above data is my data model.

The x-coordinate refers to "Salary" and the y-coordinate refers to "Expenses".

I want to predict the expense when I give a "Salary", i.e. the x-coordinate.

Here is my sample code. Please help me out.

from sklearn.linear_model import LinearRegression

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

salary=[]
expenses=[]

for dataset in data:
    # import pdb; pdb.set_trace()
    salary.append(dataset[0])
    expenses.append(dataset[1])

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict([10200.00])
print(prediction)

The error I got:

ValueError: Expected 2D array, got 1D array instead:
array=[ 25593.14  98411.    71498.8   38068.    58188.    10220.  ].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample


As suggested in the comments, something like this is a better way to prepare data you want to feed into a scikit-learn model. Another example can be seen here.

from sklearn.linear_model import LinearRegression
import numpy as np

data = np.array(
        [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]
).T

salary = data[0].reshape(-1, 1)
expenses = data[1]

model = LinearRegression()
model.fit(salary, expenses)
prediction = model.predict(np.array([10200.00]).reshape(-1, 1))
print(prediction)
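
To see why the reshape calls above matter, here is a minimal sketch using only NumPy and the same salary values. It prints the shapes that scikit-learn would receive in each case. The names column and row are illustrative and not part of the original answer.

```python
import numpy as np

salary = np.array([25593.14, 98411.00, 71498.80, 38068.00, 58188.00, 10220.00])

# A plain 1D array: this is the shape that triggers the ValueError in fit().
print(salary.shape)

# reshape(-1, 1): six samples with one feature each, which is what fit() needs here.
column = salary.reshape(-1, 1)
print(column.shape)

# reshape(1, -1): one sample with six features, which is wrong for this problem.
row = salary.reshape(1, -1)
print(row.shape)
```

The -1 tells NumPy to infer that dimension from the array's length, so the same call works however many salaries there are.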

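Quick fix: replace the model.fit line with the following, so that salary becomes a 2D column with one row per sample (this needs import numpy as np). The input to model.predict has to be 2D in the same way.

model.fit(np.array(salary).reshape(-1, 1), np.array(expenses))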

X is expected to be a 2D array, array([arr1, arr2, arr3, ...]), where each of arr1, arr2, ... is one sample holding at least one feature. y should be an array with one value per sample, array([label1, label2, label3, ...]).
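
Putting those shape requirements together, the following is a rough sketch of the question's code with a 2D X and a 1D y. The names X and y are chosen here for illustration, and the list comprehensions simply stand in for the original loop. Note that predict also expects a 2D array, even for a single salary.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

data = [[25593.14, 39426.66],
        [98411.00, 81869.75],
        [71498.80, 62495.80],
        [38068.00, 54774.00],
        [58188.00, 43453.65],
        [10220.00, 18465.25]]

salary = [row[0] for row in data]
expenses = [row[1] for row in data]

# X: one row per sample, one column per feature -> shape (6, 1)
X = np.array(salary).reshape(-1, 1)
# y: one target value per sample -> shape (6,)
y = np.array(expenses)

model = LinearRegression()
model.fit(X, y)

# A single salary still has to be passed as a 2D array: one row, one column.
prediction = model.predict(np.array([[10200.00]]))
print(prediction)

# The fitted line is: expenses = coef_[0] * salary + intercept_
print(model.coef_, model.intercept_)
```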
