
Plotting in Python (with numpy, pandas and matplotlib)

I am new to Python, albeit not to programming, and I found this Python implementation of a simple linear regression algorithm in a tutorial. So far I've only written the code for plotting the graph, with no linear regression. However, when I run the code in the Python command prompt, I only get an empty plot. I've tried to track down the error with the tutorial and other resources on the net, but I wasn't able to find anything. Could somebody help me figure this out? (PS. I am new to Stack Overflow :)) This is the code:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

data = {'Years of Experience': [1.1, 1.3, 1.5, 2., 2.2, .29, 3., 3.2, 3.2, 3.7,
                                3.7, 3.9, 4., 4., 4.1, 4.5, 4.9, 5.1, 5.3, 5.9,
                                6., 6.8, 7.1, 7.9, 8.2, 8.7, 9., 9.5, 9.6, 10.3, 10.5],
        'Salary': [39343., 46205., 37731., 43525., 39891., 56642., 60150.,
                   54445., 64445., 57189., 63218., 55794., 56957., 57081.,
                   61111., 67938., 66029., 83088., 81363., 93940., 91738.,
                   98273., 1011302., 113812., 109431., 105582., 116969., 112635.,
                   122391., 121872.]}
dataframe = pd.DataFrame(data)
dataframe.head()
x = dataframe.iloc[:,0].values.reshape(-1,1)
y = dataframe.iloc[:,1].values.reshape(-1,1)
plt.scatter(x,y)
plt.title("Years of Experience vs Salary")
plt.xlabel("Years of Experience")
plt.ylabel("Salary")
plt.show()

The length of data['Salary'] is 30, whereas the length of data['Years of Experience'] is 31, so when pandas tries to build a DataFrame out of the dictionary, you get an error saying:

ValueError: arrays must all be same length

Add another value to the Salary list (or remove one from Years of Experience) and it should plot just fine.
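A quick way to spot this kind of mismatch is to print the length of every column before calling pd.DataFrame. The sketch below uses the same column names as the question but shortened, illustrative value lists:

```python
# Each key in the dict becomes a DataFrame column; pd.DataFrame raises
# "ValueError: All arrays must be of the same length" if the lists differ.
data = {'Years of Experience': [1.1, 1.3, 1.5, 2.0],
        'Salary': [39343., 46205., 37731.]}

# Map each column name to how many values it holds.
lengths = {name: len(values) for name, values in data.items()}
print(lengths)  # {'Years of Experience': 4, 'Salary': 3}

if len(set(lengths.values())) != 1:
    print("Column lengths differ - fix the data before calling pd.DataFrame")
```

Running this against the question's full dictionary would print 31 for Years of Experience and 30 for Salary, pointing straight at the column that needs the extra value.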
