Linear Regression of Filtered Data Set

After finally working out my data set and being able to graph it, I have been trying to use linear regression to fit the curve. I have tried a few methods, but none of them has given me any results. I think this is due to how my data has been filtered. Here is my code:

from matplotlib import pyplot as plt
import numpy as np
import pandas as pd  # needed for pd.read_csv below
from pandas import DataFrame
from sklearn.linear_model import LinearRegression
from matplotlib.pyplot import figure

figure(num=None, figsize=(100, 100), dpi=100, facecolor='w', edgecolor='k')

plt.rc('font', size=100)          # controls default text sizes
plt.rc('axes', titlesize=100)     # fontsize of the axes title
plt.rc('axes', labelsize=100)    # fontsize of the x and y labels
plt.rc('xtick', labelsize=30)    # fontsize of the tick labels
plt.rc('ytick', labelsize=60)    # fontsize of the tick labels
plt.rc('legend', fontsize=100)    # legend fontsize
plt.rc('figure', titlesize=100)

plt.xticks(rotation=90)


ds = pd.read_csv("https://covid.ourworldindata.org/data/owid-covid-data.csv")
df = DataFrame(ds, columns = ['date', 'location', 'new_deaths', 'total_deaths'])

df = df.replace(np.nan, 0)

US = df.loc[df['location'] == 'United States']


plt.plot_date(US['date'],US['new_deaths'], 'blue', label = 'US', linewidth = 5)
#plt.plot_date(US['date'],US['total_deaths'], 'red', label = 'US', linewidth = 5)

#linear_regressor = LinearRegression()  # create object for the class
#linear_regressor.fit(US['date'], US['new_deaths'])  # perform linear regression
#Y_pred = linear_regressor.predict(X)  # make predictions

#m , b = np.polyfit(x = US['date'], y = US['new_deaths'], deg = 1)






plt.title('New Deaths per Day In US')
plt.xlabel('Time')
plt.ylabel('New Deaths')
plt.legend()
plt.grid()
plt.show()


I know this question has been asked thousands of times, so if there's a post out there that I didn't come across, please link it to me. Thank you all! :D

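With sklearn's LinearRegression, the input must be a numeric 2-D array. The date column is read from the CSV as strings, so convert it to datetimes and then to numbers (here, the ordinal of each day) before fitting the regression:

dates = pd.to_datetime(US['date'])
X = dates.map(pd.Timestamp.toordinal).values.reshape(-1, 1)

regr = LinearRegression()
regr.fit(X, US['new_deaths'])

To plot it:

# plot the original points
plt.plot(dates, US['new_deaths'])

# plot the fitted line. To do so, first generate an input set containing
# only the min and max limits of the x range
trendline_dates = [dates.min(), dates.max()]
trendline_x = np.array([d.toordinal() for d in trendline_dates]).reshape(-1, 1)
# predict the y values of these two points
trendline_y = regr.predict(trendline_x)
# plot the trendline against the actual dates
plt.plot(trendline_dates, trendline_y)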
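
To check the fit without downloading the full CSV, here is a minimal sketch on made-up data. The `toy` frame and its values are assumptions for illustration only, not the real OWID numbers. It stores dates as strings, as the CSV does, and converts them to day ordinals before fitting:

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Made-up stand-in for the filtered US frame (hypothetical values):
# 10 consecutive days, with new_deaths rising by exactly 5 per day.
toy = pd.DataFrame({
    "date": pd.date_range("2020-03-01", periods=10).strftime("%Y-%m-%d"),
    "new_deaths": 100 + 5 * np.arange(10),
})

# The dates are strings, so parse them and convert to day ordinals.
dates = pd.to_datetime(toy["date"])
X = dates.map(pd.Timestamp.toordinal).to_numpy().reshape(-1, 1)

# Fit the line; coef_[0] is the change in new_deaths per day.
regr = LinearRegression().fit(X, toy["new_deaths"])
slope_per_day = regr.coef_[0]
print(slope_per_day)
```

Because the x values are day ordinals, the slope reads directly as "deaths per day", which makes the result easy to sanity-check.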

If you are only after the visual, Seaborn's lmplot is a handy and nice-looking alternative.
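
The np.polyfit call that is commented out in the question also needs numeric x values, so the date strings have to be converted first. A minimal sketch with made-up data follows. The values are hypothetical, and using days elapsed since the first date is just one reasonable choice that keeps the x values small:

```python
import numpy as np
import pandas as pd

# Made-up sample (hypothetical values), dates given as strings
dates = pd.to_datetime(
    pd.Series(["2020-03-01", "2020-03-02", "2020-03-03", "2020-03-04"])
)
new_deaths = np.array([10.0, 12.0, 14.0, 16.0])

# Convert dates to days elapsed since the first observation
days = (dates - dates.min()).dt.days.to_numpy()

# Degree-1 polynomial fit: m is the slope per day, b the value on day 0
m, b = np.polyfit(days, new_deaths, deg=1)

# Fitted values, which can be plotted against `dates`
trend = m * days + b
print(m, b)
```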
