
Negative confidence interval in linear regression despite all positive values

I am getting a negative confidence interval on a linear regression plot even though all data points are positive. Why is this happening? Will this negative confidence interval also affect my R^2 score?

Code used is:

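    import matplotlib.pyplot as plt
    import seaborn as sns

    # df_mx2 is the asker's DataFrame; regplot draws the fitted line with a shaded confidence band
    sns.regplot(x='Consumer Confidence Index_1', y='Sales (ALV sources)', data=df_mx2)
    plt.show()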

See graph pic here

One of the foundational assumptions of linear regression is that the residuals are normally distributed about the fitted line. In your case you have data on the right side and on the left side with a big gap in the middle, so you should double-check that a linear regression is appropriate for your analysis.

That being said, rest easy: the negative confidence interval will NOT affect your R² value. R² is computed only from the observed values and the fitted line, not from the confidence band.
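To make that concrete, here is a minimal sketch of how R² is computed for a simple linear fit. The data is made up for illustration (it is not the asker's), and the variable names are my own. Note that no confidence interval appears anywhere in the calculation.

```python
import numpy as np

# Hypothetical, strictly positive data (not the asker's dataset).
x = np.array([40.0, 41.0, 42.0, 43.0, 44.0, 50.0, 51.0, 52.0])
y = np.array([5.0, 12.0, 8.0, 20.0, 15.0, 60.0, 55.0, 70.0])

# Ordinary least-squares line: y_hat = slope * x + intercept
slope, intercept = np.polyfit(x, y, 1)
y_hat = slope * x + intercept

# R^2 = 1 - (residual sum of squares) / (total sum of squares)
ss_res = np.sum((y - y_hat) ** 2)
ss_tot = np.sum((y - y.mean()) ** 2)
r_squared = 1.0 - ss_res / ss_tot

print(f"R^2 = {r_squared:.3f}")
```

For a straight-line fit with an intercept, this value equals the squared Pearson correlation between x and y, which is another way to see that the width of the band plays no role.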

The negative confidence interval comes from the sparsity of data with x<42. If the three points on the right side were removed, the regression line would have a positive slope and would cross the x axis around x=42; extended to x=30 or so, it would be very negative. To reach the confidence level you have set, the interval therefore has to be very wide at low x so that it includes lines close to that steeper regression line.

In other words, the data provides very little predictive ability below x=42.
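The effect can be reproduced with a small bootstrap, which, as far as I know, is also how `sns.regplot` builds its band (by default a 95% interval from 1000 resamples). The sketch below is not seaborn's code and uses made-up positive data rather than the asker's; `grid`, `boot_fits`, and the other names are illustrative. It resamples the points with replacement, refits the line each time, and takes percentiles of the fitted values at each x.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Hypothetical, strictly positive data: a cluster on the left, three points on the right.
x = np.array([42.0, 43.0, 43.5, 44.0, 45.0, 45.5, 46.0, 47.0, 55.0, 56.0, 57.0])
y = np.array([2.0, 10.0, 6.0, 18.0, 25.0, 20.0, 33.0, 38.0, 45.0, 40.0, 50.0])

# Evaluate every refitted line on a grid that extends below the observed x range.
grid = np.linspace(30.0, 60.0, 61)
n_boot = 1000
boot_fits = np.empty((n_boot, grid.size))

for b in range(n_boot):
    # Resample (x, y) pairs with replacement and refit the line.
    idx = rng.integers(0, x.size, size=x.size)
    slope, intercept = np.polyfit(x[idx], y[idx], 1)
    boot_fits[b] = slope * grid + intercept

# 95% percentile band for the regression line at each grid point.
lower, upper = np.percentile(boot_fits, [2.5, 97.5], axis=0)
width = upper - lower

print("band at x=30:", lower[0], upper[0])
print("narrowest band width:", width.min(), "at x =", grid[width.argmin()])
```

The band is narrowest near the bulk of the data and fans out as x moves away from it. Because the band describes plausible lines rather than plausible data points, nothing stops its lower edge from dropping below zero where the line is being extrapolated, even when every observed y is positive.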

