
Sklearn Regression Fit error after matplotlib plotting

SOLVED

The issue appears to have been a conflict between Anaconda's installed packages and packages installed globally with pip (I also had a standalone Python 3.8 installed). I am not 100% sure what exactly the source of the conflict was; however, after uninstalling both Python 3.8 and Anaconda, then reinstalling Anaconda with all the packages I need, the ValueError no longer occurred. This leads me to believe that either sklearn or one of its dependencies was installed globally when I (accidentally) installed a package with pip outside Anaconda, and that this copy conflicted with the Anaconda version, leading to the ValueError.
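For anyone wanting to check for the same kind of mix-up before reinstalling everything: in the traceback further down, sklearn is loaded from ~\Anaconda3\lib\site-packages while scipy is loaded from ~\AppData\Roaming\Python\Python38\site-packages, which is the per-user site-packages directory on Windows. Below is a minimal sketch that lists where each package would be imported from and flags anything in the per-user directory. The helper names find_package_origins and is_in_user_site are my own, not part of any library, and I have not confirmed that this catches every possible conflict.

```python
import importlib.util
import os
import site


def find_package_origins(names):
    """Map each package name to the file it would be imported from (None if not found)."""
    origins = {}
    for name in names:
        # find_spec locates a top-level package without importing it
        spec = importlib.util.find_spec(name)
        origins[name] = spec.origin if spec is not None else None
    return origins


def is_in_user_site(path):
    """Return True if path lies under the per-user site-packages directory."""
    if path is None:
        return False
    user_site = os.path.normcase(os.path.abspath(site.getusersitepackages()))
    candidate = os.path.normcase(os.path.abspath(path))
    return candidate.startswith(user_site + os.sep)


if __name__ == "__main__":
    packages = ["numpy", "scipy", "sklearn", "matplotlib"]
    for name, origin in find_package_origins(packages).items():
        flag = "  <-- per-user site-packages" if is_in_user_site(origin) else ""
        print(f"{name:12s} {origin}{flag}")
```

Run it with the same interpreter the Jupyter kernel uses. If some packages resolve to the Anaconda directory and others to the per-user directory, that matches the situation described above.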


I was creating a regression plot in Python within JupyterLab, fitting data with sklearn and then plotting with matplotlib. The problem is that I get ValueError: illegal value in 4-th argument of internal None on every other run of the cell.

If I run the cell for the first time, everything works as expected; the second time I run it, it gives that error (full error below the code). Note that this only happens (at least on my end) when the data (the x and y arrays) has length 9 or more; with length 8 or less there is no error no matter how many times the cell is run.

Ideally I'd like to get this fixed, either by adding something to my code or by updating/downgrading a package. Below I list, in order, my code cell, the error message, the package versions (installed with Anaconda), and the solutions I tried that didn't work.

Note: The ValueError only occurs every other run (the first run works normally). If I rerun the cell, or run another cell that is essentially the same but with different input (x and y) right after, the ValueError occurs.

1)

import matplotlib.pyplot as plt
import numpy as np

from sklearn.linear_model import LinearRegression

data_length = 9  # ValueErrors occur at 9 or greater (8 or less doesn't produce any errors)

x = np.random.rand(data_length)
x_train = x[:, np.newaxis]

y = np.random.rand(data_length)

model = LinearRegression().fit(x_train, y)

plt.figure(figsize=(8, 6))
plt.title('Example Regression that Produces a ValueError Every Other Run')

plt.yticks(fontsize=14)
plt.xticks(fontsize=14)

plt.xlabel('M03A 28% SVR Activity (%)', fontsize=14)
plt.ylabel('Reference Activity (%)', fontsize=14)

ax = plt.gca()

color=next(ax._get_lines.prop_cycler)['color']

plt.plot(x, model.predict(x[:, np.newaxis]), label='Line of Best Fit', color=color)

plt.text(
    .05,
    .5,
    'y = {}x {} {}\n$R^2$ = {}'.format(
        round(model.coef_[0], 2), '-' if model.intercept_ < 0 else '+', abs(round(model.intercept_, 2)), round(model.score(x_train, y), 2)),
    bbox=dict(facecolor='white', edgecolor=color),
    color=color,
    transform=ax.transAxes,
)

color=next(ax._get_lines.prop_cycler)['color']

plt.plot(x, y, 'o', color=color, label='Data')

plt.xlim(0, np.max(x)*1.1)
plt.ylim(0, np.max(y)*1.1)

plt.legend()
plt.show()  # removing this makes no difference
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-40-c3d4bd1a69a4> in <module>
      6 y = np.random.rand(data_length)
      7 
----> 8 model = LinearRegression().fit(x_train, y)
      9 
     10 plt.figure(figsize=(8, 6))

~\Anaconda3\lib\site-packages\sklearn\linear_model\_base.py in fit(self, X, y, sample_weight)
    545         else:
    546             self.coef_, self._residues, self.rank_, self.singular_ = \
--> 547                 linalg.lstsq(X, y)
    548             self.coef_ = self.coef_.T
    549 

~\AppData\Roaming\Python\Python38\site-packages\scipy\linalg\basic.py in lstsq(a, b, cond, overwrite_a, overwrite_b, check_finite, lapack_driver)
   1223             raise LinAlgError("SVD did not converge in Linear Least Squares")
   1224         if info < 0:
-> 1225             raise ValueError('illegal value in %d-th argument of internal %s'
   1226                              % (-info, lapack_driver))
   1227         resids = np.asarray([], dtype=x.dtype)

ValueError: illegal value in 4-th argument of internal None
matplotlib -> 3.3.1
numpy -> 1.19.1
scikit-learn -> 0.23.2
scipy -> 1.5.0
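
To compare against the versions above, the same list can be printed from inside the running kernel rather than read from the package manager. This is only a sketch using the standard library's importlib.metadata (available from Python 3.8); installed_versions is a name I made up. As far as I understand, when two copies of a distribution are installed it reports the first one found on sys.path, so it reflects what the kernel sees rather than what conda lists.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_versions(distributions):
    """Map each distribution name to its installed version (None if not installed)."""
    versions = {}
    for dist in distributions:
        try:
            versions[dist] = metadata.version(dist)
        except metadata.PackageNotFoundError:
            versions[dist] = None
    return versions


if __name__ == "__main__":
    names = ["matplotlib", "numpy", "scikit-learn", "scipy"]
    for dist, version in installed_versions(names).items():
        print(f"{dist} -> {version}")
```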

On another Stack Overflow page, one of the comments mentioned the possibility of a broken scipy install; I tried force-reinstalling it with conda, with no effect.

Another comment (on the same page) mentioned plt.show(); however, including or excluding it from the code cell has no effect on the occurrence of the error.
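
One more check that may help narrow things down: the traceback shows the failure inside scipy.linalg.lstsq, so fitting the same line with numpy.linalg.lstsq, which does not go through scipy.linalg, can show whether the data itself is fine. This is a rough sketch, not a fix for the underlying problem; fit_line is a hypothetical helper name, and it assumes NumPy itself imports cleanly in the affected environment.

```python
import numpy as np


def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept using NumPy only."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Design matrix with a column of ones so the intercept is fitted too
    design = np.column_stack([x, np.ones_like(x)])
    (slope, intercept), _, _, _ = np.linalg.lstsq(design, y, rcond=None)
    return slope, intercept


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.random(9)  # length 9, where the ValueError started appearing
    y = rng.random(9)
    for run in range(4):  # repeat to mimic rerunning the cell
        print(run, fit_line(x, y))
```

If this runs cleanly on every repetition while LinearRegression().fit keeps failing on alternate runs, that points towards the scipy installation rather than the data.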

I have exactly the same experience (using PyCharm and no Anaconda): run the offending line again and it works!


LinAlgError                               Traceback (most recent call last)
in
----> 1 lm.fit(X_train,y_train)

~\pycharmprojects\jupyternotebooks\venv\lib\site-packages\sklearn\linear_model\_base.py in fit(self, X, y, sample_weight)
    545         else:
    546             self.coef_, self._residues, self.rank_, self.singular_ = \
--> 547                 linalg.lstsq(X, y)
    548             self.coef_ = self.coef_.T
    549 

~\pycharmprojects\jupyternotebooks\venv\lib\site-packages\scipy\linalg\basic.py in lstsq(a, b, cond, overwrite_a, overwrite_b, check_finite, lapack_driver)
   1219                                                cond, False, False)
   1220         if info > 0:
-> 1221             raise LinAlgError("SVD did not converge in Linear Least Squares")
   1222         if info < 0:
   1223             raise ValueError('illegal value in %d-th argument of internal %s'

LinAlgError: SVD did not converge in Linear Least Squares
