I am trying to train a logistic regression model for sentiment analysis. I get the following error both when trying to standardize the features and when trying to train the model.
I have posted the full traceback here:
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18368/1468496602.py in <module>
----> 1 model = logistic_regression.fit(features, target)
~\anaconda3\anacondadownload\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
1342 _dtype = [np.float64, np.float32]
1343
-> 1344 X, y = self._validate_data(X, y, accept_sparse='csr', dtype=_dtype,
1345 order="C",
1346 accept_large_sparse=solver != 'liblinear')
~\anaconda3\anacondadownload\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
431 y = check_array(y, **check_y_params)
432 else:
--> 433 X, y = check_X_y(X, y, **check_params)
434 out = X, y
435
~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
869 raise ValueError("y cannot be None")
870
--> 871 X = check_array(X, accept_sparse=accept_sparse,
872 accept_large_sparse=accept_large_sparse,
873 dtype=dtype, order=order, copy=copy,
~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0
~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
671 array = array.astype(dtype, casting="unsafe", copy=False)
672 else:
--> 673 array = np.asarray(array, order=order, dtype=dtype)
674 except ComplexWarning as complex_warning:
675 raise ValueError("Complex data not supported\n"
~\anaconda3\anacondadownload\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order, like)
100 return _asarray_with_like(a, dtype=dtype, order=order, like=like)
101
--> 102 return array(a, dtype, copy=False, order=order)
103
104
~\anaconda3\anacondadownload\lib\site-packages\pandas\core\series.py in __array__(self, dtype)
855 dtype='datetime64[ns]')
856 """
--> 857 return np.asarray(self._values, dtype)
858
859 # ----------------------------------------------------------------------
~\anaconda3\anacondadownload\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order, like)
100 return _asarray_with_like(a, dtype=dtype, order=order, like=like)
101
--> 102 return array(a, dtype, copy=False, order=order)
103
104
ValueError: could not convert string to float: 'clint eastwood return dirti harri calahan movi dirti harri seri clint older he still got harri told vacat troubl happen robberi memor make day catchphras come citi took vacat wors woman turn vigilant rape attack funfair start get punk one one last movi see sandra lock clint eastwood movi improv enforc bit comedi less seriou clint eastwood sunglass gargoyl best known sunglass worn arnold shwartzeneg termin worth watch like clint eastwood dirti harri film like action crime thriller'
I'm not sure how to fix this. Does this row need to be deleted from the data? I have already done some text preprocessing, such as removing stop words, lowercasing, and removing punctuation.
I have not converted any of the values to floats.
May I ask why the string is being converted to a float? You can refer to the documentation for the usage of float().
As far as I know, in sentiment analysis the sentences are turned into numeric sequences with something like word2vec rather than with float(). It would be nice if you could provide more information, such as how features and target are built.
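To illustrate the point about turning sentences into numbers: the traceback shows that fit() received the raw review text and tried to cast it to float64, which fails. Below is a minimal pure-Python sketch of a bag-of-words encoding, a simpler alternative to word2vec and not necessarily what the asker's pipeline should end up using. The sample reviews and the helper names build_vocabulary and vectorize are made up for illustration.

```python
from collections import Counter

# Hypothetical, already-preprocessed reviews (lowercased, stemmed), similar in
# shape to the string shown in the error message.
reviews = [
    "clint eastwood movi worth watch",
    "movi bore wast time",
    "worth watch great movi",
]

def build_vocabulary(docs):
    """Map each distinct token to a column index, in sorted token order."""
    tokens = sorted({tok for doc in docs for tok in doc.split()})
    return {tok: i for i, tok in enumerate(tokens)}

def vectorize(doc, vocab):
    """Return one float per vocabulary token: how often it occurs in doc."""
    counts = Counter(doc.split())
    return [float(counts[tok]) for tok in vocab]

vocab = build_vocabulary(reviews)

# Each review becomes a fixed-length row of floats; a matrix like this,
# not the raw strings, is what a model's fit() can accept.
features = [vectorize(doc, vocab) for doc in reviews]

for row in features:
    print(row)
```

In scikit-learn, CountVectorizer or TfidfVectorizer from sklearn.feature_extraction.text build a similar (sparse) matrix via fit_transform on the text column, and that matrix would be passed to logistic_regression.fit in place of the text. The row from the error message does not need to be deleted; it only needs to be encoded.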