
How can I fix a ValueError when training a model for sentiment analysis?

I am trying to train a logistic regression model for sentiment analysis. I get the following error both when trying to standardize the features and when trying to train the model:

Here is the full traceback:

ValueError                                Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_18368/1468496602.py in <module>
----> 1 model = logistic_regression.fit(features, target)

~\anaconda3\anacondadownload\lib\site-packages\sklearn\linear_model\_logistic.py in fit(self, X, y, sample_weight)
   1342             _dtype = [np.float64, np.float32]
   1343 
-> 1344         X, y = self._validate_data(X, y, accept_sparse='csr', dtype=_dtype,
   1345                                    order="C",
   1346                                    accept_large_sparse=solver != 'liblinear')

~\anaconda3\anacondadownload\lib\site-packages\sklearn\base.py in _validate_data(self, X, y, reset, validate_separately, **check_params)
    431                 y = check_array(y, **check_y_params)
    432             else:
--> 433                 X, y = check_X_y(X, y, **check_params)
    434             out = X, y
    435 

~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in check_X_y(X, y, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, multi_output, ensure_min_samples, ensure_min_features, y_numeric, estimator)
    869         raise ValueError("y cannot be None")
    870 
--> 871     X = check_array(X, accept_sparse=accept_sparse,
    872                     accept_large_sparse=accept_large_sparse,
    873                     dtype=dtype, order=order, copy=copy,

~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
     61             extra_args = len(args) - len(all_args)
     62             if extra_args <= 0:
---> 63                 return f(*args, **kwargs)
     64 
     65             # extra_args > 0

~\anaconda3\anacondadownload\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, estimator)
    671                     array = array.astype(dtype, casting="unsafe", copy=False)
    672                 else:
--> 673                     array = np.asarray(array, order=order, dtype=dtype)
    674             except ComplexWarning as complex_warning:
    675                 raise ValueError("Complex data not supported\n"

~\anaconda3\anacondadownload\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order, like)
    100         return _asarray_with_like(a, dtype=dtype, order=order, like=like)
    101 
--> 102     return array(a, dtype, copy=False, order=order)
    103 
    104 

~\anaconda3\anacondadownload\lib\site-packages\pandas\core\series.py in __array__(self, dtype)
    855               dtype='datetime64[ns]')
    856         """
--> 857         return np.asarray(self._values, dtype)
    858 
    859     # ----------------------------------------------------------------------

~\anaconda3\anacondadownload\lib\site-packages\numpy\core\_asarray.py in asarray(a, dtype, order, like)
    100         return _asarray_with_like(a, dtype=dtype, order=order, like=like)
    101 
--> 102     return array(a, dtype, copy=False, order=order)
    103 
    104 

ValueError: could not convert string to float: 'clint eastwood return dirti harri calahan movi dirti harri seri clint older he still got harri told vacat troubl happen robberi memor make day catchphras come citi took vacat wors woman turn vigilant rape attack funfair start get punk one one last movi see sandra lock clint eastwood movi improv enforc bit comedi less seriou clint eastwood sunglass gargoyl best known sunglass worn arnold shwartzeneg termin worth watch like clint eastwood dirti harri film like action crime thriller'


I'm not sure how to fix this. Does this row need to be deleted from the data? I have already done some text preprocessing, such as removing stop words, lowercasing, and removing punctuation.

I have not converted any of the values to floats.

May I ask why the string is being converted to a float? You can refer to the documentation for the usage of float().

As far as I know, in sentiment analysis people use something like word2vec to turn sentences into numeric sequences rather than calling float() on them. It would help if you could provide more information.
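To illustrate the point about turning text into numbers first, here is a minimal sketch. It swaps in scikit-learn's TfidfVectorizer instead of word2vec, simply because it lives in the same library as the LogisticRegression from the traceback. The `texts` and `labels` below are made-up toy data standing in for the question's `features` and `target`, which were not posted, so treat this as an assumption about their shape (one preprocessed string per review, one label per review) and not as the asker's actual setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical toy data: preprocessed review strings and binary sentiment labels.
texts = [
    "clint eastwood dirti harri worth watch",
    "best action crime thriller",
    "bore plot bad act",
    "wors movi ever wast time",
]
labels = [1, 1, 0, 0]

# The vectorizer converts each string into a numeric TF-IDF vector,
# so the classifier never receives raw strings.
model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# New text goes through the same vectorizer before prediction.
predictions = model.predict(["worth watch thriller", "bad bore movi"])

# Inspect what the classifier actually sees: a numeric matrix, one row per text.
numeric_features = model.named_steps["tfidfvectorizer"].transform(texts)
print(numeric_features.shape, numeric_features.dtype)
```

Wrapping both steps in a pipeline keeps the vectorizer fitted on the training text only and applies it consistently at prediction time. If the data is a DataFrame, the text column would be passed where `texts` is used here.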
