"ValueError: Lengths must match to compare" for chi2 from sklearn.feature_selection
I am trying to run the following code but get this error:
ValueError: Lengths must match to compare
from sklearn.feature_selection import chi2
import numpy as np

N = 2
for Product, category_id in sorted(category_to_id.items()):
    features_chi2 = chi2(features, labels == category_id)
    indices = np.argsort(features_chi2[0])
    feature_names = np.array(tfidf.get_feature_names())[indices]
    unigrams = [v for v in feature_names if len(v.split(' ')) == 1]
    bigrams = [v for v in feature_names if len(v.split(' ')) == 2]
    print("# '{}':".format(Product))
    print(" . Most correlated unigrams:\n . {}".format('\n . '.join(unigrams[-N:])))
    print(" . Most correlated bigrams:\n . {}".format('\n . '.join(bigrams[-N:])))
The code is from https://towardsdatascience.com/multi-class-text-classification-with-scikit-learn-12f1e60e0a9f
Output is:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-45-bbfd1a1f6a1a> in <module>()
3 N = 2
4 for Product, category_id in sorted(category_to_id.items()):
----> 5 features_chi2 = chi2(features, labels == category)
6 indices = np.argsort(features_chi2[0])
7 feature_names = np.array(tfidf.get_feature_names())[indices]
C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\ops.py in wrapper(self, other, axis)
1221 # as it will broadcast
1222 if other.ndim != 0 and len(self) != len(other):
-> 1223 raise ValueError('Lengths must match to compare')
1224
1225 res_values = na_op(self.values, np.asarray(other))
ValueError: Lengths must match to compare
len(features) and len(labels) print the same counts.
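Your traceback has labels == category on line 5, but the code you posted has labels == category_id. So this is probably the source of your error: the cell that actually ran compared labels against category, not category_id. The pandas check shown in the traceback (other.ndim != 0 and len(self) != len(other)) only raises when the right-hand side is non-scalar and its length differs from that of labels, so category is presumably some other list or array left over in your session. Changing it to category_id and re-running the cell should fix it.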
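To illustrate why a stray category variable could produce this exact message, here is a minimal sketch. The data is made up: labels stands in for the label column from the question, and category is a hypothetical leftover array of a different length. It only shows the pandas comparison behaviour and does not call chi2.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the `labels` column from the question.
labels = pd.Series([0, 1, 2, 1])

# Comparing against a scalar id broadcasts: one boolean per row.
category_id = 1
mask = labels == category_id
print(mask.tolist())  # [False, True, False, True]

# Hypothetical leftover variable: non-scalar, different length than `labels`.
category = np.array([0, 1, 2])

try:
    labels == category
    error_message = None
except ValueError as exc:
    # pandas refuses to compare a Series with an array of another length.
    error_message = str(exc)

print(error_message)
```

The scalar comparison is what chi2 needs as its second argument (a boolean target per row), which is why the loop in the question works once it uses category_id.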