Sci-Kit Learn: getting value from matrix generated by CountVectorizer.fit_transform() (PYTHON)
My code is this one:
from sklearn.feature_extraction.text import CountVectorizer
count_vect = CountVectorizer()
new_text = ["with with hello hello hello house"]
X_new_counts = count_vect.fit_transform(new_text)
i = count_vect.vocabulary_.get('hello')
print(X_new_counts.shape)
c = X_new_counts.getcol(0)
print(c)
The matrix generated by X_new_counts = count_vect.fit_transform(new_text) has this shape: (1, 3)
With i = count_vect.vocabulary_.get('hello'), I get the index of hello in the vocabulary.
My goal is to get, from this matrix, the count stored at that index. How can I do that? If I type:
value = X_new_counts.getcol(i)
it returns:
(0, 0) 3
where "3" is the correct value, but I don't want the (0, 0) part. So, how can I get only this value from the matrix?
X_new_counts is a (sparse) matrix, so you can get the value at row i, column j with:
X_new_counts[i, j]
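Applied to the code in the question, the row is the document index (0, since only one string was passed in) and the column is the vocabulary index of the token. The following is a minimal sketch of that lookup; the names row, col, value and dense_counts are chosen here for illustration and are not from the original post. For this input the value should be 3, matching the count shown in the question.

```python
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
new_text = ["with with hello hello hello house"]
X_new_counts = count_vect.fit_transform(new_text)

row = 0                                  # only one document was passed in
col = count_vect.vocabulary_["hello"]    # column index of the token "hello"

# Indexing the sparse matrix with [row, col] yields a plain scalar,
# not a sparse column with (row, col) coordinates attached.
value = X_new_counts[row, col]
print(int(value))

# Alternative for small matrices: densify once, then index like a NumPy array.
dense_counts = X_new_counts.toarray()[row]
print(int(dense_counts[col]))
```

The scalar lookup avoids converting the whole matrix, so it is usually the better choice when the vocabulary is large; toarray() is only convenient when the matrix is small enough to hold densely in memory.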