
Sci-Kit Learn: getting value from matrix generated by CountVectorizer.fit_transform() (Python)

This is my code:

from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
new_text = ["with with hello hello hello house"]
# Learn the vocabulary and build the document-term count matrix
X_new_counts = count_vect.fit_transform(new_text)

# Column index of 'hello' in the learned vocabulary
i = count_vect.vocabulary_.get('hello')
print(X_new_counts.shape)
c = X_new_counts.getcol(0)
print(c)

The matrix generated by X_new_counts = count_vect.fit_transform(new_text) has this shape: (1, 3)

With i = count_vect.vocabulary_.get('hello'), I get the index of hello in the vocabulary.

My goal is to get the count stored in this matrix at that index. How can I do that? If I type:

value = X_new_counts.getcol(i)

it returns:

(0, 0) 3

where "3" is the correct value, but I don't want the (0, 0) part. So, how can I get only this value from the matrix?

X_new_counts is a (sparse) matrix, so you can get the value at i, j with:

X_new_counts[i, j]
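Applied to the example in the question, the row is 0 (there is only one document) and the column is the vocabulary index of the word. The following is a minimal sketch under that assumption; the names col and hello_count are introduced here for illustration and are not part of the original post.

```python
from sklearn.feature_extraction.text import CountVectorizer

count_vect = CountVectorizer()
new_text = ["with with hello hello hello house"]
X_new_counts = count_vect.fit_transform(new_text)

# Column index of the word 'hello' in the learned vocabulary
# (col is an illustrative name, not from the original post).
col = count_vect.vocabulary_["hello"]

# Row 0 is the only document. Indexing the sparse matrix with two
# integers returns a single scalar instead of a sparse column.
hello_count = int(X_new_counts[0, col])
print(hello_count)
```

Unlike getcol(i), which returns another sparse matrix and therefore prints its coordinates along with the value, indexing with both a row and a column gives back just the number.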

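Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address. For any questions, contact: yoyou2525@163.com.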

 