Converting a generator expression (<genexpr>) to a list?

I'm doing some topic modeling and am looking to store some of the results of my analysis.

import pandas as pd, numpy as np, scipy
import sklearn.feature_extraction.text as text
from sklearn import decomposition

descs = ["You should not go there", "We may go home later", "Why should we do your chores", "What should we do"]

vectorizer = text.CountVectorizer()

dtm = vectorizer.fit_transform(descs).toarray()

# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out() on 1.0+
vocab = np.array(vectorizer.get_feature_names_out())

nmf = decomposition.NMF(3, random_state = 1)

topic = nmf.fit_transform(dtm)

topic_words = []

for t in nmf.components_:
    word_idx = np.argsort(t)[::-1][:20]
    topic_words.append(vocab[i] for i in word_idx)

for t in range(len(topic_words)):
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))

Prints:

Topic 0: do we should your why chores what you there not may later home go

Topic 1: should you there not go what do we your why may later home chores

Topic 2: we may later home go what do should your you why there not chores

I'm trying to write those topics to a file, so I thought storing them in a list might work, like this:

l = []
for t in range(len(topic_words)):
    l.append([word for word in topic_words[t]])
    print("Topic {}: {}\n".format(t, " ".join([word for word in topic_words[t]])))

But l just ends up as an empty array. How can I store these words in a list?

You're appending generator expressions to your list topic_words. A generator can be iterated only once, so after the first loop that prints the topics, every generator is already exhausted and yields nothing when you iterate it again. Append a list comprehension instead:

topic_words = []

for t in nmf.components_:
    word_idx = np.argsort(t)[::-1][:20]
    topic_words.append([vocab[i] for i in word_idx])
#                      ^                          ^

With this, you won't need a new list, because topic_words already holds lists of words. You can print them with:

for t, words in enumerate(topic_words, 1):
    print("Topic {}: {}\n".format(t, " ".join(words)))
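
As a minimal sketch of the difference, the following uses small stand-in values for vocab and word_idx (hypothetical data, not the output of the NMF model) and an arbitrary file name, topics.txt, in the system temp directory. It shows a generator expression yielding nothing on a second pass, and a list of words being written to a file, which was the original goal:

```python
import os
import tempfile

# Stand-in data (assumed for illustration); in the question these
# come from the vectorizer vocabulary and np.argsort on a topic row.
vocab = ["chores", "do", "go", "home"]
word_idx = [1, 0, 3]

# A generator expression can be consumed only once.
gen = (vocab[i] for i in word_idx)
first_pass = " ".join(gen)   # consumes every item
second_pass = list(gen)      # the generator is now exhausted

# A list comprehension stores the words, so they can be reused.
words = [vocab[i] for i in word_idx]
topic_words = [words]

# Write one topic per line; the path is an arbitrary choice.
path = os.path.join(tempfile.gettempdir(), "topics.txt")
with open(path, "w", encoding="utf-8") as f:
    for t, ws in enumerate(topic_words):
        f.write("Topic {}: {}\n".format(t, " ".join(ws)))

# Read the file back to confirm what was written.
with open(path, encoding="utf-8") as f:
    contents = f.read()

print(repr(first_pass), second_pass)
print(contents, end="")
```

If a generator has to be kept for some reason, wrapping it in list(...) before the first use has the same effect as the list comprehension.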
