Object of type 'generator' has no len()

Question

I have just started learning Python. I want to use NLTK to write a program that breaks a text into unigrams and bigrams. For example, if the input text is...

"I am feeling sad and disappointed due to errors"

...my function should produce text like this:

I am-->am feeling-->feeling sad-->sad and-->and disappointed-->disappointed due-->due to-->to errors

I have already written the code that reads the text into the program. This is the function I am trying:

def gen_bigrams(text):
    token = nltk.word_tokenize(review)
    bigrams = ngrams(token, 2)
    #print Counter(bigrams)
    bigram_list = ""
    for x in range(0, len(bigrams)):
        words = bigrams[x]
        bigram_list = bigram_list + words[0]+ " " + words[1]+"-->"
    return bigram_list

The error I get is...

for x in range(0, len(bigrams)):

TypeError: object of type 'generator' has no len()

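Since the ngrams function returns a generator, I tried len(list(bigrams)), but it returns 0, so I end up with the same error. I have looked at other questions on StackExchange, but I still haven't figured out how to solve this. I'm stuck on this error. Any workarounds or suggestions?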

Answer 1

Building a string by concatenating values separated by a delimiter is best done with str.join:

import nltk

def gen_bigrams(text):
    token = nltk.word_tokenize(text)
    bigrams = nltk.ngrams(token, 2)
    # instead of " ".join also "{} {}".format would work in the map
    return "-->".join(map(" ".join, bigrams))

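Note that there is no trailing "-->", so add it if you need one. This way you don't even have to think about the length of the iterable you are working with. In Python that is almost always the case. If you want to loop over an iterable, use for x in iterable:. If you do need the index, use enumerate: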

for i, x in enumerate(iterable):
    ...
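As a side note on the len(list(bigrams)) attempt from the question: a generator can only be consumed once, so one plausible reason for getting 0 is that something had already iterated over it. The sketch below illustrates this. To keep it runnable without NLTK, it uses a hypothetical pairs() helper and str.split() as stand-ins for nltk.ngrams(token, 2) and nltk.word_tokenize; those stand-ins are assumptions, not part of the original code.

```python
def pairs(tokens):
    """Hypothetical stand-in for nltk.ngrams(tokens, 2): lazily yields adjacent pairs."""
    for first, second in zip(tokens, tokens[1:]):
        yield first, second

tokens = "I am feeling sad".split()  # plain split() instead of nltk.word_tokenize

bigrams = pairs(tokens)
first_pass = list(bigrams)   # consumes the generator
second_pass = list(bigrams)  # the generator is exhausted, nothing left to yield

print(len(first_pass))
print(len(second_pass))

# enumerate supplies an index, so len() is never needed
numbered = ["{}: {}".format(i, " ".join(pair)) for i, pair in enumerate(pairs(tokens))]
print(numbered)
```

If a length really is needed, converting to a list once and reusing that list avoids the problem.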

Answer 2

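bigrams is a generator, and next(bigrams) (bigrams.next() in Python 2) is what gives you each tuple of tokens. You can call len() on the tuple it returns, but not on the generator itself. Below is more elaborate code that does what you are trying to achieve.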

>>> import nltk
>>> review = "i am feeling sad and disappointed due to errors"
>>> token = nltk.word_tokenize(review)
>>> bigrams = nltk.ngrams(token, 2)
>>> output = ""
>>> try:
...   while True:
...     temp = next(bigrams)  # bigrams.next() in Python 2
...     output += "%s %s-->" % (temp[0], temp[1])
... except StopIteration:
...   pass
... 
>>> output
'i am-->am feeling-->feeling sad-->sad and-->and disappointed-->disappointed due-->due to-->to errors-->'
>>>
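For comparison, the manual next() / StopIteration loop above can probably be written as a plain for loop, which handles StopIteration internally. This is only a sketch: pairs() and str.split() are hypothetical stand-ins for nltk.ngrams(token, 2) and nltk.word_tokenize so that it runs without NLTK installed.

```python
def pairs(tokens):
    """Hypothetical stand-in for nltk.ngrams(tokens, 2)."""
    return zip(tokens, tokens[1:])

review = "i am feeling sad and disappointed due to errors"
tokens = review.split()  # stand-in for nltk.word_tokenize(review)

output = ""
for first, second in pairs(tokens):  # the for loop stops when the iterator is exhausted
    output += "%s %s-->" % (first, second)

print(output)
```

Like the session above, this keeps the trailing "-->" after the last bigram.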

2 answers

Answer 1
5 accepted 2016-04-28 12:56:26

Answer 2
1 2016-04-28 12:06:51
