TypeError: unhashable type: 'list' for nltk.FreqDist even though I've converted my list into a tuple
I have the following code:
import nltk
grams = tuple(i for i in tri_grams)
print(type(grams))
bigram_fd = nltk.FreqDist(grams)
bigram_fd.most_common()
and I get the following error:
<class 'tuple'>
TypeError Traceback (most recent call last)
<ipython-input-200-4809d6a29102> in <module>
3 grams = tuple(i for i in tri_grams)
4 print(type(grams))
----> 5 bigram_fd = nltk.FreqDist(grams)
6 # bigram_fd = nltk.FreqDist(nltk.bigrams(ngrams))
7
c:\Users\Nauel\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\probability.py in __init__(self, samples)
100 :type samples: Sequence
101 """
--> 102 Counter.__init__(self, samples)
103
104 # Cached number of samples in this FreqDist
c:\Users\Nauel\AppData\Local\Programs\Python\Python36\lib\collections\__init__.py in __init__(*args, **kwds)
533 raise TypeError('expected at most 1 arguments, got %d' % len(args))
534 super(Counter, self).__init__()
--> 535 self.update(*args, **kwds)
536
537 def __missing__(self, key):
c:\Users\Nauel\AppData\Local\Programs\Python\Python36\lib\site-packages\nltk\probability.py in update(self, *args, **kwargs)
138 """
...
--> 622 _count_elements(self, iterable)
623 if kwds:
624 self.update(kwds)
TypeError: unhashable type: 'list'
So what is wrong with my code? I have converted my list into a tuple, but FreqDist still doesn't accept it. I hope I've been clear, thanks! :)
PS: my tri_grams looks like this:
[['potere_crescere', 'molto_vs', 'decentraland_mano', 'can_grow', 'lot_vs'], ['potere_crescere', 'molto_vs', 'decentraland_mano', 'can_grow', 'lot_vs'], ['certo', 'no', 'essere', 'sempre', 'gente', 'innocente', 'pagare', 'prezzo', 'storia', 'Balcani', 'essere', 'molto', 'complesso', 'essere', 'incrocio', 'interesse', 'misto', 'cultura', 'nazione', 'religione', 'gente', 'testardo', 'orgoglioso', 'difficile', 'gestire']]
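For reference, the error occurs because nltk.FreqDist subclasses collections.Counter, so every sample it counts must be hashable. tuple(i for i in tri_grams) only converts the outer container; each element i is still a list. A minimal sketch of the fix, using Counter directly so it runs without NLTK and with made-up data standing in for the real tri_grams:

```python
from collections import Counter

# Made-up stand-in for the real data: a list of lists of tokens.
tri_grams = [['can_grow', 'lot_vs'], ['can_grow', 'lot_vs'], ['certo']]

# Convert each INNER list to a tuple so the samples become hashable.
grams = [tuple(g) for g in tri_grams]

# Counter accepts this; nltk.FreqDist(grams) would behave the same way.
fd = Counter(grams)
print(fd.most_common(1))  # [(('can_grow', 'lot_vs'), 2)]
```

The same one-line change, `nltk.FreqDist(tuple(g) for g in tri_grams)`, applies to the original code.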
I completely changed my approach and it now works; it may be useful to anyone working with n-grams:

import gensim
from collections import Counter

# Train a phrase model on the tokenised texts and merge frequent bigrams
bigram = gensim.models.phrases.Phrases(df['stopwords'], min_count=1, threshold=10)
vector = bigram[df['stopwords']]
bi_grams = [t for t in vector]

# Collect every merged token (the ones gensim joined with "_")
frequencies = []
for doc in bi_grams:
    for token in doc:
        if '_' in token:
            frequencies.append(token)

# Count the merged bigrams directly instead of round-tripping through str()
counts = Counter(frequencies)
most_occur = counts.most_common(100)
most_occur

Then the output:

[('offerta_invece', 78),
 ('️_recensione', 60),
 ('prezzo_precedente', 51),
 ('prezzo_attuale', 50),
 ('stare_risparmiare', 49),
 ('prezzo_scontare', 36),
 ('️offertare_amazon️', 31),
 ('offerta_sconto', 30),
 ('risparmio_acquistare', 30),
 ('ora_storico', 30),
 ('solo_invece', 26),
 ('prezzo_ridurre', 23),
 (' re_attenzione', 22),
 ...
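The collect-and-count step can be condensed into a single comprehension. A minimal sketch, where docs is a hypothetical stand-in for the phrase-merged token lists:

```python
from collections import Counter

# docs is a made-up stand-in for the output of the gensim phrase model:
# documents whose merged n-grams are joined with "_".
docs = [
    ['prezzo_attuale', 'solo', 'offerta_sconto'],
    ['prezzo_attuale', 'risparmio'],
]

# Keep only the merged tokens and count them in one pass.
ngram_counts = Counter(tok for doc in docs for tok in doc if '_' in tok)
print(ngram_counts.most_common(2))  # [('prezzo_attuale', 2), ('offerta_sconto', 1)]
```

Counter consumes the generator lazily, so no intermediate list of tokens is ever built.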