I've looked at quite a few posts, but none seem to help.
I want to calculate Term Frequency-Inverse Document Frequency (TF-IDF), a Bag of Words technique used in Deep Learning. The purpose of this code is just to calculate the formula; I do not implement an ANN here.
Below is a minimal code example. The problem occurs after the for loop.
import math
docs = 1000
words_per_doc = 100 # length of doc
#word_freq = 10
#doc_freq = 100
dp = 4
print('Term Frequency Inverse Document Frequency')
# term, word_freq, doc_freq
words = [['the', 10, 100], ['python', 10, 900]]
tfidf_ = []
for idx, val in enumerate(words):
    print(words[idx][0] + ':')
    word_freq = words[idx][1]
    doc_freq = words[idx][2]
    tf = round(word_freq/words_per_doc, dp)
    idf = round(math.log10(docs/doc_freq), dp)
    tfidf = round((tf*idf), dp)
    print(str(tf) + ' * ' + str(idf) + ' = ' + str(tfidf))
    tfidf_.append(tfidf)
    print()
max_val = max(tfidf)
max_idx = tfidf.index(max_val)
#max_idx = tfidf.index(max(tfidf))
lowest_idx = 1 - max_idx
print('Therefore, \'' + words[max_idx][0] + '\' semantically is more important than \'' + words[lowest_idx][0] + '\'.')
#print('log(N/|{d∈D:w∈W}|)')
Error:
line 25, in <module>
max_val = max(tfidf)
TypeError: 'float' object is not iterable
You are passing tfidf to max() instead of tfidf_.
tfidf is a single float (the value from the last loop iteration), while tfidf_ is the list that holds all the values.
So the code should be:
max_val = max(tfidf_)
max_idx = tfidf_.index(max_val)
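For reference, here is a minimal sketch of the whole calculation with that fix applied. It uses the sample values from the question; the helper name tf_idf and the variable names scores and min_idx are my own for illustration and do not appear in the original code. It also looks up the lowest score with min() rather than 1 - max_idx, which only works when there are exactly two words.

```python
import math

def tf_idf(word_freq, words_per_doc, doc_freq, docs, dp=4):
    """Return the TF-IDF score, rounding each step to dp decimal places."""
    tf = round(word_freq / words_per_doc, dp)
    idf = round(math.log10(docs / doc_freq), dp)
    return round(tf * idf, dp)

docs = 1000
words_per_doc = 100  # length of doc
# term, word_freq, doc_freq (same sample values as in the question)
words = [['the', 10, 100], ['python', 10, 900]]

# Collect one score per term in a list, so max() and index() have something to iterate over
scores = [tf_idf(word_freq, words_per_doc, doc_freq, docs)
          for _, word_freq, doc_freq in words]

max_idx = scores.index(max(scores))
min_idx = scores.index(min(scores))  # hypothetical replacement for 1 - max_idx
print("'" + words[max_idx][0] + "' has a higher TF-IDF score than '" + words[min_idx][0] + "'.")
```

Passing the list keeps max() and index() working on the same object, so the index returned always refers to a row in words.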