I have a dict whose entries are structured like this:
docs[infile]={'tf':{}, 'idf':{},'words':[], 'tf_idf':{}}
I also have a list into which I want to copy some of the dict's items.
The sub-dicts tf_idf and idf each contain (word, number) pairs, e.g. {(word, number), (word, number), ...}
I need to store the items of both tf_idf and idf in the list. This code stores only one of the two sub-dicts:
templist = []
for key in docs:  # stores data in separate list
    TF_IDF_buffer = docs[key]['tf_idf'].items()
    templist.append(TF_IDF_buffer)
Is it possible to store both of them in the list?
This joins the two sequences of items, keeping duplicated keys:
templist = []
for key, value in docs.items():
    tf_idf = list(value['tf_idf'].items())
    idf = list(value['idf'].items())
    templist.append(tf_idf + idf)
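For illustration, here is a small self-contained run of the loop above. The file name `a.txt` and all the numbers are made-up sample data, not taken from the question:

```python
# Hypothetical sample data in the shape described in the question.
docs = {
    "a.txt": {
        "tf": {},
        "idf": {"cat": 0.5, "dog": 1.2},
        "words": ["cat", "dog"],
        "tf_idf": {"cat": 0.1, "dog": 0.3},
    }
}

templist = []
for key, value in docs.items():
    tf_idf = list(value["tf_idf"].items())
    idf = list(value["idf"].items())
    # Concatenate the two lists of (word, number) pairs for this document.
    templist.append(tf_idf + idf)

print(templist)
# [[('cat', 0.1), ('dog', 0.3), ('cat', 0.5), ('dog', 1.2)]]
```

Each word appears twice per document (once with its tf_idf value, once with its idf value), and nothing in the pair itself says which is which, so the order of concatenation matters.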
I think something like this should be what you are looking for:
templist = []
for key in docs:  # stores data in separate list
    idf = docs[key]['idf']
    tf_idf = docs[key]['tf_idf']
    for word in docs[key]['words']:
        # one (word, tf_idf value, idf value) tuple per word
        templist.append((word, tf_idf[word], idf[word]))
However, I also saw some of your other questions on this forum. I think your structure of nested lists and dicts is somewhat complicated. For instance, your list of words is duplicated by the keys in idf and tf_idf.
You may want to consider using a more object-oriented approach.
You could define a class like this:
class Document:
    def __init__(self, words, idf, tf_idf):
        self.words = words
        self.idf = idf
        self.tf_idf = tf_idf
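As a rough sketch of how such a class might be used, the example below repeats the class and adds a `rows` method. That method and the sample values are hypothetical additions for illustration; they assume every entry in `words` is also a key in both `idf` and `tf_idf`:

```python
class Document:
    def __init__(self, words, idf, tf_idf):
        self.words = words
        self.idf = idf
        self.tf_idf = tf_idf

    def rows(self):
        # Hypothetical helper: one (word, tf_idf value, idf value) tuple per word.
        return [(w, self.tf_idf[w], self.idf[w]) for w in self.words]


# Made-up sample values.
doc = Document(
    words=["cat", "dog"],
    idf={"cat": 0.5, "dog": 1.2},
    tf_idf={"cat": 0.1, "dog": 0.3},
)

templist = doc.rows()
print(templist)
# [('cat', 0.1, 0.5), ('dog', 0.3, 1.2)]
```

Keeping both numbers in one tuple per word avoids the ambiguity of two separate lists of pairs.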
Also, from my memory of using NLP, I remember that using collections.defaultdict can be quite useful (especially if your idf and tf_idf dictionaries are sparse).
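A minimal sketch of what that could look like, with made-up words and values. `defaultdict(int)` and `defaultdict(float)` return 0 and 0.0 for missing keys, so counts and sparse lookups need no explicit key checks:

```python
from collections import defaultdict

# Count term frequencies without checking whether a word was seen before.
tf = defaultdict(int)
for word in ["cat", "dog", "cat"]:
    tf[word] += 1

# Sparse scores: words that were never assigned a value read as 0.0.
idf = defaultdict(float)
idf["cat"] = 0.5
score = idf["bird"]  # 0.0; note that this lookup also inserts the key "bird"

print(dict(tf))  # {'cat': 2, 'dog': 1}
print(score)     # 0.0
```

One thing to watch for: reading a missing key with `[]` adds it to the dictionary, so use `idf.get("bird", 0.0)` or `"bird" in idf` if the dictionary should stay sparse.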