I have a list of strings as:
A = [
'philadelphia court excessive disappointed court hope hope',
'hope hope jurisdiction obscures acquittal court',
'mention hope maryland signal held mention problem internal reform life bolster level grievance'
]
and another list as:
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
I want to create a list of dictionaries holding the occurrence counts of the words in B for each string in A. Something like:
C = [
{'court':2,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
{'court':1,'hope':2,'mention':0,'life':0,'bolster':0,'internal':0,'level':0},
{'court':0,'hope':1,'mention':2,'life':1,'bolster':1,'internal':1,'level':1}
]
What I tried:
dic={}
for i in A:
    t=i.split()
    for j in B:
        dic[j]=t.count(j)
But it returns only the counts for the last string:
print (dic)
{'court': 0,
'hope': 1,
'mention': 2,
'life': 1,
'bolster': 1,
'internal': 1,
'level': 1}
Instead of creating a list of dicts as in your example output, you are creating only a single dict and overwriting its word counts each time you check a phrase. You could use re.findall to count the word occurrences in each phrase, which has the benefit of still counting words that are followed by punctuation, such as "hope?".
import re
words = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
phrases = ['philadelphia court excessive disappointed court hope hope','hope hope jurisdiction obscures acquittal court','mention hope maryland signal held mention problem internal reform life bolster level grievance']
counts = [{w: len(re.findall(r'\b{}\b'.format(w), p)) for w in words} for p in phrases]
print(counts)
# [{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
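To illustrate the punctuation point, here is a minimal sketch comparing the two counting approaches. The phrase is made up for the example, and re.escape is an extra precaution (not in the code above) in case a word contains regex metacharacters.

```python
import re

# Hypothetical phrase, chosen so every "hope" is followed by punctuation
phrase = "hope? we hope, and hope."
word = "hope"

# split() keeps punctuation attached to the word, so nothing equals "hope"
split_count = phrase.split().count(word)

# \b matches at word boundaries, so trailing punctuation is ignored
regex_count = len(re.findall(r'\b{}\b'.format(re.escape(word)), phrase))

print(split_count, regex_count)  # 0 3
```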
Two issues: you are initializing dic in the wrong place, and you are not collecting those dicts in a list. Here is the fix:
C = []
for i in A:
    dic = {}
    t=i.split()
    for j in B:
        dic[j]=t.count(j)
    C.append(dic)
# Result:
[{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
{'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0},
{'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]
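As a small sketch of why the placement of the dict matters: if a single dict is created once outside the loop and appended each time, every list entry refers to the same object, so all entries end up showing the last string's counts. The short inputs and the names shared, wrong, fresh and right below are made up for illustration.

```python
A = ['a b a', 'b b']  # hypothetical short inputs
B = ['a', 'b']

# Dict created once, outside the loop: every append adds the same object
shared = {}
wrong = []
for s in A:
    t = s.split()
    for w in B:
        shared[w] = t.count(w)
    wrong.append(shared)

# Dict created inside the loop: each string gets its own dict
right = []
for s in A:
    fresh = {}
    t = s.split()
    for w in B:
        fresh[w] = t.count(w)
    right.append(fresh)

print(wrong)  # [{'a': 0, 'b': 2}, {'a': 0, 'b': 2}]
print(right)  # [{'a': 2, 'b': 1}, {'a': 0, 'b': 2}]
```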
You always overwrite the existing values in the dict dic with dic[j]=t.count(j). You could create a new dict for every i and append it to a list, like:
dic=[]
for i in A:
    i_dict = {}
    t=i.split()
    for j in B:
        i_dict[j]=t.count(j)
    dic.append(i_dict)
print(dic)
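To avoid overwriting existing values, check whether the key is already in the dictionary and add to its count. Note that this accumulates totals across all strings in A rather than producing one dictionary per string. Try replacing dic[j]=t.count(j) with:
if j in dic:
    dic[j] += t.count(j)
else:
    dic[j] = t.count(j)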
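If cumulative totals across all strings are what you want, the same accumulation could be sketched with dict.get, which avoids the explicit membership check. The name totals is only illustrative.

```python
A = ['philadelphia court excessive disappointed court hope hope',
     'hope hope jurisdiction obscures acquittal court',
     'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']

totals = {}
for s in A:
    t = s.split()
    for w in B:
        # get() returns 0 the first time a word is seen
        totals[w] = totals.get(w, 0) + t.count(w)

print(totals)
# {'court': 3, 'hope': 5, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}
```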
Try this,
from collections import Counter
A = ['philadelphia court excessive disappointed court hope hope',
'hope hope jurisdiction obscures acquittal court',
'mention hope maryland signal held mention problem internal reform life bolster level grievance']
B = ['court', 'hope', 'mention', 'life', 'bolster', 'internal', 'level']
# Build one Counter per string; a Counter returns 0 for missing keys
result = [{b: counts[b] for b in B} for counts in (Counter(i.split()) for i in A)]
print(result)
output:
[{'court': 2, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 1, 'hope': 2, 'mention': 0, 'life': 0, 'bolster': 0, 'internal': 0, 'level': 0}, {'court': 0, 'hope': 1, 'mention': 2, 'life': 1, 'bolster': 1, 'internal': 1, 'level': 1}]