How to create a dictionary with the name being a word from another array in Python?
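Whenever a new word appears in a file, I want to create a dictionary for that word that stores the file name and the word's position in that file.
For example: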
file1="This is apple"
file2="This is mango"
The dictionaries would look like:
this={file1:0,file2:0}
is={file1:5,file2:5}
apple={file1:8}
mango={file2:8}
My code to retrieve the words:
files=['sample1.txt']
for filename in files:
    file = open(filename, 'r')
    dict={}
    for line in file:
        for word in line.split():
            word_name=word
            if((word_name not in dict.keys())):
                word={} # here the different dictionaries should be created
                dict[word_name]=0
            dict[word_name]+=1
Here, the `dict` dictionary stores each word and its number of occurrences.
Any suggestions?
If you need the structure {word: {file1: count1, file2: count2}}:
file1="This is apple"
file2="This is mango"
# you can read from a file incrementally and update the Counter
from collections import Counter
c1 = Counter(file1.split())
c2 = Counter(file2.split())
# do a dict comp
result = {i:{"file1": c1[i], "file2": c2[i]} for i in c1.keys() | c2.keys()}
# see if it worked
In[440]: result
Out[440]:
{'This': {'file1': 1, 'file2': 1},
'apple': {'file1': 1, 'file2': 0},
'is': {'file1': 1, 'file2': 1},
'mango': {'file1': 0, 'file2': 1}}
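The comment above about reading a file incrementally could look roughly like this. It is only a minimal sketch, assuming each source is an iterable of lines (an open file works the same way); the helper name `count_words_by_source` is hypothetical and not part of the answer, and `io.StringIO` merely stands in for real files.

```python
import io
from collections import Counter

def count_words_by_source(sources):
    """Map each word to {source_name: count}.

    `sources` maps a label to any iterable of lines (assumption: an
    open text file, or io.StringIO as used below).
    """
    counters = {}
    for name, lines in sources.items():
        counter = Counter()
        for line in lines:  # one line at a time, so the whole file is never held in memory
            counter.update(line.split())
        counters[name] = counter
    # Union of the words seen in any source
    all_words = set().union(*counters.values())
    # A Counter returns 0 for a missing key, so absent words get count 0
    return {word: {name: counters[name][word] for name in counters}
            for word in all_words}

sources = {
    "file1": io.StringIO("This is apple"),
    "file2": io.StringIO("This is mango"),
}
result = count_words_by_source(sources)
```

Because the labels come from the keys of `sources`, the same function should work for any number of files without changing the comprehension.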
If you want the structure {word: {file1: [pos1, pos2, ...], file2: [pos1, pos2, ...]}}:
import re
from collections import defaultdict
result = defaultdict(lambda: {"file1": [], "file2": []})
for name, f in zip(["file1", "file2"], [file1, file2]):
    ps = [match.start() for match in re.finditer(r"\b\S+\b", f)]
    for word, p in zip(f.split(), ps):
        result[word][name].append(p)
In [489]: dict(result)
Out[489]:
{'This': {'file1': [0], 'file2': [0]},
'apple': {'file1': [8], 'file2': []},
'is': {'file1': [5], 'file2': [5]},
'mango': {'file1': [], 'file2': [8]}}
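A possible variation, offered only as a sketch: matching `\S+` yields exactly the tokens that `str.split()` produces, so the word and its offset can be taken from the same match object instead of zipping two lists. The function name `word_positions` is hypothetical. Unlike the code above, a word that never appears in a file simply has no entry for that file rather than an empty list.

```python
import re
from collections import defaultdict

def word_positions(texts):
    """Map each word to {label: [start offsets]} for every text in `texts`."""
    positions = defaultdict(lambda: defaultdict(list))
    for name, text in texts.items():
        # Each match is one whitespace-delimited token; start() is its offset
        for match in re.finditer(r"\S+", text):
            positions[match.group()][name].append(match.start())
    # Convert the nested defaultdicts to plain dicts for cleaner printing
    return {word: dict(by_file) for word, by_file in positions.items()}

positions = word_positions({
    "file1": "This is apple",
    "file2": "This is mango",
})
```

Since the labels are taken from the keys of `texts`, nothing needs to be hard-coded in the default factory when more files are added.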
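I don't believe there is a way to create actual dictionaries named after each word for arbitrary input, but this code should produce the desired output (unordered, because these are dictionaries).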
import re

# sample1.txt is expected to contain lines such as: file1="This is apple"
files = "sample1.txt"
wordlist = []
filenum = {}
with open(files) as handle:
    for line in handle:
        line = line.rstrip()
        if not line.startswith("file"):
            continue
        sent = re.findall('"([^"]*)"', line)  # regexp to capture the text between quotation marks
        filenum[line[:line.find("=")]] = sent[0]  # store file labels (file1, file2) with the sentence as value
        words = sent[0].split(" ")  # collect the words in the sentence
        for word in words:
            if word not in wordlist:  # only add words not already added
                wordlist.append(word)
x = 0  # search offset, so that "is" is not found inside "This"
for word in wordlist:
    wordpos = dict()
    for k, v in filenum.items():
        if v.find(word) != -1:
            wordpos[k] = v.find(word, x)
    if (x + len(word) + 1) < len(v):  # v is the last sentence read
        x = x + len(word) + 1
    print(word + "=" + str(wordpos))  # Python 3 print, one line per word
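This should produce output like the following (the key order within each dictionary may differ depending on the Python version):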
This={'file2': 0, 'file1': 0}
is={'file2': 5, 'file1': 5}
apple={'file1': 8}
mango={'file2': 8}
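For comparison, here is a compact sketch of the same idea that records only the first position of each word per file. It works on whole whitespace-delimited tokens, so "is" inside "This" is never matched and no running offset is needed. The name `first_positions` and the in-memory sample strings are assumptions made for illustration; they do not come from the question or the answers.

```python
import re

def first_positions(texts):
    """Map each word to {label: offset of its first occurrence in that text}."""
    index = {}
    for name, text in texts.items():
        for match in re.finditer(r"\S+", text):
            # setdefault only stores a value if the key is missing,
            # so later occurrences of the same word are ignored
            index.setdefault(match.group(), {}).setdefault(name, match.start())
    return index

index = first_positions({
    "file1": "This is apple",
    "file2": "This is mango",
})
for word, by_file in index.items():
    print(f"{word}={by_file}")
```

A single dictionary keyed by word plays the role of the separately named dictionaries asked for in the question; `index["apple"]` gives the per-file positions for that word.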