Look up dictionary in Python
I have a file with multiple lines that looks like this (space-delimited):
A1BG P04217 VAR_018369 p.His52Arg Polymorphism rs893184 -
A1BG P04217 VAR_018370 p.His395Arg Polymorphism rs2241788 -
AAAS Q9NRG9 VAR_012804 p.Gln15Lys Disease - Achalasia
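How can I build a dictionary keyed on the ID in the second column that stores the number from the fourth column (the digits between the two amino-acid names)?
I tried this, but it gives me an "index out of range" error: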
lookup = defaultdict(list)
with open('humsavar.txt', 'r') as humsavarTxt:
    for line in csv.reader(humsavarTxt):
        code = re.match(r'[a-z](\d+)[a-z]', line[1], re.I)
        if code:
            lookup[line[-2]].append(code.group(1))
print(lookup['P04217'])
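A likely cause of the error, sketched on one sample row held in memory (the `sample` string and `io.StringIO` stand in for the real file and are assumptions for illustration): `csv.reader` splits on commas unless told otherwise, so each space-delimited line comes back as a single field and `line[1]` does not exist.

```python
import csv
import io

# One sample row from the question, held in memory instead of a file.
sample = "A1BG P04217 VAR_018369 p.His52Arg Polymorphism rs893184 -\n"

# Default delimiter is a comma, so the whole line is a single field.
default_row = next(csv.reader(io.StringIO(sample)))

# With a space delimiter the row splits into its seven columns.
space_row = next(csv.reader(io.StringIO(sample), delimiter=" "))

print(len(default_row))            # 1
print(space_row[1], space_row[3])  # P04217 p.His52Arg
```

With only one field per row, indexing `line[1]` raises `IndexError`, which matches the symptom described above.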
Here is a variation of the original code:
import csv, re
from collections import defaultdict

lookup = defaultdict(list)
with open('humsavar.txt', newline='') as humsavarTxt:
    # Split on spaces; skipinitialspace ignores extra spaces after a delimiter.
    reader = csv.reader(humsavarTxt, delimiter=" ", skipinitialspace=True)
    for line in reader:
        # The first run of digits in column 4, e.g. 52 in p.His52Arg.
        code = re.search(r'(\d+)', line[3])
        if code:
            lookup[line[1]].append(int(code.group(1)))
which produces:
>>> lookup
defaultdict(<class 'list'>, {'P04217': [52, 395], 'Q9NRG9': [15]})
>>> lookup['P04217']
[52, 395]
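To try the same approach without the file on disk, here is a minimal sketch that wraps it in a function and feeds it the three sample rows from the question. The name `build_lookup` and the short-row guard are additions for illustration, not part of the original answer; `csv.reader` accepts any iterable of strings, so a list works in place of a file object.

```python
import csv
import re
from collections import defaultdict

def build_lookup(lines):
    """Map each ID (column 2) to the numbers found in column 4."""
    lookup = defaultdict(list)
    reader = csv.reader(lines, delimiter=" ", skipinitialspace=True)
    for row in reader:
        if len(row) < 4:
            continue  # skip blank or short rows (assumed to be noise)
        match = re.search(r"\d+", row[3])
        if match:
            lookup[row[1]].append(int(match.group()))
    return lookup

sample = [
    "A1BG P04217 VAR_018369 p.His52Arg Polymorphism rs893184 -",
    "A1BG P04217 VAR_018370 p.His395Arg Polymorphism rs2241788 -",
    "AAAS Q9NRG9 VAR_012804 p.Gln15Lys Disease - Achalasia",
]
lookup = build_lookup(sample)
print(lookup["P04217"])  # [52, 395]
```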
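If the ID and the number are always in the second and fourth columns, and the file is always space-delimited, you don't need a regular expression to find the columns. You can simply split each line on whitespace. Note that this stores the whole fourth field (for example 'p.His52Arg') rather than just the number: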
from collections import defaultdict

lookup = defaultdict(list)
with open('humsavar.txt', 'r') as humsavarTxt:
    for line in humsavarTxt:
        fields = line.split()  # split once, on any run of whitespace
        lookup[fields[1]].append(fields[3])
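One detail worth checking when splitting by hand: `str.split(' ')` and `str.split()` behave differently when columns are padded with more than one space. The `padded` line below is a made-up variant of the sample data, used only to show the difference.

```python
# Hypothetical row with columns padded by several spaces.
padded = "A1BG  P04217   VAR_018369 p.His52Arg\n"

by_space = padded.split(" ")   # every single space is a separator
by_whitespace = padded.split() # runs of whitespace collapse, newline dropped

print(by_space)       # ['A1BG', '', 'P04217', '', '', 'VAR_018369', 'p.His52Arg\n']
print(by_whitespace)  # ['A1BG', 'P04217', 'VAR_018369', 'p.His52Arg']
```

With `split(' ')`, index 1 is an empty string here, so the column positions shift; `split()` keeps the ID at index 1 and the variant at index 3.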
If you want a plain dictionary, this works:
import re

d = {}
with open(your_file) as f:
    for line in f:
        l = line.split()
        # First run of digits in column 4, converted to int.
        num = int(re.search(r'(\d+)', l[3]).group(1))
        d.setdefault(l[1], []).append(num)
Prints:
{'P04217': [52, 395], 'Q9NRG9': [15]}
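One practical difference between the plain dictionary and `defaultdict` shows up when you look up an ID that is not in the file. A small sketch, where the key "XXXXXX" is a made-up ID:

```python
from collections import defaultdict

plain = {}
plain.setdefault("P04217", []).append(52)

auto = defaultdict(list)
auto["P04217"].append(52)

# Reading a missing key: .get() leaves the plain dict untouched,
# while indexing a defaultdict silently inserts an empty list.
missing_plain = plain.get("XXXXXX")
missing_auto = auto["XXXXXX"]

print(missing_plain, missing_auto)          # None []
print("XXXXXX" in plain, "XXXXXX" in auto)  # False True
```

If stray keys from lookups would be a problem later, the plain dictionary with `setdefault` avoids them.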
For a non-regex solution, you can also do the following:
d = {}
with open(your_file) as f:
    for line in f:
        els = line.split()
        # Keep only the digit characters of column 4 and join them.
        num = int(''.join(c for c in els[3] if c.isdigit()))
        d.setdefault(els[1], []).append(num)
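The regex and non-regex versions agree on the sample data, but they can differ if a field ever contains more than one number. The sketch below compares them; the helper names and the string "p.Gly12_Ala15del" are hypothetical and only illustrate the behavior.

```python
import re

def first_number(text):
    """Regex approach: the first run of digits only."""
    match = re.search(r"\d+", text)
    return int(match.group()) if match else None

def all_digits(text):
    """Non-regex approach: every digit character, concatenated."""
    digits = "".join(c for c in text if c.isdigit())
    return int(digits) if digits else None

print(first_number("p.His52Arg"), all_digits("p.His52Arg"))              # 52 52
print(first_number("p.Gly12_Ala15del"), all_digits("p.Gly12_Ala15del"))  # 12 1215
```

If every value in the fourth column has the single-number form shown in the question, either approach gives the same result.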