I have input like this:
tr|F2EF46|F2EF46_HORVD 210753
sp|K7W3E0|K7W3E0_MAIZE 21032
I need to write only the IDs between the | | delimiters to a separate file:
F2EF46
K7W3E0
This script finds the pattern, but how do I print only the IDs?
import re
o=open('result.txt','w')
with open('input.txt','rb') as f:
    for line in f:
        if re.findall(r'([a-z][a-z])(\|[a-z0-9]*.*)\|', line):
            line = line.strip()
            line = line.rstrip()
            line = re.sub('(\|[a-z0-9]*.*)\|', '', line)
            line = re.sub('\|', '', line)
            query_id = line
            f.write(query_id+'\n')
            o.write(line)
You don't need regular expressions here:
id = line.split('|')[1]
Although if you really want to use regexes then you could do:
id = re.search(r'(\|)(.*?)(\|)', line).group(2)
Just don't use id as a variable name: it is a built-in function, and assigning to it shadows the built-in.
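Putting both suggestions together, a minimal sketch of the whole task might look like the following. The helper names (id_by_split, id_by_regex, write_ids) are made up for illustration and are not part of the original script. The file names are taken from the question, and the sketch assumes Python 3 with text-mode files. Lines without two | characters are skipped instead of raising an error.

```python
import re

# Non-greedy match of whatever sits between the first two '|' characters.
ID_PATTERN = re.compile(r'\|(.*?)\|')


def id_by_split(line):
    """Return the field between the first two '|' characters, or None."""
    parts = line.split('|')
    # At least three parts means the line contains at least two '|'.
    return parts[1] if len(parts) >= 3 else None


def id_by_regex(line):
    """Same result as id_by_split, using a regular expression instead."""
    match = ID_PATTERN.search(line)
    return match.group(1) if match else None


def write_ids(in_path='input.txt', out_path='result.txt'):
    """Write one ID per line to out_path (file names assumed from the question)."""
    with open(in_path) as src, open(out_path, 'w') as dst:
        for line in src:
            query_id = id_by_split(line.strip())
            if query_id is not None:
                dst.write(query_id + '\n')
```

Calling write_ids() with the sample input should leave F2EF46 and K7W3E0 in result.txt, one per line. The variable is called query_id rather than id so the built-in stays usable.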