Python newbie here.
I am developing a simple sequence search program for DNA sequences. The main idea is to fetch from the NCBI database the sequence for a specific genome and a specific start-end range. So far, I am able to do a simple search for one genome and one specific position: `
import urllib
genome = "NC_009089.1"
start = "359055"
end = "359070"
link = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=%s&rettype=fasta&seq_start=%s&seq_stop=%s" % (genome, start, end)
f = urllib.urlopen(link)
myfile = f.read()
# Line 0 of the response is the FASTA header; line 1 holds the sequence
print(myfile.splitlines()[1])
`
And this is the output I'm getting (the sequence in that position):
AGTAAAACGGTTTCCT
Now, I would like to fetch several sequences from different genomes and with different start-end points in one run, returning all the sequences that were found. I've tried to import the data as a CSV with the genomes in the first column, the starts in the second and the ends in the third, and then loop over the open file, but since I'm not familiar with changing variables in URLs, I don't know how to proceed.
Sorry if this is a naive question. Any help would be appreciated.
If you already have all the parameters in a file, you can iterate over that data and make one request per row like this (I use lists, because you don't show the code that reads your data from the file):
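import urllib

url = 'https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi'

for genome, start, end in zip(genome_list, start_list, end_list):
    data = {
        'db': 'nuccore',
        'rettype': 'fasta',
        'id': genome,
        'seq_start': start,
        'seq_stop': end,
    }
    # urlencode() turns the dict into "db=nuccore&rettype=fasta&..."
    f = urllib.urlopen(url + '?' + urllib.urlencode(data))
    # Line 0 is the FASTA header; line 1 holds the sequence
    print(f.read().splitlines()[1])
By passing a dict with the query params to urllib.urlencode(), all the parameters are encoded properly (with = and &) before urlopen() is called. The keys have to be the names efetch expects (id, seq_start, seq_stop), the same ones as in your original URL.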
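For completeness, here is a minimal sketch of the whole flow with only the standard library, assuming Python 3 (where urlopen() lives in urllib.request and urlencode() in urllib.parse) and assuming the CSV has no header row and holds genome, start, end in its first three columns. The helper names build_url, read_regions and fetch_sequence are made up for this illustration, and so is the file name regions.csv.

```python
import csv
from urllib.parse import urlencode
from urllib.request import urlopen

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_url(genome, start, end):
    """Return the efetch URL for one genome and one start-end range."""
    params = {
        "db": "nuccore",
        "id": genome,
        "rettype": "fasta",
        "seq_start": start,
        "seq_stop": end,
    }
    # urlencode() joins the pairs with "=" and "&" and escapes special characters.
    return EFETCH_URL + "?" + urlencode(params)


def read_regions(lines):
    """Yield (genome, start, end) from header-less CSV lines with three columns."""
    for row in csv.reader(lines):
        if len(row) >= 3:  # skip blank or incomplete rows
            yield row[0].strip(), row[1].strip(), row[2].strip()


def fetch_sequence(genome, start, end):
    """Download one FASTA record and return the sequence without its header line."""
    with urlopen(build_url(genome, start, end)) as response:
        lines = response.read().decode("utf-8").splitlines()
    # A longer sequence spans several lines, so join everything after the header.
    return "".join(lines[1:])


# Example use (performs network requests; "regions.csv" is a hypothetical file):
# with open("regions.csv", newline="") as handle:
#     for genome, start, end in read_regions(handle):
#         print(genome, start, end, fetch_sequence(genome, start, end))
```

Keeping the URL building separate from the download makes it easy to print and check a URL in the browser before sending many requests.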
If urllib is a bit complicated, you can try the Python requests library, which is much nicer to work with in my experience (but it is a third-party lib, not built-in).
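A rough sketch of the same request with requests might look like this, assuming the package is installed (pip install requests) and Python 3. The function names efetch_params and fetch_sequence are hypothetical, chosen only for this example; requests.get() accepts the dict directly through its params argument and does the encoding itself.

```python
import requests

EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_params(genome, start, end):
    """Build the query parameters for one genome and one start-end range."""
    return {
        "db": "nuccore",
        "id": genome,
        "rettype": "fasta",
        "seq_start": start,
        "seq_stop": end,
    }


def fetch_sequence(genome, start, end, timeout=30):
    """Download one FASTA record and return the sequence without its header line."""
    response = requests.get(
        EFETCH_URL,
        params=efetch_params(genome, start, end),  # requests encodes the dict
        timeout=timeout,
    )
    response.raise_for_status()  # raise an exception on HTTP 4xx/5xx responses
    return "".join(response.text.splitlines()[1:])


# Example use (performs a network request):
# print(fetch_sequence("NC_009089.1", "359055", "359070"))
```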