the code:
import bs4
import sys
from bs4 import BeautifulSoup
import requests
url = requests.get("https://www.ncbi.nlm.nih.gov/nuccore/{FASTA}?report=fasta".format(FASTA="NM_213035.1"))
url.raise_for_status()
ncbi = bs4.BeautifulSoup(url.text, "html.parser")
filename = ncbi.title.text
with open(filename, 'w+') as f:
    for i in ncbi.select('p'):
        f.write(i.getText())
the output:
Warning: The NCBI web site requires JavaScript to function. more... Download features.Download gene features.NCBI Reference Sequence: NM_213035.1
GenBank Graphics
Whole sequence
Selected region
from:
to:
Show reverse complement
Show gap features Your browsing activity is empty.Activity recording is turned off. Turn recording back on
National Center for Biotechnology Information, US National Library of Medicine
8600 Rockville Pike, Bethesda MD, 20894 USA
You are not using the correct URL to fetch FASTA files via the REST API. As @Ghoti pointed out, the correct URLs are described here: https://www.ncbi.nlm.nih.gov/books/NBK25497/
For your specific problem this would be:
https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi?db=nuccore&id=NM_213035.1&rettype=fasta&retmode=text
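A minimal sketch of how that URL could be used from Python. It relies only on the standard library (`urllib`) so that it runs without extra packages; `requests.get` on the same URL should work equally well. The helper names `build_fasta_url` and `save_fasta`, and the default output file name `<accession>.fasta`, are assumptions made for illustration and are not part of the NCBI API.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base address of the E-utilities efetch endpoint (taken from the URL above).
EFETCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_fasta_url(accession, db="nuccore"):
    """Return the efetch URL that serves `accession` as plain-text FASTA."""
    params = {"db": db, "id": accession, "rettype": "fasta", "retmode": "text"}
    return EFETCH_URL + "?" + urlencode(params)


def save_fasta(accession, path=None):
    """Download the record and write it to `path`.

    The default file name (<accession>.fasta) is an illustrative choice.
    """
    path = path or accession + ".fasta"
    # urlopen raises urllib.error.HTTPError on 4xx/5xx responses.
    with urlopen(build_fasta_url(accession)) as response:
        fasta = response.read().decode("utf-8")
    with open(path, "w") as f:
        f.write(fasta)
    return path
```

Calling `save_fasta("NM_213035.1")` would then write the sequence to `NM_213035.1.fasta`. Because the endpoint returns plain text rather than an HTML page, no BeautifulSoup parsing is needed.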
If you are using Python, you could use Biotite for this task, a package I am developing: https://www.biotite-python.org/apidoc/biotite.database.entrez.fetch.html