简体   繁体   中英

Parsing XML object in python 3.9

I'm trying to get some data using the NCBI API. I am using requests to make the connection to the API.

What I'm stuck on is how do I convert the XML object that requests returns into something that I can parse?

Here's my code for the function so far:

def getNCBIid(speciesName):
    import requests
    
    base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    
    url = base_url + "esearch.fcgi?db=assembly&term=(%s[All Fields])&usehistory=y&api_key=f1e800ad255b055a691c7cf57a576fe4da08" % speciesName
    
    #xml object
    api_request = requests.get(url)

You would use something like BeautifulSoup for this ('this' being 'convert and parse the xml object').

What you are calling your xml object is still the response object, and you need to extract the content from that object first.

from bs4 import BeautifulSoup

def getNCBIid(speciesName):
    import requests
    
    base_url = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/"
    
    url = base_url + "esearch.fcgi?db=assembly&term=(%s[All Fields])&usehistory=y&api_key=f1e800ad255b055a691c7cf57a576fe4da08" % speciesName
    
    #xml object. <--- this is still just your response object
    api_request = requests.get(url)
     
    # grab the response content 
    xml_content = api_request.content
    
    # parse with beautiful soup        
    soup = BeautifulSoup(xml_content, 'xml')

    # from here you would access desired elements 
    # here are docs: https://www.crummy.com/software/BeautifulSoup/bs4/doc/

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM