
Why doesn't BeautifulSoup return the required items from this page?

I'm following the tutorials, but whatever I ask for, such as menuitem = page_soup.findAll("h5"), returns nothing. I know the elements exist because I can see them on the page, and I'm doing exactly what the tutorials do, yet the result is always empty. I'm trying to scrape a plant site and get the name of the plant in my language, which is displayed on the page. Example: https://identify.plantnet.org/observation/weurope/1007256673

I only need a single word from that page, but it seems impossible because the soup keeps reporting that elements don't exist when they do. Any help is appreciated.

The data is loaded dynamically from the site's API in JSON format, so it is not in the HTML that BeautifulSoup parses. You can use the requests module to load it from the API directly:

import json
import requests


url = 'https://identify.plantnet.org/observation/weurope/1007256673'

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
api_url = 'https://api.plantnet.org/v1/projects/weurope/observations/{plant_id}?lang=en'
plant_id = url.split('/')[-1]

data = requests.get(api_url.format(plant_id=plant_id), headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# print some data to screen:
print('{} - {}'.format(data['submittedName'], data['species']['commonNames'][0]))

Prints:

Solanum dulcamara L. - Bittersweet
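
Since the question asks for the name in the asker's own language, here is a minimal sketch of two small helpers built on the answer above. It assumes, without having verified it against the API, that the lang query parameter seen in the URL above (?lang=en) selects the language of the common names, and that the project name (weurope) can be taken from the page URL instead of being hardcoded. The names build_api_url and first_common_name are hypothetical and are not part of requests or of the site's API.

```python
from urllib.parse import urlparse

# Same endpoint shape as in the answer above; project and lang are made
# configurable here (assumption: the API honours other lang values).
API_TEMPLATE = 'https://api.plantnet.org/v1/projects/{project}/observations/{plant_id}?lang={lang}'


def build_api_url(page_url, lang='en'):
    """Turn an observation page URL into the matching API URL (hypothetical helper)."""
    parts = urlparse(page_url).path.strip('/').split('/')
    # Expected path layout: observation/<project>/<plant_id>
    if len(parts) != 3 or parts[0] != 'observation':
        raise ValueError('unexpected observation URL: {}'.format(page_url))
    _, project, plant_id = parts
    return API_TEMPLATE.format(project=project, plant_id=plant_id, lang=lang)


def first_common_name(data, default=None):
    """Return the first common name from the decoded JSON, or default if absent."""
    names = (data.get('species') or {}).get('commonNames') or []
    return names[0] if names else default
```

With these, the request in the answer becomes requests.get(build_api_url(url, lang='fr'), headers=headers).json(), and first_common_name(data) avoids a KeyError or IndexError when an observation has no common names listed.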

