Why doesn't BeautifulSoup return the required items from this page?

Question

I'm following the tutorials, but whatever I ask for, e.g. menuitem = page_soup.findAll("h5"), returns nothing. I know the elements exist because I can see them on the page, and I'm doing exactly what the tutorials do, yet the result is always empty. I'm trying to pull data from a plant site and find the name of the plant in my language, which is displayed and visible on the page. Example: https://identify.plantnet.org/observation/weurope/1007256673

I'm trying to get a single word from that page, and it seems impossible because the soup keeps saying things don't exist when they do. Help is appreciated.

Answer 1

The data is loaded dynamically from their API in JSON format, so it is not in the HTML that BeautifulSoup parses. You can use the requests module to load it directly from the API instead:

import json
import requests


url = 'https://identify.plantnet.org/observation/weurope/1007256673'

headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:77.0) Gecko/20100101 Firefox/77.0'}
api_url = 'https://api.plantnet.org/v1/projects/weurope/observations/{plant_id}?lang=en'
plant_id = url.split('/')[-1]

data = requests.get(api_url.format(plant_id=plant_id), headers=headers).json()

# uncomment this to print all data:
# print(json.dumps(data, indent=4))

# print some data to screen:
print('{} - {}'.format(data['submittedName'], data['species']['commonNames'][0]))

Prints:

Solanum dulcamara L. - Bittersweet
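
Note that the lookup data['species']['commonNames'][0] raises a KeyError or IndexError if an observation has no common name. Below is a minimal sketch of a more tolerant lookup. The describe_plant helper and the sample dictionary are hypothetical: the sample only mimics the two fields used above, and the real API response contains more fields and may be shaped differently.

```python
def describe_plant(data):
    """Return 'submitted name - first common name', tolerating missing fields."""
    submitted = data.get('submittedName', 'unknown')
    # 'species' or 'commonNames' may be absent or empty, so fall back safely.
    species = data.get('species') or {}
    common_names = species.get('commonNames') or []
    if common_names:
        return '{} - {}'.format(submitted, common_names[0])
    return submitted


# Hypothetical payload: only the keys used in the answer, with the values
# taken from the printed output above. Not a real API response.
sample = {
    'submittedName': 'Solanum dulcamara L.',
    'species': {'commonNames': ['Bittersweet']},
}

print(describe_plant(sample))
print(describe_plant({'submittedName': 'Solanum dulcamara L.'}))
```

With the real response, you would call describe_plant(data) in place of the final print in the snippet above.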


solution1
0 ACCEPTED 2020-06-23 21:04:31
