I am trying to access the elements in the Ingredients list of the following website: https://www.jamieoliver.com/recipes/pasta-recipes/gennaro-s-classic-spaghetti-carbonara/
<div class="col-md-12 ingredient-wrapper">
<ul class="ingred-list ">
<li>
3 large free-range egg yolks
</li>
<li>
40 g Parmesan cheese, plus extra to serve
</li>
<li>
1 x 150 g piece of higher-welfare pancetta
</li>
<li>
200g dried spaghetti
</li>
<li>
1 clove of garlic
</li>
<li>
extra virgin olive oil
</li>
</ul>
</div
I first tried just using requests and beautiful soup but my code didn't find the list elements. I then tried using Selenium and it still didn't work. My code is below:
from selenium import webdriver
from bs4 import BeautifulSoup
import pandas as pd
url = "https://www.jamieoliver.com/recipes/pasta-recipes/cracker-ravioli/"
driver = webdriver.Chrome()
driver.get(url)
html = driver.page_source
soup = BeautifulSoup(html, 'html.parser')
for ultag in soup.findAll('div', {'class': "col-md-12 ingredient-wrapper"}):
# for ultag in soup.findAll('ul', {'class': 'ingred_list '}):
for litag in ultag.findALL('li'):
print(litag.text)
To get the ingredients list, you can use this example:
import requests
from bs4 import BeautifulSoup
url = 'https://www.jamieoliver.com/recipes/pasta-recipes/gennaro-s-classic-spaghetti-carbonara/'
headers = {'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:80.0) Gecko/20100101 Firefox/80.0'}
soup = BeautifulSoup(requests.get(url, headers=headers).content, 'html.parser')
for li in soup.select('.ingred-list li'):
print(' '.join(li.text.split()))
Prints:
3 large free-range egg yolks
40 g Parmesan cheese , plus extra to serve
1 x 150 g piece of higher-welfare pancetta
200 g dried spaghetti
1 clove of garlic
extra virgin olive oil
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.