I am using Beautiful Soup to scrape some data from foodily.com.
On the page above there is a div with class 'ings', and I want to get the data within its p tags. For that I have written the code below:
ingredients = soup.find('div', {"class": "ings"}).findChildren('p')
It gives me the list of ingredients, but with the p tags still included.
Call get_text() for every p element found inside the div element with class="ings".
Complete working code:
from bs4 import BeautifulSoup
import requests
with requests.Session() as session:
    session.headers.update({"User-Agent": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/52.0.2743.82 Safari/537.36"})
    response = session.get("http://www.foodily.com/r/0y1ygzt3zf-perfect-vanilla-cupcakes-by-annie-s")

soup = BeautifulSoup(response.content, "html.parser")
ingredients = [ingredient.get_text() for ingredient in soup.select('div.ings p')]
print(ingredients)
Prints:
[
u'For the cupcakes:',
u'1 stick (113g) butter/marg*',
u'1 cup caster sugar', u'2 eggs',
...
u'1 tbsp vanilla extract',
u'2-3tbsp milk',
u'Sprinkles to decorate, optional'
]
Note that I've also improved your locator a bit and switched to a div.ings p CSS selector.
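To see what the div.ings p selector matches without hitting the network, here is a small sketch against a hypothetical HTML snippet (the markup is made up for illustration; the real page may be structured differently). It also shows that the selector finds the same elements as the find-based lookup from the question:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page, which may differ.
html = """
<div class="ings">
  <p>For the cupcakes:</p>
  <p>1 cup caster sugar</p>
  <p>2 eggs</p>
</div>
<p>Not an ingredient</p>
"""

soup = BeautifulSoup(html, "html.parser")

# CSS selector: every <p> that is a descendant of <div class="ings">.
via_select = [p.get_text() for p in soup.select("div.ings p")]

# The same lookup written in the style of the question's code.
via_find = [p.get_text() for p in soup.find("div", {"class": "ings"}).find_all("p")]

print(via_select)  # ['For the cupcakes:', '1 cup caster sugar', '2 eggs']
```

The p outside the div is not matched by either approach, so both lists contain only the three ingredient lines.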
Another way:
import requests
from bs4 import BeautifulSoup as bs
url = "http://www.foodily.com/r/0y1ygzt3zf-perfect-vanilla-cupcakes-by-annie-s"
source = requests.get(url)
text_new = source.text
soup = bs(text_new, "html.parser")
ingredients = soup.findAll('div', {"class": "ings"})
for a in ingredients:
    print(a.text)
It will print:
For the cupcakes:
1 stick (113g) butter/marg*
1 cup caster sugar
2 eggs
1 tbsp vanilla extract
1 and 1/2 cups plain flour
2 tsp baking powder
1/2 cup milk (I use Skim)
For the frosting:
2 sticks (226g) unsalted butter, at room temp
2 and 1/2 cups icing sugar, sifted
1 tbsp vanilla extract
2-3tbsp milk
Sprinkles to decorate, optional
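Note that this approach prints the text of the whole div as one string rather than one item per p tag. If you want a list of clean lines from the div itself, one option is get_text() with its separator and strip arguments. A minimal sketch, again using made-up markup rather than the real page:

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page.
html = """
<div class="ings">
  <p>For the frosting:</p>
  <p> 2-3tbsp milk </p>
  <p>Sprinkles to decorate, optional</p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")
div = soup.find("div", {"class": "ings"})

# strip=True trims each text fragment and drops whitespace-only ones;
# the separator then joins the remaining fragments, one per line.
text = div.get_text(separator="\n", strip=True)
lines = text.split("\n")
print(lines)  # ['For the frosting:', '2-3tbsp milk', 'Sprinkles to decorate, optional']
```

This works line-per-ingredient only as long as each p holds a single text fragment; if a p contained nested tags such as b or a, its text would be split across several lines.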
If you already have the list of p tags, use get_text(). This will return only their text:
ingredient_list = [p.get_text() for p in ingredients]
The resulting list will look like:
ingredient_list = [
'For the cupcakes:', '1 stick (113g) butter/marg*',
'1 cup caster sugar','2 eggs', ...
]
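One thing to watch for: soup.find() returns None when no matching div exists, so chaining a call onto it raises AttributeError on pages without an ingredient block. A possible guard is sketched below; extract_ingredients is a hypothetical helper name, not something from the answers above:

```python
from bs4 import BeautifulSoup


def extract_ingredients(html):
    """Return the text of each <p> inside <div class="ings">, or [] if absent."""
    soup = BeautifulSoup(html, "html.parser")
    div = soup.find("div", {"class": "ings"})
    if div is None:
        # No ingredient block on this page.
        return []
    # strip=True removes leading and trailing whitespace from each item.
    return [p.get_text(strip=True) for p in div.find_all("p")]


print(extract_ingredients('<div class="ings"><p> 2 eggs </p></div>'))  # ['2 eggs']
print(extract_ingredients("<p>no ingredient block here</p>"))          # []
```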