
Can't print text inside 'p' tag using BeautifulSoup

I have some pretty simple code:

import requests
from bs4 import BeautifulSoup

link = 'https://www.birdsnest.com.au/brands/boho-bird/73067-amore-wrap-dress'
page = requests.get(link)
soup = BeautifulSoup(page.content, 'html.parser')
page_new = soup.find('div', class_='model-info clearfix')
results = page_new.find_all('p')
for result in results:
    print(result.text)

Output

usually wears a size .
                She is wearing a size  in this style.
              
Her height is .

Show ’s body measurements

The problem seems to be that the model's name (and each of the other missing values) sits in a <span> nested inside a <strong> tag, like so:

<div class="model-info-header">
  <p>
    <strong><span class="model-info__name">Marnee</span></strong> usually wears a size
    <strong><span class="model-info__standard-size">8</span></strong>. She is wearing a size
    <strong><span class="model-info__wears-size">10</span></strong> in this style.
  </p>
  <p class="model-info-header__height">Her height is <strong><span class="model-info__height">178 cm</span></strong>.</p>
  <p>
    <span class="js-model-info-more model-info__link model-info-header__more">Show <span class="model-info__name">Marnee</span>'s body measurements</span>
  </p>
</div>

How can I get the text of the bold elements inside the <p> tags?

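The model's name and sizes are filled in dynamically by JavaScript after the page loads, so the HTML that requests downloads contains only the empty spans. Load the page in a real browser with Selenium and parse the rendered source instead: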

from bs4 import BeautifulSoup
from selenium import webdriver
import time

link = 'https://www.birdsnest.com.au/brands/boho-bird/73067-amore-wrap-dress'

# Open the page in a real browser so its JavaScript runs
driver = webdriver.Chrome()
driver.get(link)
time.sleep(3)  # crude pause to give the scripts time to fill in the values

soup = BeautifulSoup(driver.page_source, 'html.parser')
driver.quit()  # end the browser session and the chromedriver process
page_new = soup.find('div', class_='model-info clearfix')
results = page_new.find_all('p')
for result in results:
    print(result.text)

Output:

Marnee usually wears a size 8.
                She is wearing a size 10 in this style.
              
Her height is 178 cm.
Show Marnee’s body measurements
Marnee’s body measurements are:
Bust 81 cm
Waist 64 cm
Hips 89 cm
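
A possible refinement, offered as a sketch rather than a tested fix for this site: replace the fixed time.sleep(3) with an explicit wait, and keep the parsing in a separate helper so it can be checked against saved HTML. The names model_info_lines, fetch_rendered_html and SAMPLE are made up for illustration. The outer 'model-info clearfix' wrapper in SAMPLE is assumed from the selector used above, and the wait assumes the first .model-info__name span is visible once the scripts have run.

```python
from bs4 import BeautifulSoup

URL = "https://www.birdsnest.com.au/brands/boho-bird/73067-amore-wrap-dress"

# Rendered markup from the question, wrapped in the div the selector expects
# (the wrapper is an assumption; only the inner header div was shown).
SAMPLE = (
    '<div class="model-info clearfix"><div class="model-info-header">'
    ' <p> <strong><span class="model-info__name">Marnee</span></strong>'
    ' usually wears a size <strong><span class="model-info__standard-size">8'
    '</span></strong>. She is wearing a size <strong>'
    '<span class="model-info__wears-size">10</span></strong> in this style. </p>'
    ' <p class="model-info-header__height">Her height is <strong>'
    '<span class="model-info__height">178 cm</span></strong>.</p>'
    '</div></div>'
)


def model_info_lines(html):
    """Return the text of each <p> in the model-info block, one string per tag."""
    soup = BeautifulSoup(html, "html.parser")
    block = soup.find("div", class_="model-info")
    if block is None:
        return []
    # get_text() includes text from nested <strong>/<span> tags;
    # split/join collapses the template's line breaks and indentation.
    return [" ".join(p.get_text().split()) for p in block.find_all("p")]


def fetch_rendered_html(url, timeout=10):
    """Load the page in Chrome and return its source once the name is filled in."""
    # Imported here so model_info_lines can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait

    driver = webdriver.Chrome()
    try:
        driver.get(url)
        # Poll until the name span has non-empty text instead of sleeping blindly.
        WebDriverWait(driver, timeout).until(
            lambda d: d.find_element(By.CSS_SELECTOR, ".model-info__name").text.strip()
        )
        return driver.page_source
    finally:
        driver.quit()


if __name__ == "__main__":
    for line in model_info_lines(fetch_rendered_html(URL)):
        print(line)
```

Running model_info_lines(SAMPLE) on the saved markup returns the two complete sentences. This confirms that the nested <strong> and <span> tags were never the obstacle: the text was simply missing from the HTML that requests received.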
