I am trying to extract the text "Weight: 16.5 pounds" from the following HTML:
<div class="product__description__text">.........
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p><p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p><p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p><p dir="ltr"><span><strong>Height limit:</strong> 32 inches</span></p></div>
Here's what I've tried so far:
results = soup.find_all('div', attrs={'class':'product'})
Weight_L = []
for result in results:
    if result.find('p', attrs={'dir':'ltr'}) is not None:
        weight = result.span.text
        Weight_L.append(weight)
If you only need the weight, I would suggest checking whether the keyword "Weight:" appears in each p tag. Also, find returns only the first match, so if the first p tag is not the "Weight" one, you will never reach it; use find_all and loop over the results instead. Finally, since the class name is product__description__text, you should search for product__description__text rather than product.
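results = soup.find_all('div', attrs={'class':'product__description__text'})
Weight_L = []
for result in results:
    # find_all returns every matching p tag, not just the first one
    p_tags = result.find_all('p', attrs={'dir':'ltr'})
    for tag in p_tags:
        # "Weight:" does not match "Weight limit:", so only one tag is kept
        if "Weight:" in tag.text:
            weight = tag.text
            Weight_L.append(weight)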
If soup was built from the HTML you posted above, the result would be: ['Weight: 16.5 pounds']
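For reference, here is a minimal self-contained sketch of the approach above. It assumes the div contains only the four paragraphs shown in the question (the elided "........." part is left out), uses the built-in html.parser, and wraps the loop in a hypothetical helper named find_weights that is not part of the original answer.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question; the elided part of the div is omitted.
html = """
<div class="product__description__text">
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p>
<p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p>
<p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p>
<p dir="ltr"><span><strong>Height limit:</strong> 32 inches</span></p>
</div>
"""


def find_weights(soup):
    """Collect the text of every p tag whose text starts with "Weight:"."""
    weights = []
    for div in soup.find_all("div", class_="product__description__text"):
        for p in div.find_all("p", attrs={"dir": "ltr"}):
            text = p.get_text().strip()
            # startswith("Weight:") skips "Weight limit:" because of the colon
            if text.startswith("Weight:"):
                weights.append(text)
    return weights


soup = BeautifulSoup(html, "html.parser")
weights = find_weights(soup)
print(weights)
```

Checking for "Weight:" with the colon is what keeps the "Weight limit:" paragraph out of the result.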
The "Weight: 16.5 pounds" text is in the second p tag inside the parent with class product__description__text, so you can select that second p with the CSS selector p:nth-child(2):
results = soup.select(".product__description__text p:nth-child(2)")
Weight_L = []
for result in results:
    Weight_L.append(result.text)
    print(result.text)
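One caveat with the selector above: nth-child counts every sibling element, not only p tags, so it depends on the weight paragraph really being the second child of the div. The sketch below illustrates this with a hypothetical h2 heading placed before the paragraphs (an assumption for illustration, since the question elides part of the div); in that case p:nth-of-type(2) still picks the second p.

```python
from bs4 import BeautifulSoup

# Hypothetical markup: an extra <h2> before the paragraphs shifts their positions.
html = """
<div class="product__description__text">
<h2>Specs</h2>
<p dir="ltr"><span><strong>Dimensions:</strong> 39 x 17.3 x 32.2 inches</span></p>
<p dir="ltr"><span><strong>Weight:</strong> 16.5 pounds</span></p>
<p dir="ltr"><span><strong>Weight limit:</strong> 35 pounds</span></p>
</div>
"""

soup = BeautifulSoup(html, "html.parser")

# nth-child(2): a p that is the 2nd child element of its parent (here, Dimensions)
by_child = [p.get_text() for p in soup.select(".product__description__text p:nth-child(2)")]

# nth-of-type(2): the 2nd p among its siblings, regardless of other tags
by_type = [p.get_text() for p in soup.select(".product__description__text p:nth-of-type(2)")]

print(by_child)
print(by_type)
```

If the position of the paragraph can vary from page to page, matching on the "Weight:" label is likely the safer choice.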