The website has one <p> element, but it contains two text values, and I only want to scrape one of them. The HTML looks like this:
<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
Great Clips
<br><span class="text-blue-600 font-normal text-sm">Request Info</span>
</p>
In the HTML above, targeting <p> returns two text values ("Great Clips" and "Request Info"). I only want "Great Clips", not both. How would I do that with bs4?
You could use .contents, which is the list of the tag's direct children, and index into it to extract only the first child (here, the text node that comes before <br>):
soup.p.contents[0].strip()
from bs4 import BeautifulSoup
html = '''
<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
Great Clips
<br><span class="text-blue-600 font-normal text-sm">Request Info</span>
</p>
'''
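# Name the parser explicitly so the result does not depend on which parsers are installed
soup = BeautifulSoup(html, "html.parser")
# contents[0] is the text node before <br>; strip() removes the surrounding newlines
print(soup.p.contents[0].strip())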
Great Clips
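Indexing into .contents assumes the wanted text is always the very first child. If that might not hold on every page (for example, if a tag or a whitespace-only node comes first), a slightly more defensive variant is to walk the direct children and take the first non-empty text node. The sketch below is only one way to do that; first_direct_text is a hypothetical helper name, not part of bs4.

```python
from bs4 import BeautifulSoup, NavigableString

html = '''
<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
Great Clips
<br><span class="text-blue-600 font-normal text-sm">Request Info</span>
</p>
'''

def first_direct_text(tag):
    """Hypothetical helper: first non-empty text node that is a direct child of tag."""
    for child in tag.children:
        # Text nodes are NavigableString instances; nested tags such as <span> are skipped
        if isinstance(child, NavigableString) and child.strip():
            return child.strip()
    return None  # no direct text found

soup = BeautifulSoup(html, "html.parser")
name = first_direct_text(soup.p)
print(name)  # Great Clips
```

For this particular snippet, next(soup.p.stripped_strings) gives the same result, but note that stripped_strings also descends into nested tags, so it would return the <span> text if the <p> had no text of its own.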