
How to scrape just one text value from a <p> tag with bs4

The website has a single <p>, but it contains two text values and I only want to scrape one of them. The HTML is below:

<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
                        Great Clips

                                                    <br><span class="text-blue-600 font-normal text-sm">Request Info</span>
                                            </p>

If we target <p> in the HTML above, there are two text values ("Great Clips" and "Request Info"). I just want to scrape "Great Clips", not both. How would I do that with bs4?

You could use .contents with indexing to extract only the first child, which here is the text node that precedes the <br>:

soup.p.contents[0].strip()

Example

from bs4 import BeautifulSoup

html = '''
<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
                        Great Clips

                                                    <br><span class="text-blue-600 font-normal text-sm">Request Info</span>
                                            </p>
'''
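soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly to avoid the "no parser specified" warning

# contents[0] is the text node before <br>; strip() trims the surrounding whitespace
print(soup.p.contents[0].strip())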

Output

Great Clips
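
Indexing .contents[0] relies on the wanted text being the very first child of <p>. As a hedged alternative (a sketch, not part of the original answer), the same HTML can be handled either by taking the first non-empty string under <p>, or by keeping only the text nodes that are direct children of <p>. The variable names first_text and direct_text are made up for illustration, and html.parser is an assumed parser choice:

```python
from bs4 import BeautifulSoup, NavigableString

html = '''
<p class="text-base font-medium text-gray-700 w-1/2" xpath="1">
                        Great Clips

                                                    <br><span class="text-blue-600 font-normal text-sm">Request Info</span>
                                            </p>
'''

soup = BeautifulSoup(html, "html.parser")
p = soup.find("p")

# Option 1: first non-empty string anywhere under <p>, already stripped.
first_text = next(p.stripped_strings)

# Option 2: only text nodes that are direct children of <p>,
# which skips text inside nested tags such as the <span>.
direct_text = " ".join(
    child.strip()
    for child in p.children
    if isinstance(child, NavigableString) and child.strip()
)

print(first_text)
print(direct_text)
```

Option 2 does not depend on where the nested tags sit, so it should still work if the <span> came before the plain text. Option 1 is shorter but returns whichever string appears first.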


 