Python: extract text after </span> and before <br/>

Here is the HTML I want to parse:

<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>

And here is my Python code:

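from bs4 import BeautifulSoup

# html holds the markup shown above
record = []
soup = BeautifulSoup(html, 'html.parser')  # name the parser explicitly
spans = soup.find_all('span')
for span in spans:
    # span.text covers only the text inside the <span> tags
    record.append(span.text)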

What I finally got is:

Countries: Language:

The result is missing the important information, "USA" and "English". How can I get that text?

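Use the .next_sibling attribute. The text you want is not inside the <span>; it is a separate text node that follows it, so span.text never sees it: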

soup.find("span", text="Countries:").next_sibling
soup.find("span", text="Language:").next_sibling
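To collect every label and value in one pass instead of looking up each span by its text, something like the following may work. It is a minimal sketch that assumes each value is a plain text node sitting directly after its span, as in the sample markup; the `html` string and the `record` dictionary layout are illustrative choices, not part of the original answer. The sibling text keeps its surrounding spaces, so it is stripped before being stored.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question.
html = """
<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>
"""

soup = BeautifulSoup(html, "html.parser")

record = {}
for span in soup.find_all("span", class_="pl"):
    # Label text inside the span, without the trailing colon.
    label = span.get_text(strip=True).rstrip(":")
    # The node right after </span>; it can be None or a tag in other markup.
    value = span.next_sibling
    # NavigableString subclasses str, so this keeps text nodes only.
    if isinstance(value, str):
        record[label] = value.strip()

print(record)
```

The isinstance check is there because .next_sibling returns whatever node comes next; if a span were followed immediately by <br/>, the sibling would be a tag rather than text and is skipped here.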


 