Here is the HTML I am trying to parse:
<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>
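And here is my Python code:
from bs4 import BeautifulSoup

# html holds the markup shown above
record = []
soup = BeautifulSoup(html, "html.parser")
spans = soup.find_all('span')
for span in spans:
    # span.text only returns the text inside the <span> tag itself
    record.append(span.text)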
What I finally got is:
Countries: Language:
The result is missing some important information: "USA" and "English". How can I get that text?
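The values are not inside the <span> tags; they are text nodes that follow each span, so span.text never sees them. Use the .next_sibling attribute to reach them: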
soup.find("span", text="Countries:").next_sibling
soup.find("span", text="Language:").next_sibling
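To collect every label and value in one pass, the same idea can be applied inside a loop. The following is a minimal sketch, not the only way to do it: it assumes the markup looks exactly like the snippet in the question, that each value is a plain text node directly after its span, and the dictionary name record and the choice of "html.parser" are illustrative. Note that .next_sibling returns the text with its surrounding whitespace, so the sketch strips it.

```python
from bs4 import BeautifulSoup

# Sample markup copied from the question
html = """<span class="pl">Countries:</span> USA <br/>
<span class="pl">Language:</span> English <br/>"""

soup = BeautifulSoup(html, "html.parser")

record = {}
for span in soup.find_all("span", class_="pl"):
    # Label text inside the span, without the trailing colon
    label = span.get_text(strip=True).rstrip(":")
    # The value is the text node immediately after the span
    sibling = span.next_sibling
    # Guard against a missing sibling or a tag (assumption: values are plain text)
    value = sibling.strip() if isinstance(sibling, str) else ""
    record[label] = value

print(record)  # {'Countries': 'USA', 'Language': 'English'}
```

In recent versions of Beautiful Soup the text= argument to find() is also available under the name string=, so soup.find("span", string="Countries:") should behave the same way.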