Using Python, I would like to extract text by matching keywords.
Here is my Python script:
import requests
from bs4 import BeautifulSoup
import re
html = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""
r = requests.get(html)
soup = BeautifulSoup(r.content, "html.parser")
k = soup.find(text=re.compile("companies:")).parent.text
print (k)
Expected output:
Companies:
Telstra VI Huawei
Try this, using the simplified_scrapy library:
from simplified_scrapy import SimplifiedDoc
html = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""
doc = SimplifiedDoc(html)
pre = doc.getElementByReg('Companies:')
print(pre.text)
print('-' * 50)
print(pre.replaceReg(r'Countries:[\s\S]*', '').strip())  # raw string avoids an invalid-escape warning
Result:
Companies: Telstra VI Huawei Countries: JPN CHN MLY
--------------------------------------------------
Companies:
Telstra VI Huawei
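For comparison, here is a minimal sketch that needs no third-party library. Two things are worth noting about the script in the question: requests.get() expects a URL rather than an HTML string, and the pattern "companies:" is case-sensitive, so it would not match "Companies:" even if the string were parsed directly. The helper name extract_section and the rule that a heading is any line ending in a colon are my own assumptions for illustration, not part of BeautifulSoup or simplified_scrapy.

```python
html = """ <pre>
Companies:
Telstra VI Huawei
Countries:
JPN CHN MLY
</pre>
<pre>
Data center:
US UK
</pre>"""


def extract_section(text, keyword):
    """Return the heading line for `keyword` plus the lines that follow it.

    Assumption: a heading is a line ending in ':'; a section ends at the
    next heading, the next tag, or a blank line.
    """
    section = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if section:
            # Already inside the wanted section: stop at its boundary.
            if not line or line.endswith(":") or line.startswith("<"):
                break
            section.append(line)
        elif line.lower() == keyword.lower() + ":":
            # Case-insensitive match on the heading line starts the section.
            section.append(line)
    return "\n".join(section)


print(extract_section(html, "companies"))
```

The match is case-insensitive, so "companies" finds the "Companies:" heading, and a keyword that is not present yields an empty string. If the input is real-world HTML rather than this small snippet, it is probably safer to let a parser pull out the text of each pre block first and then apply the same line-by-line logic to that text.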