I want to extract: tamar tamar,0529589055
from this text, and I have to do that multiple times.
<h3 class="name">tamar tamar</h3>
<ul class="list-inline">
<li>gender:female</li>
<li>age:20</li>
<li class="phone" data="0529589055">phone: 0529589055</li>
<li class="email" data="tamar0529589055@gmail.com">email: tamar89055@gmail.com</li>
Have you thought about using a regex? For example, a simple (\w+ \w+)</h3>
will extract the name, at least for the example above. For the number, something like (0\d+)</li>
should work, off the top of my head.
An online regex site that I find easy to use: https://pythex.org
And the Python regex docs: https://docs.python.org/2/library/re.html
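A minimal sketch of how those two patterns could be applied repeatedly with Python's re module. The names record, page, and extract_pairs are made up for illustration, and the page is simulated by repeating the sample record twice. It assumes every record contains both a name and a phone number in the same order, since the matches are paired purely by position.

```python
import re

# Sample record from the question; the page is simulated by repeating it.
record = '''<h3 class="name">tamar tamar</h3>
<ul class="list-inline">
<li>gender:female</li>
<li>age:20</li>
<li class="phone" data="0529589055">phone: 0529589055</li>
<li class="email" data="tamar0529589055@gmail.com">email: tamar89055@gmail.com</li>
'''
page = record * 2

# Two words directly before </h3>, and a zero-led digit run directly before </li>.
NAME_RE = re.compile(r'(\w+ \w+)</h3>')
PHONE_RE = re.compile(r'(0\d+)</li>')

def extract_pairs(text):
    """Return 'name,phone' strings, pairing matches by their order in the text."""
    names = NAME_RE.findall(text)
    phones = PHONE_RE.findall(text)
    return [name + ',' + phone for name, phone in zip(names, phones)]

for pair in extract_pairs(page):
    print(pair)
```

If a record is missing its phone number, the positional pairing drifts, so this is only reasonable for uniform input.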
BeautifulSoup is what you are looking for:
from bs4 import BeautifulSoup
a='''<h3 class="name">tamar tamar</h3>
<ul class="list-inline">
<li>gender:female</li>
<li>age:20</li>
<li class="phone" data="0529589055">phone: 0529589055</li>
<li class="email" data="tamar0529589055@gmail.com">email: tamar89055@gmail.com</li>
'''
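soup = BeautifulSoup(a, 'html.parser')  # name the parser explicitly to avoid a warning
name = soup.find('h3', {"class": "name"}).text
phone = soup.find('li', {"class": 'phone'})['data']  # the data attribute holds the bare number
print(name + ',' + phone)  # tamar tamar,0529589055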
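Since the question mentions doing this multiple times, here is a rough sketch of looping over every record with BeautifulSoup. The helper name extract_pairs is hypothetical, and the page is again simulated by repeating the sample record. It assumes each name heading is followed by its own phone item.

```python
from bs4 import BeautifulSoup

# Sample record from the question; the page is simulated by repeating it.
record = '''<h3 class="name">tamar tamar</h3>
<ul class="list-inline">
<li>gender:female</li>
<li>age:20</li>
<li class="phone" data="0529589055">phone: 0529589055</li>
<li class="email" data="tamar0529589055@gmail.com">email: tamar89055@gmail.com</li>
'''
page = record * 2

def extract_pairs(html):
    """Return 'name,phone' strings, one per <h3 class="name"> heading."""
    soup = BeautifulSoup(html, 'html.parser')
    pairs = []
    for heading in soup.find_all('h3', class_='name'):
        # First phone item that appears after this heading in document order.
        phone_li = heading.find_next('li', class_='phone')
        if phone_li is None:
            continue  # no phone item anywhere after this heading
        pairs.append(heading.get_text(strip=True) + ',' + phone_li['data'])
    return pairs

for pair in extract_pairs(page):
    print(pair)
```

Note that find_next searches forward through the whole document, so a record with no phone item would pick up the next record's number. Real pages may need a tighter container to search within.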