
How can I extract the words starting with "icon" from HTML code using Python?

I need Python code to extract selected words from the following HTML.

<a class="tel ttel">
<span class="mobilesv icon-hg"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-ikj"></span>
<span class="mobilesv icon-dc"></span>
<span class="mobilesv icon-acb"></span>
<span class="mobilesv icon-lk"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-nm"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-yz"></span>
</a>

I need to extract the words that start with "icon".

The output I require is:

icon-hg, icon-rq, icon-ba, icon-rq, icon-ba, icon-ikj, icon-dc, icon-acb, icon-lk, icon-ba, icon-nm, icon-ba, icon-yz

For your specific case you can get the result as shown below. However, I recommend using Beautiful Soup for more general problems. Remember: "Special cases aren't special enough to break the rules."

text = """
<a class="tel ttel">
<span class="mobilesv icon-hg"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-ikj"></span>
<span class="mobilesv icon-dc"></span>
<span class="mobilesv icon-acb"></span>
<span class="mobilesv icon-lk"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-nm"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-yz"></span>
</a>
"""

result = [word.split('"')[0] for word in text.split() if word.startswith('icon')]

print(result)

Output:

['icon-hg', 'icon-rq', 'icon-ba', 'icon-rq', 'icon-ba', 'icon-ikj', 'icon-dc', 'icon-acb', 'icon-lk', 'icon-ba', 'icon-nm', 'icon-ba', 'icon-yz']
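The split-based one-liner depends on each class name being followed directly by a double quote. A slightly more tolerant variant, sketched here with the standard `re` module, pulls the tokens out with a single pattern. The names `ICON_CLASS` and `find_icon_words` are made up for illustration, and the sample input is a shortened copy of the HTML from the question. Note that this still scans raw text, so it would also pick up an "icon-" word that appears outside a class attribute.

```python
import re

# Shortened sample input based on the HTML in the question.
html = '''<a class="tel ttel">
<span class="mobilesv icon-hg"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
</a>'''

# "icon-" followed by letters, digits, underscores, or hyphens.
# The negative lookbehind stops it from matching inside a longer
# token such as "myicon-x".
ICON_CLASS = re.compile(r'(?<![\w-])icon-[\w-]+')

def find_icon_words(text):
    """Return every icon-* token in document order, duplicates included."""
    return ICON_CLASS.findall(text)

print(", ".join(find_icon_words(html)))
```

Joining with ", " gives the comma-separated form asked for in the question instead of a Python list.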

If you are using BeautifulSoup, this searches each span's markup for the text running from "icon-" up to the closing quote (").

from bs4 import BeautifulSoup
import re
s = """<a class="tel ttel">
<span class="mobilesv icon-hg"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-ikj"></span>
<span class="mobilesv icon-dc"></span>
<span class="mobilesv icon-acb"></span>
<span class="mobilesv icon-lk"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-nm"></span>
<span class="mobilesv icon-ba"></span>
<span class="mobilesv icon-yz"></span>
</a>"""
soup = BeautifulSoup(s, "html.parser")
for span in soup.find_all("span"):
    # Search the tag's markup for the icon-* token; skip spans that have none.
    match = re.search(r'icon-[^"]*', str(span))
    if match:
        print(match.group())

Result:

icon-hg
icon-rq
icon-ba
icon-rq
icon-ba
icon-ikj
icon-dc
icon-acb
icon-lk
icon-ba
icon-nm
icon-ba
icon-yz
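Both answers above search the markup as text. An alternative is to read the class attribute itself and filter its space-separated names, which avoids matching "icon" anywhere else in the page. The sketch below uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without extra installs. `IconClassCollector` and `extract_icon_classes` are hypothetical names introduced here, and the sample is a shortened copy of the question's HTML.

```python
from html.parser import HTMLParser

class IconClassCollector(HTMLParser):
    """Collect class names that start with a given prefix."""

    def __init__(self, prefix="icon"):
        super().__init__()
        self.prefix = prefix
        self.found = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the opening tag.
        for name, value in attrs:
            if name == "class" and value:
                # A class attribute may hold several space-separated names.
                self.found.extend(
                    c for c in value.split() if c.startswith(self.prefix)
                )

def extract_icon_classes(html, prefix="icon"):
    """Return matching class names in document order, duplicates included."""
    collector = IconClassCollector(prefix)
    collector.feed(html)
    collector.close()
    return collector.found

# Shortened sample input based on the HTML in the question.
sample = '''<a class="tel ttel">
<span class="mobilesv icon-hg"></span>
<span class="mobilesv icon-rq"></span>
<span class="mobilesv icon-ba"></span>
</a>'''

print(", ".join(extract_icon_classes(sample)))
```

Because only class attributes are inspected, text content such as `<p>icon-text</p>` is ignored.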
