
How to parse the line that has a specific word in it?

My Python code goes through an HTML document, and as it does, I need it to find a specific word and extract every line that contains that word.

For example, if the HTML document looks like this:

htmlDocument = '''
word 023-213103-2402131025901238923213

bla bla bla

bla bla bla 

word 2512-521-096-07464325

bla bla bla 

bla bla bla 

word 123123-0293231
'''

I need my desirableList to look like this after parsing:

desirableList = [
"word 023-213103-2402131025901238923213",
"word 2512-521-096-07464325",
"word 123123-0293231"
] 

Here's one way:

>>> desirableList = [s for s in htmlDocument.split("\n") if "word" in s]
>>> desirableList
['word 023-213103-2402131025901238923213', 'word 2512-521-096-07464325', 'word 123123-0293231']

Change the condition as needed to get other kinds of results, such as only the lines that start with the word:

[s for s in htmlDocument.split("\n") if s.startswith("word")]
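Note that `"word" in s` is a substring test, so it would also keep a line containing, say, "password". If only whole-word matches are wanted, one possible variation is a regular expression with word boundaries. This is a minimal sketch, not part of the original answer: the helper name `lines_with_word` and the `sample` text are made up for illustration.

```python
import re

def lines_with_word(text, word):
    """Return the stripped lines of `text` that contain `word` as a whole word."""
    # re.escape guards against regex metacharacters in `word`;
    # \b matches only at a word boundary, so "password" does not match "word".
    pattern = re.compile(r"\b" + re.escape(word) + r"\b")
    return [line.strip() for line in text.splitlines() if pattern.search(line)]

# Hypothetical sample input, shaped like the document in the question.
sample = """word 123123-0293231
bla bla bla
password 42
"""

print(lines_with_word(sample, "word"))
```

Using `splitlines()` instead of `split("\n")` also handles Windows-style `\r\n` line endings, and `strip()` removes stray leading or trailing whitespace from each kept line.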

