python/ beautifulsoup

I am a rookie in Python. I am trying to count certain words or expressions in HTML files. For example, I have a piece of HTML with the source code below:

<div style="line-height:120%;text-align:justify;text-indent:24px;font-size:10.5pt;">
<font style="font-family:inherit;font-size:10.5pt;font-style:italic;font-weight:bold;">2013 vs. 2012&#160;&#160;</font>
<font style="font-family:inherit;font-size:10.5pt;">During 2013, the Company recognized a decommissioning charge of $117 million and a restoration liability of $50 million, partially offset by the 2013 reversal of the $56&#160;million tax indemnification liability associated with the 2006 sale of the Company&#8217;s Canadian subsidiary.</font></div>

I want to count how many times "liability" shows up in this piece. Below is my code, which is not working:

import os
from bs4 import BeautifulSoup

lst=os.listdir("C:/html/")
for x in lst:
    print (x)
    html = open ("C:/html/"+x,'rb')
    bsobj = BeautifulSoup(html,"html.parser")
    metricslist = bsobj.findAll(div.string ='liability')
    print(len(metricslist)) 

I know bsobj.findAll(div.string ='liability') is very wrong, but I have no idea what the code should be. Any help will be appreciated!

You can apply a partial string match on an element's text when using find() or find_all():

soup.find(text=lambda text: text and "liability" in text)

Or, a regular expression pattern can be used in place of a function:

import re
soup.find(text=re.compile(r"\bliability\b"))
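
Note that find() returns only the first matching text node, and a single text node can contain the word more than once (in the sample from the question, both occurrences of "liability" sit inside the same <font> element). Below is a minimal sketch of one way to get an actual count, assuming whole-word, case-insensitive matching is what is wanted. count_word is a hypothetical helper name, not part of Beautiful Soup, and string= is the newer spelling of the text= argument in recent Beautiful Soup versions:

```python
import re
from bs4 import BeautifulSoup

# Sample markup adapted (and shortened) from the question.
html = """
<div>
<font>2013 vs. 2012&#160;&#160;</font>
<font>During 2013, the Company recognized a decommissioning charge of $117 million
and a restoration liability of $50 million, partially offset by the 2013 reversal
of the $56&#160;million tax indemnification liability associated with the 2006 sale
of the Company&#8217;s Canadian subsidiary.</font></div>
"""


def count_word(markup, word):
    """Count whole-word, case-insensitive occurrences of `word` in the text.

    Hypothetical helper for illustration: `markup` may be an HTML string
    or an open file handle, since BeautifulSoup accepts either.
    """
    soup = BeautifulSoup(markup, "html.parser")
    # Join all text nodes with a space so words in adjacent tags do not fuse.
    text = soup.get_text(" ")
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    return len(pattern.findall(text))


soup = BeautifulSoup(html, "html.parser")

# find_all() returns matching text nodes, so several hits inside one node
# are counted only once.
matching_nodes = soup.find_all(string=re.compile(r"\bliability\b"))
node_count = len(matching_nodes)

# Counting regex matches over the extracted text counts every occurrence.
word_count = count_word(html, "liability")

print(node_count, word_count)
```

For the loop in the question, the open file handle could be passed to count_word in place of the markup string, printing one count per file.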
