Parse HTML files in a directory and check whether they are badly formed in Python
I want to write a script that walks through a directory and checks whether the HTML files in it are badly formed. Please see my code:
import os

directory = "html"
for root, dirs, files in os.walk(directory):
    for file in files:
        if str(file).endswith('.html'):
            # Help needed here: the next line is pseudocode for the missing check
            if file is badly formed:
                print("Badly Formed")
            else:
                print("Well Formed")
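import xml.etree.ElementTree as ETree
....
try:
    # ElementTree is an XML parser, so only XHTML-style markup parses cleanly
    self.doc = ETree.parse(file)
    # do stuff with it ...
except ETree.ParseError as err:
    # Bind the exception instance; the message and position live on it, not on the class
    print("ERROR in {0} : {1}".format(file, err))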
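Putting the directory walk from the question together with the ElementTree check from the answer might look like the sketch below. The helper name `find_badly_formed` is hypothetical, and the sketch assumes the files are meant to be XHTML-style markup: because `xml.etree.ElementTree` is a strict XML parser, ordinary HTML with an unclosed `<br>` or an `&nbsp;` entity will also be reported as badly formed.

```python
import os
import xml.etree.ElementTree as ETree


def find_badly_formed(directory):
    """Return sorted (path, error message) pairs for .html files that fail XML parsing.

    Hypothetical helper: it treats "badly formed" as "not well-formed XML".
    """
    problems = []
    for root, dirs, files in os.walk(directory):
        for name in files:
            if name.endswith(".html"):
                # os.walk yields bare file names, so join them with the current root
                path = os.path.join(root, name)
                try:
                    ETree.parse(path)
                except ETree.ParseError as err:
                    problems.append((path, str(err)))
    return sorted(problems)


if __name__ == "__main__":
    for path, message in find_badly_formed("html"):
        print("Badly Formed: {0} ({1})".format(path, message))
```

Joining `root` and the file name matters here: passing the bare name to `ETree.parse` would only work for files in the current working directory.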