How to replace HTML codes in an HTML file using Python?
I'm trying to replace all HTML codes in my HTML file in a for loop (not sure if this is the easiest approach) without changing the formatting of the original file. When I run the code below, the codes are not replaced. Does anyone know what could be wrong?
import re
tex=open('ALICE.per-txt.txt', 'r')
tex=tex.read()
for i in tex:
    if i =='&otilde;':
        i=='õ'
    elif i == '&ccedil;':
        i=='ç'
with open('Alice1.replaced.txt', "w") as f:
    f.write(tex)
    f.close()
You can use html.unescape.
>>> import html
>>> html.unescape('&otilde;')
'õ'
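As a quick illustration (the sample strings here are made up, not taken from your file), html.unescape decodes named, decimal, and hexadecimal character references in a single pass, so you do not need one branch per code. It has been in the standard library since Python 3.4. One thing to keep in mind: it also decodes &amp;lt; and &amp;amp;, so on a full HTML document it can turn escaped markup into literal markup.

```python
import html

# Three spellings of the same character (õ is U+00F5, decimal 245).
named = html.unescape("&otilde;")
decimal = html.unescape("&#245;")
hexadecimal = html.unescape("&#xF5;")

# Every reference in a longer string is decoded in one call.
mixed = html.unescape("Tom &amp; Jerry &lt;3 cora&ccedil;&atilde;o")

print(named, decimal, hexadecimal)
print(mixed)
```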
With your code:
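import html

# Read the whole file and decode every HTML entity in one call.
with open('ALICE.per-txt.txt', 'r') as f:
    html_text = f.read()
html_text = html.unescape(html_text)
# Note: this writes the result back over the original file.
with open('ALICE.per-txt.txt', 'w') as f:
    f.write(html_text)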
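As for what went wrong in the original loop, here is a small sketch with a made-up sample string. Iterating over a string yields single characters, so a multi-character code such as &otilde; can never match. `i=='õ'` is a comparison rather than an assignment, and even an assignment to `i` would not change `tex`, because strings are immutable.

```python
import html

text = "cora&ccedil;&atilde;o e bot&otilde;es"  # hypothetical sample input

# A for loop over a string sees one character at a time,
# so it never encounters the whole entity.
matches = [ch for ch in text if ch == "&otilde;"]

# str.replace returns a new string; the result has to be kept.
# Only the codes listed here are handled, so &atilde; is left alone.
manual = text.replace("&otilde;", "õ").replace("&ccedil;", "ç")

# html.unescape covers every entity without listing them.
decoded = html.unescape(text)

print(matches)
print(manual)
print(decoded)
```

The manual version works only for the codes you remember to list, which is why html.unescape is the simpler choice here.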
Please note that I opened the files with a with statement. This takes care of closing the file after the with block, something you forgot to do when reading the file.
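If you want to see this for yourself, the sketch below checks the file object's closed attribute inside and after the block. The temporary path and the sample text are only placeholders for the demonstration. It also means the explicit f.close() inside your with block is redundant.

```python
import os
import tempfile

# Placeholder file in a temporary directory, used only for this demo.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

with open(path, "w", encoding="utf-8") as f:
    f.write("bot&otilde;es")
    inside = f.closed   # still open while inside the block

after = f.closed        # closed automatically when the block ends

print(inside, after)
```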