Reading an HTML File from a Folder in Python
I want to read an HTML file in Python 3.4.3.
I have tried:
import urllib.request
fname = r"C:\Python34\html.htm"
HtmlFile = open(fname,'w')
print (HtmlFile)
This prints:
<_io.TextIOWrapper name='C:\\Python34\\html.htm' mode='w' encoding='cp1252'>
I want to get the HTML source so that I can parse it with Beautiful Soup.
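You will have to read the contents of the file. Printing the object returned by open() shows only the file wrapper, not the text, and mode 'w' opens the file for writing and truncates it. Open the file for reading and call read():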
HtmlFile = open(fname, 'r', encoding='utf-8')
source_code = HtmlFile.read()
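A minimal sketch of the same idea using a `with` block, so the file is closed automatically. The helper name `read_html` and the temporary demo file are assumptions for illustration, not part of the original question. The Beautiful Soup call appears only as a comment and assumes the `bs4` package is installed.

```python
import os
import tempfile


def read_html(path, encoding="utf-8"):
    """Return the full text of the HTML file at `path` (hypothetical helper)."""
    # 'r' opens for reading; 'w' would truncate the file instead.
    with open(path, "r", encoding=encoding) as html_file:
        return html_file.read()


# Demo with a throwaway file (made-up content, for illustration only).
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "html.htm")
with open(demo_path, "w", encoding="utf-8") as f:
    f.write("<html><body><p>Hello</p></body></html>")

source_code = read_html(demo_path)
print(source_code)

# With Beautiful Soup installed, the string can then be parsed, e.g.:
# from bs4 import BeautifulSoup
# soup = BeautifulSoup(source_code, "html.parser")
```

Passing the encoding explicitly avoids depending on the platform default, which was `cp1252` in the output shown in the question.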
I was trying to read a saved HTML file from a folder. I tried the code mentioned by Vikasa but got an error, so I changed the code and tried again, and it worked for me. The code is as follows:
fname = 'page_source.html'  # this HTML file is stored in the same folder as the code file
html_file = open(fname, 'r')
source_code = html_file.read()
Print the HTML page using:
print(source_code)
This prints the content read from the page_source.html file.
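The answer above does not say which error occurred. If it was a `UnicodeDecodeError` (an assumption), one possible approach is to try UTF-8 first and fall back to another encoding. The helper name `read_html_with_fallback` is hypothetical, and `cp1252` is chosen as the fallback only because it is the default encoding shown in the question's output.

```python
import os
import tempfile


def read_html_with_fallback(path, encodings=("utf-8", "cp1252")):
    """Try each encoding in turn and return the first successful decode."""
    for encoding in encodings:
        try:
            with open(path, "r", encoding=encoding) as html_file:
                return html_file.read()
        except UnicodeDecodeError:
            continue  # this encoding did not fit; try the next one
    # Last resort: decode as UTF-8, replacing undecodable bytes.
    with open(path, "r", encoding="utf-8", errors="replace") as html_file:
        return html_file.read()


# Demo: write bytes that are valid cp1252 but not valid UTF-8.
demo_path = os.path.join(tempfile.mkdtemp(), "page_source.html")
with open(demo_path, "wb") as f:
    f.write("<p>caf\xe9</p>".encode("cp1252"))

source_code = read_html_with_fallback(demo_path)
# ascii() keeps the output safe on consoles that cannot display non-ASCII text.
print(ascii(source_code))
```

If the encoding of the saved page is known, passing it directly to `open()` is simpler than guessing.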