Save HTML content into a txt file using Python 3
I'm tired of searching and trying code that keeps giving the same errors, and I really hope someone can help me figure this out. My problem is simple: I'm trying to save HTML code to a txt file using Python. Here's the code I'm using:
from urllib.request import urlopen as uReq
url1 = 'http://www.marmiton.org/recettes/menu-de-la-semaine.aspx'
page = uReq(url1).read().decode()
f = open("test.html", "w")
f.write(page)
f.close()
But it gives me the following error:
UnicodeEncodeError: 'charmap' codec can't encode character '\u2665' in position 416224: character maps to <undefined>
Here is the updated solution:
Python 2.x:
import urllib
url1 = 'http://www.marmiton.org/recettes/menu-de-la-semaine.aspx'
page = urllib.urlopen(url1).read()
f = open("./test1.html", "w")
f.write(page)
f.close()
Python 3.x:
import urllib.request
import shutil
url1 = 'http://www.marmiton.org/recettes/menu-de-la-semaine.aspx'
page = urllib.request.urlopen(url1)
print(page)
f = open("./test2.html", "wb")
shutil.copyfileobj(page, f)
f.close()
You need to use urllib to help you achieve this task.
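If you would rather keep the decoded text as a str instead of copying raw bytes, another option is to name the encoding when opening the output file. The snippet below is only a minimal sketch: save_text, sample_html and out_path are made-up names for illustration, and the sample string stands in for the downloaded page so the example runs without network access. It assumes the 'charmap' error comes from the platform's default file encoding (for example cp1252 on Windows), which cannot represent '\u2665'.

```python
import os
import tempfile


def save_text(path, text):
    """Write text to path as UTF-8 (hypothetical helper, not from the question)."""
    # Naming the encoding avoids falling back to the platform default codec.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)


# Stand-in for the decoded page; it contains U+2665, the character in the error.
sample_html = "<p>menu de la semaine \u2665</p>"
out_path = os.path.join(tempfile.gettempdir(), "test_utf8.html")
save_text(out_path, sample_html)
```

With the real page you would pass the decoded string from urlopen(url1).read().decode() instead of sample_html.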
You should try with requests and bs4 (BeautifulSoup):
from bs4 import BeautifulSoup
import requests
r = requests.get("https://stackoverflow.com/questions/47503845/save-html-content-into-a-txt-file-using-python")
data = r.text
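soup = BeautifulSoup(data, "html.parser")  # name the parser explicitly
print(soup)
# Write as UTF-8 so non-ASCII characters do not depend on the platform default.
with open('/tmp/test.html', 'a', encoding='utf-8') as f:
    f.write(str(soup))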
You mention that leaving out the .decode() call gives you a TypeError. Have you tried taking the HTML content and passing it to the write() method as a string? You could enclose the HTML content in triple quotes so that it is passed as a multiline string.
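For what it's worth, the TypeError is most likely because read() returns bytes while a file opened with "w" only accepts str. That is an assumption, since the traceback was not posted. The rough sketch below illustrates it with a hard-coded bytes value standing in for urlopen(url1).read(); raw, path and text_mode_error are names invented for this example.

```python
import os
import tempfile

# Stand-in for urlopen(url1).read(), which returns bytes rather than str.
raw = "<p>\u2665</p>".encode("utf-8")
path = os.path.join(tempfile.gettempdir(), "test_bytes.html")

# A text-mode file rejects bytes; this reproduces the TypeError.
try:
    with open(path, "w", encoding="utf-8") as f:
        f.write(raw)
    text_mode_error = None
except TypeError as exc:
    text_mode_error = exc

# A binary-mode file takes the bytes unchanged, so no codec is involved.
with open(path, "wb") as f:
    f.write(raw)
```

So either decode the bytes and write to a text-mode file with an explicit encoding, or skip decoding and write the bytes to a file opened with "wb".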