Python: How to read the content of a URL twice?

Question

I am using 'urllib.request.urlopen' to read the content of an HTML page. Afterwards, I want to write the content to a local file and then do something else with it (e.g. construct a parser on that page, such as BeautifulSoup).

The problem: after reading the content for the first time (and writing it into a file), I can't read it a second time in order to do something with it (e.g. construct a parser on it). The second read is just empty, and I can't move the cursor back to the beginning with seek(0).

import urllib.request

response = urllib.request.urlopen("http://finance.yahoo.com")

file = open("myTestFile.html", "wb")   # binary mode, because response.read() returns bytes
file.write(response.read())            # Tried response.readlines(), but that did not help me
# Tried response.seek(), but that did not work
print(response.read())                 # Actually, I want something done here, e.g. construct a parser:
                                       # BeautifulSoup(response).
                                       # Either way, this is an empty result

file.close()

How can I fix it?

Thank you very much!

Answer 1

You cannot read the response twice. But you can easily reuse the saved content:

content = response.read()
file.write(content)
print(content)
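
If a file-like object with a working seek(0) is still needed (for example, to hand to a parser more than once), one possible approach is to wrap the saved bytes in io.BytesIO. The following is only a minimal sketch: the helper name save_and_buffer is hypothetical and not part of urllib, and it assumes the response object has a read() method that returns bytes.

```python
import io


def save_and_buffer(response, path):
    """Read the response once, save it to `path`, and return a seekable copy.

    `save_and_buffer` is a hypothetical helper for illustration only.
    `response` is assumed to be any object whose read() returns bytes,
    such as the object returned by urllib.request.urlopen.
    """
    content = response.read()        # the one and only read of the stream
    with open(path, "wb") as f:      # binary mode, since content is bytes
        f.write(content)
    return io.BytesIO(content)       # in-memory buffer that supports seek(0)
```

With a real page, this could be called as save_and_buffer(urllib.request.urlopen("http://finance.yahoo.com"), "myTestFile.html"). The returned buffer can be read, rewound with seek(0), and read again, or its bytes can be passed to a parser via getvalue().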

Solution 1
7, accepted, 2017-08-22 16:03:20