Using urllib and BeautifulSoup to retrieve info from the web with Python
I can fetch an HTML page with urllib and parse it with BeautifulSoup, but it looks like I have to write the page to a file first so that BeautifulSoup can read it.
import urllib
sock = urllib.urlopen("http://SOMEWHERE")
htmlSource = sock.read()
sock.close()
# --> write htmlSource to a file
Is there a way to call BeautifulSoup without first writing the urllib output to a file?
from BeautifulSoup import BeautifulSoup
soup = BeautifulSoup(htmlSource)
No file writing is needed: just pass in the HTML string. You can also pass the object returned from urlopen directly:
f = urllib.urlopen("http://SOMEWHERE")
soup = BeautifulSoup(f)
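For readers on Python 3, here is a minimal sketch of the same two options. It assumes Beautiful Soup 4 (the `bs4` package) rather than the older `BeautifulSoup` module used above, and it uses an in-memory `io.BytesIO` object as a stand-in for the file-like object that urlopen returns. The helper name `title_of` and the sample HTML are made up for illustration.

```python
import io

# Assumption: Beautiful Soup 4 is installed (pip install beautifulsoup4).
from bs4 import BeautifulSoup


def title_of(markup):
    """Return the <title> text of the page, or None if there is no title.

    `markup` may be an HTML string, bytes, or any file-like object with
    a read() method; no intermediate file on disk is needed.
    """
    soup = BeautifulSoup(markup, "html.parser")
    return soup.title.string if soup.title else None


# Hypothetical page content standing in for a downloaded document.
html_source = "<html><head><title>Example</title></head><body></body></html>"

# Option 1: pass the HTML string directly.
from_string = title_of(html_source)

# Option 2: pass a file-like object (here an in-memory byte stream).
from_file_like = title_of(io.BytesIO(html_source.encode("utf-8")))
```

With a real URL, the response returned by `urllib.request.urlopen` in Python 3 is also a file-like object, so it should be usable in place of the `io.BytesIO` stand-in.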