
NameError: name 'htmltext' is not defined

I get an error when I run this script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url] #stack of urls to scrape
visited = [url] #historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(htmltext)

Original script:

import urllib.request
import urllib.parse
from bs4 import BeautifulSoup

url = "http://nytimes.com,http://nytimes.com"

urls = [url] #stack of urls to scrape
visited = [url] #historic record of urls

while len(urls) > 0:
    try:
        htmltext = urllib.request.urlopen(urls[0]).read()
    except:
        print(urls[0])
    soup = BeautifulSoup(htmltext)

    urls.pop(0)

    print(soup.findAll('a', href=True))

Errors:

socket.gaierror: [Errno -2] Name or service not known

urllib.error.URLError: <urlopen error [Errno -2] Name or service not known>

Traceback (most recent call last):

NameError: name 'htmltext' is not defined

If urllib.request.urlopen() raises an exception, the assignment never completes, so htmltext is never bound to a value. Any later use of htmltext, whether it is print(htmltext) inside the except block or BeautifulSoup(htmltext) after it, then raises NameError.
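One way to avoid touching an unassigned name is to keep the fetch in a small helper that returns None on failure, and to skip the URL when that happens. The following is only a sketch: fetch_html, scrape_all and the opener parameter are hypothetical names introduced for illustration, not part of the original script.

```python
import urllib.error
import urllib.request


def fetch_html(url, opener=urllib.request.urlopen):
    """Return the response body as bytes, or None if the request fails.

    `opener` is a hypothetical hook (defaulting to urlopen) so the
    function can be exercised without network access.
    """
    try:
        with opener(url) as response:
            return response.read()
    except (urllib.error.URLError, ValueError) as exc:
        # Nothing was assigned on failure, so report the URL and the
        # exception rather than a variable that does not exist yet.
        print(f"failed to fetch {url!r}: {exc}")
        return None


def scrape_all(urls, opener=urllib.request.urlopen):
    """Fetch each URL in order and skip the ones that fail."""
    pages = {}
    queue = list(urls)
    while queue:
        current = queue.pop(0)  # always advance, even on failure
        htmltext = fetch_html(current, opener)
        if htmltext is None:
            continue  # nothing to parse for this URL
        pages[current] = htmltext
    return pages
```

Popping the URL before fetching also means a failing URL cannot keep the while loop running forever.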

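As for why urlopen() fails: make sure you are passing a valid URL. The string "http://nytimes.com,http://nytimes.com" is two addresses joined by a comma rather than a single URL, which is consistent with the "Name or service not known" error from the host name lookup.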
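If the intent was to start from several URLs, a possible approach is to split the string and keep only the parts that parse as http(s) URLs with a host. This is a sketch under that assumption; split_urls is a hypothetical helper, and splitting on commas would break any URL that legitimately contains one.

```python
from urllib.parse import urlparse


def split_urls(raw):
    """Split a comma-separated string into individual http(s) URLs.

    Parts without an http/https scheme or without a host are dropped.
    """
    valid = []
    for part in raw.split(","):
        candidate = part.strip()
        parsed = urlparse(candidate)
        if parsed.scheme in ("http", "https") and parsed.netloc:
            valid.append(candidate)
    return valid


# The string from the question becomes a list of two separate URLs.
urls = split_urls("http://nytimes.com,http://nytimes.com")
```

Each element of urls can then be passed to urlopen() on its own.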
