Python web scraper - what have I done wrong?
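I'm a beginner building a web scraper that will scrape (and eventually export to CSV) the address, postcode and phone number of every McDonald's in the UK. I'm using an aggregator rather than the McDonald's website.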
https://www.localstore.co.uk/stores/75639/mcdonalds-restaurant/
I have borrowed some code and repurposed it:
from bs4 import BeautifulSoup
from urllib2 import urlopen

BASE_URL = "https://www.localstore.co.uk/stores/75639/mcdonalds-restaurant/"

def get_category_links(section_url):
    html = urlopen(section_url).read()
    soup = BeautifulSoup(html, "lxml")
    boccat = soup.find("tr")
    category_links = [BASE_URL + tr.a["href"] for tr in boccat.findAll("h2")]
    return category_links

def get_restaurant_details(category_url):
    html = urlopen(category_url).read()
    soup = BeautifulSoup(html, "lxml")
    streetAddress = soup.find("span", "streetAddress").string
    addressLocality = [h2.string for h2 in soup.findAll("span", "addressLocality")]
    addressRegion = [h2.string for h2 in soup.findAll("span", "addressRegion")]
    postalCode = [h2.string for h2 in soup.findAll("span", "postalCode")]
    phoneNumber = [h2.string for h2 in soup.findAll("td", "b")]
    return {"streetAddress": streetAddress,
            "addressLocality": addressLocality,
            "postalCode": postalCode,
            "addressRegion": addressRegion,
            "phoneNumber": phoneNumber}
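I don't think I'm actually scraping any data, because when I run the following line:
print(postalCode)
or
print(addressLocality)
I get the following error:
NameError: name 'postalCode' is not defined
Any idea what I've done wrong?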
As others have commented, you need to actually call the function first.
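Do something like this:
if __name__ == '__main__':
    res = "https://www.localstore.co.uk/store/329213/mcdonalds-restaurant/london/"
    print(get_restaurant_details(res)["postalCode"])
after your two functions. I just went to the site and grabbed a URL that should work with your program, but I haven't actually tested it. The main problem you have right now is that you aren't actually doing anything. You need to call a function!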
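To illustrate why the NameError appears, here is a minimal sketch that needs no network access. It assumes nothing about the site: get_details and lookup_outside_function are hypothetical stand-ins for the functions in the question, and the postcode is a made-up placeholder. A name assigned inside a function is local to that function, so it can only be reached through the value the function returns.

```python
def get_details():
    # postal_code is a local name: it exists only while the function runs.
    postal_code = ["AB1 2CD"]  # made-up placeholder, not scraped data
    return {"postalCode": postal_code}


def lookup_outside_function():
    """Try to read get_details()'s local name from outside it."""
    try:
        return postal_code  # not defined in this scope, so this raises NameError
    except NameError as exc:
        return str(exc)


# The local name is not visible here, which mirrors the error in the question.
error_message = lookup_outside_function()

# Calling the function and using its return value is what produces the data.
details = get_details()

print(error_message)
print(details["postalCode"])
```

The same reasoning applies to get_restaurant_details: printing postalCode directly fails, while indexing the returned dictionary with "postalCode" works once the function has been called.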