
Web scraping using IPython and lxml

I am trying to get the menu items from this website:

http://new.holachef.com/daily_menus?menu_date=2015-07-06 

I am using the following code to target the elements that contain the text:

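try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2, as in the original session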
from lxml.html import fromstring

def get_page(url):
    html = urlopen(url).read()
    dom = fromstring(html)
    dom.make_links_absolute(url)
    return dom

dom = get_page("http://new.holachef.com/daily_menus?menu_date=2015-07-06")
dom.cssselect("#store_item_64419 > ul > li.meal-discription.clearfix > div.col-xs-8 > h2 > a")

However, I get an empty output:

In [9]: dom.cssselect("#store_item_64419 > ul > li.meal-discription.clearfix > div.col-xs-8 > h2 > a")
Out[9]: []

I want to get the text inside that <a> tag.

I think your script is running into this modal, which asks the user to select their location.
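
One way to check this is to run the same kind of query against two pieces of markup: one that contains the menu item and one that contains only a location prompt. The sketch below is illustrative only. Both HTML samples and the helper name link_texts are made up for this example and are not taken from the site, and it uses XPath rather than cssselect so that it depends on lxml alone.

```python
from lxml.html import fromstring


def link_texts(html, item_id):
    """Return the text of each <a> inside an <h2> under the element with this id."""
    dom = fromstring(html)
    # $item_id is an XPath variable that lxml fills in from the keyword argument.
    links = dom.xpath("//*[@id=$item_id]//h2/a", item_id=item_id)
    return [a.text_content().strip() for a in links]


# Hypothetical markup: roughly what the selector in the question expects to find.
menu_html = """
<html><body>
  <div id="store_item_64419"><ul>
    <li class="meal-discription clearfix">
      <div class="col-xs-8"><h2><a href="#">Sample dish</a></h2></div>
    </li>
  </ul></div>
</body></html>
"""

# Hypothetical markup: a page that only shows a location prompt.
modal_html = """
<html><body>
  <div class="modal">Please select your location</div>
</body></html>
"""

print(link_texts(menu_html, "store_item_64419"))   # the element is present
print(link_texts(modal_html, "store_item_64419"))  # nothing matches, so the list is empty
```

If the query on the page you actually download behaves like the second case, printing the fetched markup (for example with lxml.html.tostring(dom)) should show what the server returned in place of the menu.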
