
find links in mechanize python

I'm currently using mechanize to automatically navigate some links on a website. The problem is that mechanize cannot find all the links on my website. When I use:

>>> import mechanize
>>> web = mechanize.Browser()
>>> r = web.open('http://torrent.ajee.sh/hash.php?hash=ee59bf932540976857c38eee56e2a598154a9963')
>>> print r.read()

it actually prints out this:

Adulterers.2015.HDRip.XviD.AC3-EVO (1.38 GB)<br/>Files: <br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0' target='_top' >Adulterers.2015.HDRip.XviD.AC3-EVO.avi </a> (1.4 GB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=1' target='_top' >Adulterers.2015.HDRip.XviD.AC3-EVO.nfo </a> (2 KB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=3' target='_top' >sample.avi </a> (16.1 MB) -- (<font color=''>100% </font> Cached)<br/>O <a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=2' target='_top' >Torrent Downloaded From ExtraTorrent.cc.txt </a> (338 B ) -- (<font color=''>100% </font> Cached)<br/><br/><br/><a href='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=zip' target='_top'>Download Zip</a> (1.38 GB)<br/><br/> If Download links not working, then Try again after few mints, <b>Files are been Cached(100%)</b>.<br/><br/>

and there are 5 links in it in total!

but when I use:

print list(web.links())

it only contains the first link from that source! What is the problem?

[Link(base_url='http://torrent.ajee.sh/hash.php?hash=ee59bf932540976857c38eee56e2a598154a9963', url='/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0', text='Adulterers.2015.HDRip.XviD.AC3-EVO.avi', tag='a', attrs=[('href', '/file.php?hash=ee59bf932540976857c38eee56e2a598154a9963&file=0'), ('target', '_top')])]

Sorry about my English!

You have to iterate over the links:

for link in web.links():
    print(link.text, link.url)

links() is a generator and gives you the next link each time you ask for another one.
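As a mechanize-free sanity check, the same kind of HTML can be fed to the standard library's `html.parser` to confirm that all anchor tags are findable. This is a minimal sketch, not part of the original answer; the HTML snippet below is a shortened, hypothetical stand-in for the page shown above.

```python
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collects the href attribute of every <a> tag encountered."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

# Shortened, hypothetical version of the page's HTML (single-quoted
# attributes, just like the real output).
html = (
    "<a href='/file.php?file=0' target='_top'>Adulterers.avi</a>"
    "<a href='/file.php?file=1' target='_top'>Adulterers.nfo</a>"
    "<a href='/file.php?file=zip' target='_top'>Download Zip</a>"
)

collector = LinkCollector()
collector.feed(html)
print(collector.links)  # all hrefs, in document order, not just the first
```

If a plain parser like this finds every link but mechanize does not, the page itself is fine and the issue is in how the `links()` generator is being consumed.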
