
Crawl links with rel="nofollow" using wget

I have a site ( http://a-site.com ) with many links like the one below. How can I use wget to crawl the site and grep links of this type into a file?

<a href="/user/333333/follow_user" class="btn" rel="nofollow">Follow</a>

I tried the following, but this command does not pick up the links marked nofollow.

$ wget --no-verbose -r -l1 http://a-site.com 2>&1

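Information from here:

http://skeena.net/kb/wget%20ignore%20robots.txt

By default wget respects the robots exclusion convention, and its robots setting controls both /robots.txt and nofollow handling. Try turning it off, keeping the recursive options from your own command:

wget -e robots=off http://your.site.here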
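
With robots handling turned off, the crawl and the grep step from the question could be combined roughly as shown below. This is a minimal sketch under stated assumptions, not a tested recipe for a-site.com: the function names extract_nofollow_links and crawl_nofollow_links and the default output file nofollow-links.txt are hypothetical, and the grep pattern assumes each <a> tag fits on one line with double-quoted attributes, as in the question's example. It also relies on grep -o and grep -a, which GNU and BSD grep provide but POSIX does not require.

```shell
#!/bin/sh

# Print the href of every <a> tag read from stdin that carries rel="nofollow".
# This is a grep heuristic, not an HTML parser: it assumes one tag per line
# segment and double-quoted attribute values.
extract_nofollow_links() {
    # -a: treat input as text, so binary bytes in the stream do not suppress output
    # -o: print only the matching part (the opening <a ...> tag)
    grep -ao '<a [^>]*rel="nofollow"[^>]*>' |
        grep -o 'href="[^"]*"' |
        sed 's/^href="//; s/"$//'
}

# Hypothetical wrapper: crawl one level deep with robots handling off,
# scan every downloaded file, and write the unique hrefs to a file.
# $1 = start URL, $2 = output file (defaults to nofollow-links.txt, an assumption)
crawl_nofollow_links() {
    site=$1
    out=${2:-nofollow-links.txt}
    tmpdir=$(mktemp -d) || return 1

    # Same options as in the question, plus -e robots=off and a scratch directory.
    wget --no-verbose -e robots=off -r -l1 -P "$tmpdir" "$site"

    find "$tmpdir" -type f -exec cat {} + |
        extract_nofollow_links |
        sort -u > "$out"

    rm -rf "$tmpdir"
}
```

After sourcing the file, a call such as crawl_nofollow_links http://a-site.com links.txt would leave one href per line in links.txt. Keep in mind that with robots=off wget actually requests the nofollow links it finds, so URLs such as /user/333333/follow_user may trigger whatever action the site attaches to them.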

