
wget: download linked files with a specific ending

I want to download all wordlists from this site: https://wiki.skullsecurity.org/Passwords

I tried

wget https://wiki.skullsecurity.org/Passwords --no-check-certificate --accept "*.bz2" -r

but it only downloads the targeted page.

What didn't work either:

wget https://downloads.skullsecurity.org/passwords/ --no-check-certificate -m

(I tried different combinations of -m and -r.)

I also tried --user-agent, in case the server was refusing wget's default user agent.

I tried -l 3 as well, still no success.

This works for me:

 wget -e robots=off -r -np -nH --accept "*.bz2"  http://downloads.skullsecurity.org/passwords/

Read about Robot Exclusion in the wget manual:

If you know what you are doing and really really wish to turn off the robot exclusion, set the robots variable to 'off'

The site http://downloads.skullsecurity.org/ contains a robots.txt with this content:

User-agent: *
Disallow: /

Explanation

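The Disallow: / line tells every robot (User-agent: *) that it should not visit any pages on the site. wget honors robots.txt during recursive retrieval, so it will not follow links there unless robots=off is set.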
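
To see the effect of that rule without touching the network, here is a minimal sketch using Python's standard urllib.robotparser. It is only an illustration of how a rule-following client reads the two lines quoted above, not how wget itself is implemented. The helper name is_allowed and the "Wget" user-agent string are my own assumptions for the example.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt content quoted in the answer above.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(url, robots_txt=ROBOTS_TXT, user_agent="Wget"):
    """Return True if a robots.txt-respecting client may fetch `url`.

    `is_allowed` and the default user agent are hypothetical names
    chosen for this sketch.
    """
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no HTTP request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # With "Disallow: /" every path on the host is off limits.
    print(is_allowed("http://downloads.skullsecurity.org/passwords/"))
```

Because the rule matches every path, a client that obeys it has nothing left to recurse into, which is why -e robots=off is needed in the command above.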
