I have a URL http://example.com/dir
that has many subdirectories with files that I want to save. Because the total size is very large, I want to break this operation into parts,
e.g. download everything from the subdirectories starting with A, like
http://example.com/A
http://example.com/Aa
http://example.com/Ab
etc
I have created the following script:
#!/bin/bash
for g in A B C
do
    wget -e robots=off -r -nc -np -R "index.html*" http://example.com/$g
done
but it tries to download only http://example.com/A, and not everything matching http://example.com/A*.
Look at this page; it has everything you need to know:
https://www.gnu.org/software/wget/manual/wget.html
1) You could use:
--spider -nd -r -o outputfile <domain>
which does not download the files; it just checks whether they are there:
- -nd prevents wget from creating directories locally
- -r recurses through the entire site
- -o outputfile sends the log output to a file
This gets you a list of URLs to download.
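For example, a minimal sketch of that check (using the http://example.com/dir address from the question; outputfile is just a log file name):

# walk the site without downloading anything; every checked URL
# ends up in the log file "outputfile"
wget --spider -nd -r -o outputfile http://example.com/dir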
2) then parse the outputfile to extract the file URLs, and create smaller lists of links you want to download.
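A minimal sketch of that parsing step (assuming the log lines contain the full URLs, as wget prints them; list-A.txt is a hypothetical name for one of the smaller lists):

# pull every URL under .../A out of the spider log, deduplicate,
# and save the result as one download list
grep -oE 'http://example\.com/A[^ ]*' outputfile | sort -u > list-A.txt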
3) then use -i file (== --input-file=file) to download each list, thus limiting how many URLs you download in one execution of wget.
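For instance (list-A.txt being the hypothetical list produced in step 2):

# download only the URLs listed in list-A.txt, skipping files
# that already exist locally
wget -e robots=off -nc -i list-A.txt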
Notes:
- --limit-rate=amount can be used to slow down downloads, to spare your Internet link!
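For example (200k is an arbitrary value; pick whatever suits your connection):

# same download as above, but capped at 200 KB/s
wget --limit-rate=200k -e robots=off -nc -i list-A.txt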