I am in the middle of a software development class and am trying to practice "DRY" principles in all things software dev. For practice, I want to use wget to download all the files in this directory (http://fusionplant.com/archive/textfiles/) that contain the word "offensive".
Here's an example of one of them: http://fusionplant.com/archive/textfiles/gnu_fortune/gnu_fortune_offensive_astrology
Are there any methods to accomplish this? I imagine they would use regular expressions, but I can't find any sufficiently comparable examples online to get it done.
Here's a command I tried. It's wrong, not even close, but here it is:
wget -A '*offensive*.txt' http://fusionplant.com/archive/textfiles/gnu_fortune
It didn't return an error message; it just downloaded the index file:
wget -A '*offensive*.txt' http://fusionplant.com/archive/textfiles/gnu_fortune
--2012-06-15 11:15:07-- http://fusionplant.com/archive/textfiles/gnu_fortune
Resolving fusionplant.com... 216.254.119.231
Connecting to fusionplant.com|216.254.119.231|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://fusionplant.com/archive/textfiles/gnu_fortune/ [following]
--2012-06-15 11:15:07-- http://fusionplant.com/archive/textfiles/gnu_fortune/
Reusing existing connection to fusionplant.com:80.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: “gnu_fortune”
[ <=> ] 14,576 50.4K/s in 0.3s
2012-06-15 11:15:08 (50.4 KB/s) - “gnu_fortune” saved [14576]
You can't do it like this. There is no request you can send to the server that makes it search the files for you, so you will have to download the files first and then check locally whether each one contains the string.
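As a rough sketch of that download-then-check approach, something like the following could work. The function names and the `downloads` directory are made up for illustration. The flags are standard wget and grep options, but the result depends on the server exposing a browsable index page. The second function covers the other reading of the question, where the word appears in the file name rather than in the contents: wget only applies `-A` while recursing, and the example file has no `.txt` extension, so the pattern is `*offensive*`.

```shell
# Sketch only: function names and target directories are illustrative assumptions.

# Download everything one level below the given URL into a local directory.
# -r recurse, -l1 limit depth to 1, -np never ascend to the parent directory,
# -nd do not recreate the remote directory tree, -P choose the target directory.
fetch_listing() {
    url=$1
    dest=$2
    wget -r -l1 -np -nd -P "$dest" "$url"
}

# Alternative, if the word is in the file NAME: let wget keep only matching names.
# -A takes a comma-separated accept list of suffixes or glob patterns.
fetch_by_name() {
    url=$1
    dest=$2
    wget -r -l1 -np -nd -A '*offensive*' -P "$dest" "$url"
}

# Print the paths of the files in a directory whose CONTENTS contain the word.
# grep -l prints only the names of matching files; -i ignores case.
files_containing() {
    dir=$1
    word=$2
    grep -l -i -- "$word" "$dir"/* 2>/dev/null
}

# Example usage (left commented out so that nothing touches the network by accident):
# fetch_listing http://fusionplant.com/archive/textfiles/gnu_fortune/ downloads
# files_containing downloads offensive
```

The trailing slash on the URL matters here, since the transcript above shows the server redirecting the slashless form to the directory index.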