I'm trying to write bash script in Linux (Debian), that will be used for downloading graphic files from website given by user during start-up. I'm not sure if my code is correct but first problem is when i try to run my script with website eg http://www.bbc.com/
an error shows: http://www.bbc.com/ : invalid identifier
. I even tried a simple website that has only a few JPG files. My next problem is to find out how to download files from .txt file where the images Internet adresses are included.
#!/bin/bash
# $1 - URL $2 - new catalog name
read $1 $2
url=$1
fold=$2
mkdir -p $fold
if [$# -ne 3];
then
echo "Wrong command"
exit -1
fi
curl $url | grep -o -e "<img src=\".*\"+>" > img_list.txt |wc -l img_list.txt | lin=${% *}
baseurl=$(echo $url | grep -o "https?://[a-z.]*"")
curl -s $url | egrep -o "<img src\=[^>]*>" | sed 's/<img src=\"\([^"]*\).*/\1/.*/\1/g' > url_list.txt
sed -i "s|^/|$baseurl/|" url_list.txt
cd $fold;
what can I do next?
For download every image from the webpage I would to use:
mech-dump --absolute --images http://example.com | xargs -n1 curl -O
but this need to be installed the mech-dump command from the WWW::Mechanize
package.
Using the list file
while read -r url folder
do
mkdir -p "$folder" || exit 1
(cd "$folder" && mech-dump --absolute --images "$url" | xargs -n1 curl -O)
done < list.txt
(assuming than no url nor folder containing a space).
an error shows:
http://www.bbc.com/ : invalid identifier
Your use of read
is wrong; change
read $1 $2
url=$1
fold=$2
to
read url fold
or decide to specify the arguments on the command line and omit only read $1 $2
.
Also, each operand in [
]
must be separated from the brackets; change
if [$# -ne 3];
to
if [ -z "$fold" ]
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.