
Bash script to download graphic files from website

I'm trying to write a bash script on Linux (Debian) that downloads image files from a website the user specifies at start-up. I'm not sure my code is correct, but the first problem is that when I run the script with a website, e.g. http://www.bbc.com/, an error appears: http://www.bbc.com/ : invalid identifier. I also tried a simple website that has only a few JPG files. My next problem is how to download files listed in a .txt file that contains the images' Internet addresses.

#!/bin/bash
# $1 - URL        $2 - new catalog name
read $1 $2
url=$1
fold=$2
mkdir -p $fold

if [$# -ne 3];
then
echo "Wrong command"
exit -1
fi

curl $url | grep -o -e "<img src=\".*\"+>" > img_list.txt |wc -l img_list.txt |  lin=${% *}

baseurl=$(echo $url | grep -o "https?://[a-z.]*"")
curl -s $url | egrep -o "<img src\=[^>]*>" | sed 's/<img src=\"\([^"]*\).*/\1/.*/\1/g' >  url_list.txt

sed -i "s|^/|$baseurl/|" url_list.txt
cd $fold;

What can I do next?

To download every image from the web page, I would use:

mech-dump --absolute --images http://example.com | xargs -n1 curl -O

but this requires the mech-dump command, which is installed with the WWW::Mechanize package.
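If installing mech-dump is not an option, the grep/sed approach from the question can be tightened into something like the sketch below. The function name extract_img_urls is hypothetical, and the pattern matching is only a rough approximation of HTML parsing: it assumes double-quoted src attributes and only rewrites root-relative paths such as /a.png, so page-relative and protocol-relative (//host/...) paths are left untouched.

```bash
#!/bin/bash
# Hypothetical helper: read HTML on stdin and print the src of each <img> tag.
# $1 is the base URL used to turn root-relative paths into absolute URLs.
extract_img_urls() {
    local base=${1%/}    # drop a trailing slash, e.g. http://example.com
    grep -o '<img[^>]*>' |
        sed -n 's/.*[[:space:]]src="\([^"]*\)".*/\1/p' |
        sed "s|^/\([^/]\)|$base/\1|"
}

# Usage (needs network access):
#   curl -s http://example.com/ | extract_img_urls http://example.com > url_list.txt
```

A real HTML parser such as mech-dump remains the more reliable choice; this only avoids the extra dependency.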

Using a list file (list.txt, with one URL and one folder name per line):

while read -r url folder
do
    mkdir -p "$folder" || exit 1
    (cd "$folder" && mech-dump --absolute --images "$url" | xargs -n1 curl -O)
done < list.txt

(This assumes that neither the URL nor the folder name contains a space.)
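The question also asks how to download images whose addresses are already stored in a .txt file, one per line. A minimal sketch of that case follows; the names download_list and fetch are made up for illustration, and the file name url_list.txt is taken from the question's script.

```bash
#!/bin/bash
# Command used to fetch one URL; kept in an array so it can be swapped
# (for example with "echo") for a dry run.
fetch=(curl -fsSLO)

# Hypothetical helper: download every URL listed in file $1 into folder $2.
download_list() {
    local list=$1 dest=$2 img
    mkdir -p "$dest" || return 1
    # The "|| [ -n ... ]" part also handles a last line without a newline.
    while IFS= read -r img || [ -n "$img" ]; do
        [ -z "$img" ] && continue              # skip blank lines
        (cd "$dest" && "${fetch[@]}" "$img") ||
            echo "failed: $img" >&2
    done < "$list"
}

# Usage: download_list url_list.txt pictures
```

Running the download in a subshell keeps the cd local, so the caller's working directory does not change.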

an error shows: http://www.bbc.com/ : invalid identifier

Your use of read is wrong; change

read $1 $2
url=$1
fold=$2

to

read url fold

or, if you prefer to pass the arguments on the command line, just remove the read $1 $2 line and keep the two assignments.
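To see why the original line fails: read expects variable names, but $1 and $2 are expanded before read runs, so it is asked to assign to a variable literally named http://www.bbc.com/. The sketch below reproduces that under the assumption that the script was started with a URL and a folder name as arguments; the example.com input is only a placeholder.

```bash
#!/bin/bash
# Simulate running the script as: ./script http://www.bbc.com/ pics
set -- http://www.bbc.com/ pics

# Wrong: $1 expands to the URL, which is not a valid variable name.
if ! err=$(read $1 $2 2>&1 <<< "ignored input"); then
    echo "read failed: $err"
fi

# Right: pass read the names of the variables to fill.
read -r url fold <<< "http://example.com/ pics"
echo "url=$url fold=$fold"
```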

Also, each operand in [ ] must be separated from the brackets by whitespace; change

if [$# -ne 3];

to

if [ -z "$fold" ]
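If the URL and folder are passed on the command line instead of being read from input, an argument-count test can be kept, provided the brackets are spaced and the count matches the two arguments the script actually uses. A small sketch, with check_args and the usage text as made-up names:

```bash
#!/bin/bash
# Hypothetical helper: succeed only when exactly two arguments are given.
check_args() {
    # "[" is a command, so it needs spaces around its operands and before "]".
    if [ "$#" -ne 2 ]; then
        echo "Usage: script URL FOLDER" >&2
        return 1     # non-zero status signals failure to the caller
    fi
}

# In a script: check_args "$@" || exit 1
```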

