简体   繁体   中英

How to download latest version of software from same url using wget

I would like to download a latest source code of software (WRF) from some url and automate the installation process thereafter. A sample url like is given below:-

http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz

In the above url, the version number may change time to time after the developer release the new version. Now I would like to download the latest available version from the main script. I tried the following:-

wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV'

With above code, I could pull all available version of the software. The output of the above code is below:-

http://www2.mmm.ucar.edu/wrf/src/WRFV2.0.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV2.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.0.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Chem-3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3-Var-do-not-use.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.0.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.2.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.4.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.5.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.1.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.6.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3.TAR.gz
http://www2.mmm.ucar.edu/wrf/src/WRFV3_OVERLAY_3.0.1.1.TAR.gz

However, I am unable to go further to filter out only later version from the link.

关于添加数字排序并取第一行呢?

   wget -k -l 0 "http://www2.mmm.ucar.edu/wrf/src/" -O index.html ; cat index.html | grep -o 'http:[^"]*.gz' | grep 'WRFV[0-9]*[0-9]\.[0-9]'  | sort -r -n | head -1

Usually, for processing the html-pages i recommendig some perl tools, but because this is an Directory Index output, (probably) can be done by bash tools like grep sed and such...

The following code is divided to several smaller bash functions, for easy changes

#!/bin/bash

#getdata - should output html source of the page
getdata() {
    #use wget with output to stdout or curl or fetch
    curl -s "http://www2.mmm.ucar.edu/wrf/src/"
    #cat index.html
}

#filer_rows - get the filename and the date columns
filter_rows() {
    sed -n 's:<tr><td.*href="\([^"]*\)">.*>\([0-9].*\)</td>.*</td>.*</td></tr>:\2@\1:p' | grep "${1:-.}"
}

#sort_by_date - probably don't need comment... sorts the lines by date... ;)
sort_by_date() {
    while IFS=@ read -r date file
    do
        echo "$(date --date="$date" +%s)@$file"
    done | sort -gr
}

#MAIN
file=$(getdata | filter_rows WRFV | sort_by_date | head -1 | cut -d@ -f2)
echo "You want download: $file"

prints

You want download: WRFV3-Chem-3.6.1.TAR.gz

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM