
Redirect wget screen output to a log file in bash

First of all, thank you everyone for your help. I have the following file that contains a series of URLs:

Salmonella_enterica_subsp_enterica_Typhi    https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/717/755/GCF_003717755.1_ASM371775v1/GCF_003717755.1_ASM371775v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_A  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/818/115/GCF_000818115.1_ASM81811v1/GCF_000818115.1_ASM81811v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Paratyphi_B  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/018/705/GCF_000018705.1_ASM1870v1/GCF_000018705.1_ASM1870v1_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Infantis https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/011/182/555/GCA_011182555.2_ASM1118255v2/GCA_011182555.2_ASM1118255v2_translated_cds.faa.gz
Salmonella_enterica_subsp_enterica_Typhimurium_LT2  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/000/006/945/GCF_000006945.2_ASM694v2/GCF_000006945.2_ASM694v2_translated_cds.faa.gz
Salmonella_enterica_subsp_diarizonae    https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/003/324/755/GCF_003324755.1_ASM332475v1/GCF_003324755.1_ASM332475v1_translated_cds.faa.gz
Salmonella_enterica_subsp_arizonae  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Salmonella_bongori  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCF/006/113/225/GCF_006113225.1_ASM611322v2/GCF_006113225.1_ASM611322v2_translated_cds.faa.gz

I have to download the URLs using wget. I have already managed to download them, but the typical output appears in the shell:

--2021-04-23 02:49:00--  https://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/900/635/675/GCA_900635675.1_31885_G02/GCA_900635675.1_31885_G02_translated_cds.faa.gz
Reusing existing connection to ftp.ncbi.nlm.nih.gov:443.
HTTP request sent, awaiting response... 200 OK
Length: 1097880 (1,0M) [application/x-gzip]
Saving to: ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’

GCA_900635675.1_31885_G0 100%[=================================>]   1,05M  2,29MB/s    in 0,5s    

2021-04-23 02:49:01 (2,29 MB/s) - ‘GCA_900635675.1_31885_G02_translated_cds.faa.gz’ saved [1097880/1097880]

I want to redirect that output to a log file. Also, as the files download, I want to decompress them, because they are gzip-compressed (.gz). My code is the following:

cat $ncbi_urls_file | while read line
do
    echo " Downloading fasta files from NCBI..."
    awk '{print $2}' | wget -i- 
done

wget

wget has options for logging to a file. From man wget:

Logging and Input File Options

-o logfile
--output-file=logfile
    Log all messages to logfile. The messages are normally reported to standard error. 
-a logfile
--append-output=logfile
    Append to logfile. This is the same as -o, only it appends to logfile instead of overwriting the old log file. If logfile does not exist, a new file is created. 
-d
--debug
    Turn on debug output, meaning various information important to the developers of Wget if it does not work properly. Your system administrator may have chosen to compile Wget without debug support, in which case -d will not work. Please note that compiling with debug support is always safe---Wget compiled with the debug support will not print any debug info unless requested with -d. 
-q
--quiet
    Turn off Wget's output. 
-v
--verbose
    Turn on verbose output, with all the available data. The default output is verbose. 
-nv
--no-verbose
    Turn off verbose without being completely quiet (use -q for that), which means that error messages and basic information still get printed.

You will need to experiment to get what you need. If you want all logs in a single file, use -a log.out, which makes wget append its logging information to that file instead of writing it to stderr.
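As a rough sketch of how -a could be combined with the two-column URL file from the question: the function names and the default log name download.log below are hypothetical, chosen only for illustration. awk reads the file directly, so no surrounding while read loop is needed (read would consume the first line before awk sees the rest).

```shell
#!/usr/bin/env bash
# Sketch only: extract_urls, download_all and download.log are
# illustrative names, not part of wget or of the original question.

# Print the second whitespace-separated field (the URL) of each line
# that has at least two fields.
extract_urls() {
    awk 'NF >= 2 {print $2}' "$1"
}

# download_all URLS_FILE [LOG_FILE]
download_all() {
    local urls_file=$1
    local log_file=${2:-download.log}
    echo "Downloading fasta files from NCBI..."
    # -i - makes wget read the URL list from standard input;
    # -a appends wget's messages to the log instead of printing them.
    extract_urls "$urls_file" | wget -i - -a "$log_file"
}
```

It would be called as download_all "$ncbi_urls_file" wget.log, after which the terminal shows only the echo line and wget.log holds the per-file messages.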

Standard output can be redirected to a file in bash using the >> operator (to append to the file) or the > operator (to truncate and overwrite the file), e.g.

echo hello >> log.txt

will append "hello" to log.txt. If you still want to see the output in your terminal while also writing it to a log file, you can use tee:

echo hello | tee log.txt
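To make the difference between truncating, appending and tee concrete, here is a minimal sketch. It writes to a temporary file created with mktemp (an assumption made here so the example does not touch a real log).

```shell
log=$(mktemp)                 # temporary log file for the demonstration

echo first  >  "$log"         # > truncates: the file now holds only "first"
echo second >> "$log"         # >> appends a second line
echo third | tee -a "$log"    # tee -a prints "third" and appends it too

cat "$log"                    # show the final contents of the log
```

Note that plain tee overwrites its target file, like >, while tee -a appends, like >>.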

However, wget writes most of its progress information to standard error rather than standard output. This is a very common practice. Displaying progress often involves special characters that overwrite lines (e.g. to update a progress bar), change terminal colors, and so on. Terminals process these characters sensibly in real time, but it rarely makes sense to store them in a file. For this reason, incremental progress output is usually kept separate from output that is worth storing or piping, so that each can be redirected on its own, and it is typically sent to standard error.
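The effect can be seen without touching the network. In this sketch, fake_download is a hypothetical stand-in for a program such as wget: it sends its progress lines to standard error and its result to standard output.

```shell
# Hypothetical stand-in for a downloader: progress goes to standard
# error, the actual result goes to standard output.
fake_download() {
    echo "progress: 50%"  >&2
    echo "progress: 100%" >&2
    echo "payload"
}

out=$(mktemp)

# Only standard output is redirected here, so the two progress lines
# still appear on the terminal and never reach the file.
fake_download > "$out"

cat "$out"   # prints only "payload"
```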

However, you can still redirect standard error to a log file:

wget example.com 2>> log.txt

Or using tee:

wget example.com 2>&1 | tee log.txt

(2>&1 redirects standard error to standard output, which is then piped to tee.)
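Putting the stderr redirection together with the decompression step from the question, one possible sketch is a loop that handles one URL at a time. The function name fetch_and_unpack and the default log name download.log are hypothetical. The sketch also assumes wget saves each file under the last path component of its URL, which is its default behaviour when no file of that name exists yet.

```shell
# fetch_and_unpack URLS_FILE [LOG_FILE]
# Reads "name<whitespace>url" lines, downloads each URL while appending
# wget's messages (written to standard error) to the log, then
# decompresses the downloaded .gz file.
fetch_and_unpack() {
    local urls_file=$1
    local log_file=${2:-download.log}    # hypothetical default name
    local name url
    while read -r name url; do
        [ -n "$url" ] || continue        # skip blank or one-column lines
        echo "Downloading $name" >> "$log_file"
        if wget "$url" 2>> "$log_file"; then
            # ${url##*/} is the file name: everything after the last "/".
            # -f overwrites an earlier decompressed copy without asking.
            gunzip -f "${url##*/}" 2>> "$log_file"
        fi
    done < "$urls_file"
}
```

It would be called as fetch_and_unpack "$ncbi_urls_file" wget.log. Decompressing inside the loop means each file is unpacked as soon as its own download succeeds, and a failed download is recorded in the log without stopping the remaining ones.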
