Page 38 of the book Linux 101 Hacks suggests:
cat url-list.txt | xargs wget -c
I usually do:
for i in `cat url-list.txt`
do
wget -c $i
done
Is there anything, other than length, in which the xargs technique is superior to the good old for-loop technique in bash?
Added
The C source code seems to have only one fork. In contrast, how many forks does the bash combo have? Please elaborate on the issue.
xargs is designed to process multiple inputs for each process it forks. A shell script with a for loop over its inputs must fork a new process for each input. Avoiding that per-process overhead can give an xargs solution a significant performance enhancement.
From the Rationale section of a UNIX manpage for xargs. (Interestingly, this section doesn't appear in the OS X BSD version of xargs, nor in the GNU version.)
The classic application of the xargs utility is in conjunction with the find utility to reduce the number of processes launched by a simplistic use of the find -exec combination. The xargs utility is also used to enforce an upper limit on memory required to launch a process. With this basis in mind, this volume of POSIX.1-2008 selected only the minimal features required.
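To make the batching concrete, here is a minimal sketch that counts how many times the command gets started in each style. It uses echo as a stand-in for wget (my assumption, so nothing is downloaded) and five placeholder items. Each echo invocation prints exactly one line, so counting lines counts invocations.

```shell
# Five placeholder inputs standing in for URLs (hypothetical values).
items='a b c d e'

# Default xargs: all inputs are appended to a single echo command line.
batched=$(( $(printf '%s\n' $items | xargs echo | wc -l) ))

# xargs -n 1: one echo process per input, like the for loop would start.
per_item=$(( $(printf '%s\n' $items | xargs -n 1 echo | wc -l) ))

echo "batched invocations: $batched"
echo "one-per-item invocations: $per_item"
```

With a list this short the batched form needs a single process. A very long list would be split by xargs into as many command lines as the system's argument-size limit requires, which is still far fewer than one per item.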
In your follow-up, you ask how many forks the other version will have. Jim already answered this: one per iteration. How many iterations are there? It's impossible to give an exact number, but easy to answer the general question. How many lines are there in your url-list.txt file?
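If you want to check the iteration count rather than guess, something like the following sketch should do. It builds a throwaway list with three made-up example.com URLs (placeholders, not from the question) and counts both the lines and the loop iterations. Strictly speaking, the for loop in the question iterates once per whitespace-separated word, which matches the line count only when each line holds a single URL without spaces.

```shell
# Temporary stand-in for url-list.txt with three hypothetical URLs.
list=$(mktemp)
printf '%s\n' 'http://example.com/a' 'http://example.com/b' 'http://example.com/c' > "$list"

# Number of lines in the list.
lines=$(( $(wc -l < "$list") ))

# Number of iterations the for loop actually performs
# (one wget process would be forked in each of them).
iterations=0
for i in $(cat "$list"); do
    iterations=$((iterations + 1))
done

echo "lines=$lines iterations=$iterations"
rm -f "$list"
```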
There are some other considerations. xargs requires extra care for filenames with spaces or other no-no characters, and find's -exec has an option (+) that groups processing into batches. So, not everyone prefers xargs, and perhaps it's not best for all situations.
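A small sketch of the whitespace problem and the two usual workarounds follows. It assumes a find and xargs that support -print0 and -0 (GNU and BSD do; older POSIX-only tools may not), and it assumes the temporary directory path itself contains no spaces. The file names are made up for the demo.

```shell
# Scratch directory with one ordinary name and one name containing a space.
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt"

# Naive pipe: xargs splits on whitespace, so the second file is seen as two arguments.
naive_count=$(( $(find "$dir" -type f | xargs -n 1 echo | wc -l) ))

# NUL-delimited pipe: each file name arrives intact.
nul_count=$(( $(find "$dir" -type f -print0 | xargs -0 -n 1 echo | wc -l) ))

# find -exec ... {} + batches the names itself and needs no xargs at all.
exec_count=$(( $(find "$dir" -type f -exec sh -c 'printf "%s\n" "$@"' sh {} + | wc -l) ))

echo "naive=$naive_count nul=$nul_count exec=$exec_count"
rm -rf "$dir"
```

The naive count comes out one higher than the real number of files, which is exactly the kind of silent mistake that makes people wary of plain xargs.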
See these links:
Also consider:
xargs -I'{}' wget -c '{}' < url-list.txt
but wget provides an even better means for the same:
wget -c -i url-list.txt
With respect to the xargs versus loop consideration, I prefer xargs when the meaning and implementation are relatively "simple" and "clear"; otherwise, I use loops.
xargs will also allow you to have a huge list, which is not possible with the "for" version because the shell uses command lines limited in length.
Instead of GNU Parallel, I prefer using xargs' built-in parallel processing. Add -P to indicate how many processes to run in parallel. As in...
seq 1 10 | xargs -n 1 -P 3 echo
would run up to 3 processes at a time, which the kernel can schedule on different cores. This is supported by modern GNU xargs. You will have to verify for yourself if using BSD or Solaris.
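To see the effect of -P, here is a rough sketch that uses sleep as a stand-in for a slow download (my assumption, nothing is fetched). Six one-second jobs run three at a time should finish in roughly two seconds instead of six. It assumes seq is available and that your xargs supports -P.

```shell
# Time six fake "downloads", each taking one second, three at a time.
start=$(date +%s)
out=$(seq 1 6 | xargs -n 1 -P 3 sh -c 'sleep 1; echo "done $1"' sh)
end=$(date +%s)

elapsed=$((end - start))
jobs_done=$(( $(printf '%s\n' "$out" | wc -l) ))

printf '%s\n' "$out"
echo "jobs finished: $jobs_done, elapsed seconds: $elapsed"
```

The order of the "done" lines is not guaranteed, since whichever process finishes first prints first.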
Depending on your internet connection, you may want to run it in parallel using GNU Parallel http://www.gnu.org/software/parallel/
cat url-list.txt | parallel wget -c
One advantage I can think of is that, if you have lots of files, it could be slightly faster since you don't have as much overhead from starting new processes.
I'm not really a bash expert though, so there could be other reasons it's better (or worse).