
Rsync on multiple hosts in parallel

I frequently need to send a lot of files to multiple hosts. Speed is crucial, so I want to do it in parallel.

How can I run a parallel rsync to multiple hosts from a bash script?

Right now the script looks like this:

   for i in ${listofhosts[*]}
   do
   rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@$i:/var/test/folder --delete || exit 1
   done

LE: I'm thinking of something with GNU Parallel or xargs, but I don't know how to use them in this situation.

With just a shell script:

#!/bin/bash
procs=()
for i in "${listofhosts[@]}"; do  # notice syntax fixes
  # start each rsync in the background and remember its PID
  rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@"$i":/var/test/folder --delete &
  procs+=("$!")
done
# block until every background transfer has finished
for proc in "${procs[@]}"; do
  wait "$proc"
done

The obvious drawback is that you can't cancel the others as soon as one of them fails. If you really have "a lot" of hosts, this will probably saturate your network bandwidth to the point where you regret asking about how to do this.
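Even without cancellation, the plain shell approach can at least report which hosts failed, because `wait PID` returns that job's exit status. The following is a minimal sketch under stated assumptions: `sync_host`, `sync_all`, the `failed` array and the host name `badhost` are all made up for illustration, and the placeholder body stands in for the real rsync command.

```bash
#!/bin/bash
# sync_host is a hypothetical stand-in for the rsync command.
sync_host() {
  # Replace this body with the real command, e.g.:
  # rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@"$1":/var/test/folder --delete
  [ "$1" != "badhost" ]  # placeholder: only the made-up name "badhost" fails
}

# Start one background job per host, then wait for each PID and record failures.
sync_all() {
  local host i rc=0
  local pids=() hosts=("$@")
  failed=()  # global: hosts whose job exited non-zero
  for host in "${hosts[@]}"; do
    sync_host "$host" &
    pids+=("$!")
  done
  for i in "${!pids[@]}"; do
    if ! wait "${pids[$i]}"; then
      failed+=("${hosts[$i]}")
      rc=1
    fi
  done
  return "$rc"
}

sync_all host1 badhost host2 || echo "failed hosts: ${failed[*]}"
```

All jobs still run to completion here; the sketch only makes the failures visible afterwards so the script can exit non-zero or retry those hosts.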

With xargs, you can limit how many instances you run:

# probably better if you have the hosts in a file instead of an array actually,
# and simply run xargs <filename -P 17 -n 1 ...
printf '%s\n' "${listofhosts[@]}" |
xargs -P 17 -n 1 sh -c 'rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@"$0":/var/test/folder --delete || exit 1'

Perhaps notice how we sneakily smuggle in the host in $0. You could equivalently but slightly less obscurely populate $0 with a dummy string and use $1, but it doesn't really make a lot of difference here.
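The `$1` variant might look like the sketch below. The function name `show_targets` and the dummy `_` are illustrative choices, and `echo` stands in for the real rsync command so the example is safe to run anywhere.

```bash
#!/bin/bash
# "_" is a dummy value for $0; xargs appends each host after it, so the host becomes $1.
# echo is a placeholder for the real rsync invocation.
show_targets() {
  printf '%s\n' "$@" |
  xargs -n 1 sh -c 'echo "user@$1:/var/test/folder"' _
}

show_targets host1 host2
```

With `-n 1` and no `-P`, the invocations run one after another, so the two destination strings are printed in input order.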

The -P 17 says to run a maximum of 17 processes in parallel (obviously, tweak to your liking), and -n 1 says to pass only one argument (here, one host) to each invocation of the command. xargs still does not offer a way to interrupt the entire batch if one of the processes fails, and only reports back a summary result code (the exit code from xargs will be non-zero if at least one of the processes failed).
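To see the summary exit code in action, here is a small sketch. The names `run_batch` and `badhost` are invented for the example, and the test command inside `sh -c` is a placeholder for rsync that fails only for that one name.

```bash
#!/bin/bash
# Placeholder job: succeeds for every host except the made-up name "badhost".
run_batch() {
  printf '%s\n' "$@" |
  xargs -P 4 -n 1 sh -c '[ "$0" != badhost ]'
}

if run_batch host1 badhost host2; then
  echo "all transfers succeeded"
else
  echo "at least one transfer failed"
fi
```

The exact non-zero value differs between xargs implementations, so the sketch only checks for success versus failure.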

With GNU Parallel it should be something like this:

printf '%s\n' "${listofhosts[@]}" | parallel --will-cite --halt now,fail=1 rsync -rv --delete --checksum -e "$(printf '%q' 'ssh -i rsa_key -o StrictHostKeyChecking=no')" folder/ user@{}:/var/test/folder

The tricky part is that you have to explicitly escape the arguments containing spaces (and other characters that are in $IFS), because GNU Parallel hands the assembled command line to a shell, which splits it into words again.
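The effect of that escaping can be checked in bash alone, without GNU Parallel installed. In this sketch the variable names `ssh_cmd` and `quoted` are illustrative, and `eval` merely imitates the second round of shell parsing.

```bash
#!/bin/bash
ssh_cmd='ssh -i rsa_key -o StrictHostKeyChecking=no'

# Without escaping, a second round of shell parsing splits the string at each space.
eval "set -- $ssh_cmd"
echo "unescaped: $# words"

# printf '%q' emits a form that a shell reads back as one single word.
quoted=$(printf '%q' "$ssh_cmd")
eval "set -- $quoted"
echo "escaped: $# word"
```

After the second `eval`, `$1` holds the whole ssh command again, which is what rsync's `-e` option needs to receive.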

Note: You can limit the number of rsync processes that run in parallel with the -j option of parallel:

... | parallel -j 8 --will-cite --halt now,fail=1 rsync -rv ...

With GNU Parallel you would do something like this:

doit() {
  i="$1"
  rsync -rv --checksum folder/ -e "ssh -i rsa_key -o StrictHostKeyChecking=no" user@"$i":/var/test/folder --delete ||
    exit 1
}
# export the function so the shells started by parallel can see it
export -f doit
parallel doit ::: "${listofhosts[@]}"

May I suggest spending 20 minutes on reading chapter 1+2 (and possibly chapter 5): https://doi.org/10.5281/zenodo.1146014

Your command line will love you for it.
