
Ansible: copying one unique file to each server in a group

I have a series of numbered files, each to be processed separately by one server. Each file was made with linux split and then xz-compressed to save transfer time.

split_001 split_002 split_003 ... split_030

How can I push these files out to a group of 30 servers with Ansible? It does not matter which server gets which file, so long as each server ends up with exactly one unique file.

I had been using a bash script, but I am looking for a better solution, hopefully with Ansible. Afterwards I plan to run a shell task that issues an at command to start the computation, which takes several hours or days.

scp -oStrictHostKeyChecking=no bt_5869_001.xz usr13@<ip>:/data/
scp -oStrictHostKeyChecking=no bt_5869_002.xz usr13@<ip>:/data/
scp -oStrictHostKeyChecking=no bt_5869_003.xz usr13@<ip>:/data/
...
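For reference, the bash approach being replaced can be sketched as a loop that pairs the Nth split file with the Nth server. The IP list below is hypothetical example data, and `echo` stands in for actually running scp so the file-to-host pairing can be inspected first:

```shell
# Hypothetical list of target servers, one entry per split file.
hosts="10.0.0.1 10.0.0.2 10.0.0.3"

# Build the scp command for each (file, host) pair without executing it.
cmds=$(
  i=1
  for ip in $hosts; do
    file=$(printf 'bt_5869_%03d.xz' "$i")
    echo "scp -oStrictHostKeyChecking=no $file usr13@$ip:/data/"
    i=$((i + 1))
  done
)
printf '%s\n' "$cmds"
```

Dropping the `echo` would run the transfers, but this is exactly the kind of hand-maintained pairing the question is trying to avoid.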

http://docs.ansible.com/ansible/copy_module.html

# copy one file per server, iterating through the split files somehow
- copy: src=/mine/split_001.xz dest=/data/split_001.xz
- copy: src=/mine/compute dest=/data/ owner=root mode=0755
- copy: src=/mine/start.sh dest=/data/ owner=root mode=0755
- shell: xz -d /data/*.xz
- shell: at -f /data/start.sh now

For example:

  tasks:
    - set_fact:
        padded_host_index: "{{ '{0:03d}'.format(play_hosts.index(inventory_hostname)) }}"

    - copy: src=/mine/split_{{ padded_host_index }}.xz dest=/data/
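One caveat: `play_hosts.index()` is zero-based, so with files numbered split_001 through split_030 the index needs a `+ 1` (or the files need renumbering from 000). A minimal sketch combining this with the copy, decompress, and at steps from the question; the host group name `workers` is an assumption:

```yaml
# Sketch only: assumes 30 hosts in group "workers" and files
# split_001.xz .. split_030.xz. The "+ 1" bridges the zero-based index.
- hosts: workers
  tasks:
    - set_fact:
        padded_host_index: "{{ '{0:03d}'.format(play_hosts.index(inventory_hostname) + 1) }}"

    - copy: src=/mine/split_{{ padded_host_index }}.xz dest=/data/

    - copy: src=/mine/start.sh dest=/data/ owner=root mode=0755

    - shell: xz -d /data/*.xz

    - shell: at -f /data/start.sh now
```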

You can do this with Ansible. However, this seems like the wrong general approach to me.

You have a number of jobs. You need each of them to be processed, and you don't care which server processes which job, as long as each job is processed exactly once (and ideally the whole batch finishes as efficiently as possible). This is precisely the situation a distributed queueing system is designed for.

You'll have workers running on each server, and one master node (which may run on one of the servers) that knows about all of the workers. When you have tasks to get done, you queue them up with the master, and the master hands them out to workers as they become available, so you don't have to worry about having exactly as many servers as jobs.
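The key property, each job claimed exactly once, can be illustrated with a toy shell sketch: a directory acts as the queue, and `mv` (an atomic rename on the same filesystem) is the claim operation, so if two workers race on the same job file only one `mv` succeeds:

```shell
# Toy queue: files in queue/ are pending jobs, files in claimed/ are taken.
mkdir -p queue claimed
for n in 1 2 3 4; do touch "queue/job_$n"; done

claim_next() {
  for job in queue/*; do
    # Glob expands to the literal pattern when the queue is empty.
    [ -e "$job" ] || return 1
    # mv is atomic: if another worker grabbed this job first, mv fails
    # and we move on to the next candidate.
    if mv "$job" claimed/ 2>/dev/null; then
      echo "claimed $(basename "$job")"
      return 0
    fi
  done
  return 1
}

# A single worker draining the queue; real workers would run in parallel.
while claim_next; do :; done
ls claimed
```

A real queueing system adds retries, acknowledgements, and visibility timeouts on top of this same claim-once idea.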

There are many, many options for this, including beanstalkd, Celery, Gearman, and SQS. You'll have to do the legwork to find out which one works best for your situation. But this is definitely the architecture best suited to your problem.
