

Can a batch task read files on a file share?

I have a file share with (you guessed it) a lot of files. I want to create a batch job that mounts this file share, reads in each of the files, and processes them in parallel (each as a batch task).

Is this possible with Python in Azure Batch? Any tutorial showing how to do this would be great.

You can do this in one of two ways. Note that the following applies only to Linux. Windows users will need to follow a slightly different method using User Identities.

  1. Mount the file share at the compute node level using the pool's StartTask object. Please see the Azure Files documentation on how to do this for your Linux distro. The start task can either:
    • Mount the file share directly, i.e., call mount -t cifs ... . This survives reboots because the StartTask is re-run every time the node reboots.
    • Modify /etc/fstab to add an entry to automount. Note that you must make this operation idempotent, because the StartTask is re-run every time the node reboots.
  2. Mount the file share at the job level using the job's JobPreparationTask object. The command you specify here runs once on each compute node, before the first task of the job runs on that node. You should probably also specify the job's JobReleaseTask to unmount the share for cleanup.
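As a rough illustration of the two options, here is a minimal Python sketch that only builds the shell command lines you might hand to a StartTask, JobPreparationTask, or JobReleaseTask. The storage account, share name, mount point, and credentials file are hypothetical placeholders, and the CIFS options are an assumption to be checked against the Azure Files documentation for your distro. Batch command lines do not run under a shell, so the sketch wraps each command in /bin/bash -c.

```python
import shlex

# Hypothetical placeholders: substitute your own storage account, share, and paths.
SHARE_URL = "//mystorageaccount.file.core.windows.net/myshare"
MOUNT_POINT = "/mnt/myshare"
CREDENTIALS_FILE = "/etc/smbcredentials/mystorageaccount.cred"
# Assumed CIFS options; verify them against the Azure Files docs for your distro.
CIFS_OPTIONS = f"vers=3.0,credentials={CREDENTIALS_FILE},serverino"


def direct_mount_command(share_url=SHARE_URL, mount_point=MOUNT_POINT,
                         options=CIFS_OPTIONS):
    """Mount the share directly, skipping the mount if it is already in place."""
    share, mp, opts = (shlex.quote(s) for s in (share_url, mount_point, options))
    return (f"mkdir -p {mp} && "
            f"(mountpoint -q {mp} || mount -t cifs {share} {mp} -o {opts})")


def fstab_command(share_url=SHARE_URL, mount_point=MOUNT_POINT,
                  options=CIFS_OPTIONS):
    """Append an /etc/fstab entry only if that exact line is not present yet."""
    entry = shlex.quote(f"{share_url} {mount_point} cifs nofail,{options} 0 0")
    mp = shlex.quote(mount_point)
    return (f"mkdir -p {mp} && "
            f"(grep -qxF {entry} /etc/fstab || echo {entry} >> /etc/fstab) && "
            f"mount -a")


def unmount_command(mount_point=MOUNT_POINT):
    """Cleanup command for a JobReleaseTask: unmount only if currently mounted."""
    mp = shlex.quote(mount_point)
    return f"if mountpoint -q {mp}; then umount {mp}; fi"


def wrap_in_shell(command):
    """Wrap a command so shell operators (&&, ||, >>) are interpreted by bash."""
    return f"/bin/bash -c {shlex.quote(command)}"
```

For example, wrap_in_shell(direct_mount_command()) gives a string suitable for the command line of a StartTask or JobPreparationTask, and wrap_in_shell(unmount_command()) for a JobReleaseTask. The mountpoint -q and grep -qxF guards are what make each command safe to re-run after a reboot.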

Whichever path you choose, make sure the task is given proper elevation privileges (typically superuser) so that the process can perform the mount or modify /etc/fstab.
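One way to picture the elevation requirement is as the user identity attached to the task definition. The sketch below builds a plain dictionary in the JSON shape that, to my understanding, the Batch REST API uses for a start task (commandLine, userIdentity, waitForSuccess); treat the field names as an assumption and check them against the API version you target. The mount command shown is a placeholder, not a complete command.

```python
import json


def elevated_identity(scope="pool"):
    """Auto-user identity with admin elevation (assumed REST field names)."""
    return {"autoUser": {"scope": scope, "elevationLevel": "admin"}}


def start_task_body(command_line):
    """Start task definition that runs the given command line with elevation."""
    return {
        "commandLine": command_line,
        "userIdentity": elevated_identity(),
        # Hold tasks back until the mount has succeeded on the node.
        "waitForSuccess": True,
    }


if __name__ == "__main__":
    # Placeholder command; the real one would perform the mount.
    body = start_task_body("/bin/bash -c 'mount -t cifs ...'")
    print(json.dumps(body, indent=2))
```

The same userIdentity fragment would apply to a JobPreparationTask or JobReleaseTask if you mount at the job level instead.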

If you go with the first option, the mount will be available to the compute node all the time, regardless of whether a job that requires it is running on that node. There are advantages and disadvantages to each approach. Your requirements, whether compliance-related or technical, should help you choose.
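Once the share is mounted, the "one task per file" part of the question could be sketched as follows. This is only an illustration under assumptions: it must run somewhere the share is already mounted, process.py is a hypothetical per-file script, and the dictionaries merely mirror the id and commandLine fields of a task definition rather than calling the Batch SDK.

```python
import shlex
from pathlib import Path


def task_specs(mount_point, script="process.py"):
    """Return one task definition per regular file found under mount_point."""
    files = sorted(p for p in Path(mount_point).rglob("*") if p.is_file())
    specs = []
    for index, path in enumerate(files):
        # Quote the file path first, then quote the whole command for bash -c.
        inner = f"python3 {script} {shlex.quote(str(path))}"
        specs.append({
            "id": f"task-{index:05d}",
            "commandLine": f"/bin/bash -c {shlex.quote(inner)}",
        })
    return specs
```

Each resulting entry can then be submitted as a separate task under the job, so the pool processes the files in parallel across its nodes.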
