
Python and AWS EMR Steps: Using os.system to run commands such as chmod not working when run as an EMR step

My team is working in AWS, and we have Python scripts that do some basic moving of files from an S3 bucket to an EC2 instance. I want to preface this by saying that the script works when run directly from the EC2 instance and is only an issue when run as an EMR step (we are attempting to automate it). Here are some snippets of the code that work manually but not in a step definition.

1: Create a logger.

import os, sys, boto3
import logging, datetime
import Configuration as cfg

# setup logger for this module
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
formatter = logging.Formatter(cfg.logFormatterStr)
logFileName = os.path.splitext(os.path.basename(__file__))[0] + '_' + \
                 datetime.datetime.now().strftime('%Y%m%d_%H%M%S.log')
file_handler = logging.FileHandler(logFileName)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)

2: We download the objects.

for pre in prefixes:
    # Collect (key, file name) pairs for every object under this prefix
    for obj in SB.objects.filter(Prefix=pre):
        if '.' in obj.key:
            temp = obj.key.split('/')
            objList.append((obj.key, temp[-1]))
    # Download each object into the home directory
    for item in objList:
        SB.download_file(item[0], os.getenv("HOME") + '/' + item[1])
        logger.info('Downloaded - %s' % item[0])
    objList[:] = []

3: Then we try to use os.system to perform a chmod command as well as mkdir and mv.

os.system('chmod 775 *.py')

# Move HQL files to a subfolder
os.system('mkdir -p hive')
os.system('mv -f *.hql hive')

Step 2 works: the files are downloaded to the EC2 instance. For some reason, the log file is never written or created, and we get errors for all of the os.system commands.

chmod: cannot access ‘*.py’: No such file or directory
mv: cannot stat ‘*.hql’: No such file or directory

(We are pretty sure the unusual characters around *.hql and *.py are some issue with how Amazon logs the quotation marks.)
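The missing log file may be worth checking against the working directory: logging.FileHandler is given a relative file name in step 1, and a relative name resolves against whatever directory the process starts in. The following is a minimal sketch, not the original code, that anchors the log file to an absolute directory and records the current working directory on startup. The helper name build_file_logger, its parameters, and the format string (standing in for cfg.logFormatterStr) are assumptions made for illustration.

```python
import datetime
import logging
import os


def build_file_logger(name, log_dir, script_path):
    """Return (logger, log_path) with the log file under an absolute log_dir.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    log_dir = os.path.abspath(os.path.expanduser(log_dir))
    os.makedirs(log_dir, exist_ok=True)

    # Same naming scheme as the original snippet: <script>_<timestamp>.log
    stem = os.path.splitext(os.path.basename(script_path))[0]
    stamp = datetime.datetime.now().strftime('%Y%m%d_%H%M%S')
    log_path = os.path.join(log_dir, '%s_%s.log' % (stem, stamp))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_path)
    # Assumed format string; the original reads it from cfg.logFormatterStr
    handler.setFormatter(
        logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
    logger.addHandler(handler)

    # Record where the process is actually running
    logger.info('cwd=%s log=%s', os.getcwd(), log_path)
    return logger, log_path
```

Calling build_file_logger(__name__, os.getenv("HOME"), __file__) would place the log next to the downloaded files, and the first log line shows the directory the step was started from.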

One of my team members managed to troubleshoot and find the cause of his errors. His statement is below for others:

My script was located under /home/hadoop. When I ran that script as an EMR step (with an argument of the file's location in /home/hadoop), the script was run under a different directory (/mnt/var/lib/hadoop/steps/{unique-step-ID}). Since the script that was run was looking for a file under /home/hadoop, it could not find it, and this appeared to be a permission issue.
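Given that explanation, one possible way to make step 3 independent of the working directory is to build every path from an explicit base directory instead of relying on shell globs such as *.py. The sketch below is an illustration under that assumption, not the team's actual fix; the function name prepare_downloads and its base_dir parameter are hypothetical. It uses glob, os.chmod, and os.replace in place of the chmod, mkdir, and mv calls.

```python
import glob
import os


def prepare_downloads(base_dir):
    """chmod 775 the *.py files in base_dir and move *.hql files to base_dir/hive.

    Hypothetical helper: returns (py_files, moved_hql_files).
    """
    base_dir = os.path.abspath(base_dir)

    # Equivalent of: chmod 775 *.py (but anchored to base_dir)
    py_files = glob.glob(os.path.join(base_dir, '*.py'))
    for path in py_files:
        os.chmod(path, 0o775)

    # Equivalent of: mkdir -p hive
    hive_dir = os.path.join(base_dir, 'hive')
    os.makedirs(hive_dir, exist_ok=True)

    # Equivalent of: mv -f *.hql hive (os.replace overwrites an existing file)
    moved = []
    for path in glob.glob(os.path.join(base_dir, '*.hql')):
        dest = os.path.join(hive_dir, os.path.basename(path))
        os.replace(path, dest)
        moved.append(dest)
    return py_files, moved
```

With the download code above, prepare_downloads(os.getenv("HOME")) would operate on the same directory the files were saved to. A simpler alternative would be to call os.chdir on that directory before the os.system calls, at the cost of changing the working directory for the rest of the script.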

