
Python and AWS EMR Steps: Using os.system to run commands such as chmod not working when run as an EMR step

My team is working in AWS, and we have Python scripts that do some basic moving of files from an S3 bucket to the EC2 instance. I want to preface this by saying the script works when run directly on the EC2 instance; it only fails when run as an EMR step (we are attempting to automate it). Here are some snippets of the code that works manually but not as a step.

1: create a logger

import os, sys, boto3
import logging, datetime
import Configuration as cfg

# setup logger for this module
logger = logging.getLogger(__name__)
logger.setLevel(logging.INFO)
formatter = logging.Formatter(cfg.logFormatterStr)
logFileName = os.path.splitext(os.path.basename(__file__))[0] + '_' + \
                 datetime.datetime.now().strftime('%Y%m%d_%H%M%S.log')
file_handler = logging.FileHandler(logFileName)
file_handler.setFormatter(formatter)
logger.addHandler(file_handler)
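One thing worth noting about the logger above: `logFileName` is a relative path, so the handler creates the log file in whatever directory the process happens to start in. A minimal sketch of building the log path from an absolute, known directory instead, so it does not depend on the working directory (on the cluster that directory would be something like `/home/hadoop`; a temporary directory and the name `mymodule` stand in for it here):

```python
import os, datetime, logging, tempfile

# Hypothetical sketch: anchor the log file to an absolute directory instead
# of the process working directory, so a process launched from elsewhere
# (e.g. an EMR step directory) still writes the log where expected.
log_dir = tempfile.mkdtemp()  # stand-in for '/home/hadoop'
log_name = 'mymodule_' + datetime.datetime.now().strftime('%Y%m%d_%H%M%S') + '.log'
log_path = os.path.join(log_dir, log_name)

logger = logging.getLogger('demo')
logger.setLevel(logging.INFO)
handler = logging.FileHandler(log_path)  # absolute path: CWD no longer matters
handler.setFormatter(logging.Formatter('%(asctime)s %(levelname)s %(message)s'))
logger.addHandler(handler)
logger.info('logger writes to %s' % log_path)
handler.flush()
```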

2: we download the objects.

for pre in prefixes:
    for obj in SB.objects.filter(Prefix=pre):
        if '.' in obj.key:
            temp = obj.key.split('/')
            objList.append((obj.key,temp[-1]))
    for item in objList:
        SB.download_file(item[0],os.getenv("HOME") + '/' + item[1])
        logger.info('Downloaded - %s' % item[0])
    objList[:] = []
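The key-to-local-path mapping in that loop can be sketched without boto3, using `os.path.join` instead of string concatenation (the keys below are hypothetical stand-ins for what `SB.objects.filter` would yield):

```python
import os

# Sketch of the mapping above: keep only keys that look like files
# (contain a '.'), pair each key with its basename, and build the
# local target path under $HOME.
keys = ['scripts/etl/job1.py', 'scripts/hive/report.hql', 'scripts/subdir']
objList = []
for key in keys:
    if '.' in key:                      # skip "directory" placeholder keys
        objList.append((key, key.split('/')[-1]))

home = os.getenv('HOME', '/home/hadoop')
targets = [os.path.join(home, name) for _, name in objList]
```

With real boto3, `item[0]` would be passed to `SB.download_file` as the key and the joined path as the destination.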

3: Then we try to use os.system to perform a chmod command, as well as mkdir and mv.

os.system('chmod 775 *.py')

# Move HQL files to a subfolder
os.system('mkdir -p hive')
os.system('mv -f *.hql hive')
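Since those globs are resolved relative to the current working directory, one hedged alternative is to do the same operations with the Python standard library against an explicit directory, so the result no longer depends on where the step launches the script. On the cluster `work_dir` would be `/home/hadoop`; a temporary directory with two dummy files stands in for it here:

```python
import glob, os, shutil, tempfile

# Hypothetical fix: chmod/mkdir/mv via the stdlib, anchored to an explicit
# directory instead of relative shell globs.
work_dir = tempfile.mkdtemp()  # stand-in for '/home/hadoop'
open(os.path.join(work_dir, 'job.py'), 'w').close()
open(os.path.join(work_dir, 'report.hql'), 'w').close()

for py in glob.glob(os.path.join(work_dir, '*.py')):
    os.chmod(py, 0o775)                       # chmod 775 *.py

hive_dir = os.path.join(work_dir, 'hive')
os.makedirs(hive_dir, exist_ok=True)          # mkdir -p hive
for hql in glob.glob(os.path.join(work_dir, '*.hql')):
    shutil.move(hql, os.path.join(hive_dir, os.path.basename(hql)))  # mv -f *.hql hive
```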

Step 2 works: the files are downloaded to the EC2 instance. But for some reason the log file is never written or created, and we get errors for all of the os.system commands.

chmod: cannot access ‘*.py’: No such file or directory
mv: cannot stat ‘*.hql’: No such file or directory

(We are pretty sure the unusual characters around *.hql and *.py are just an artifact of how Amazon logs the quotation marks.)

One of my team members managed to troubleshoot and find the cause of these errors. His statement is below, for others:

My script was located under /home/hadoop. When I ran that script as an EMR step (with an argument of the file's location in /home/hadoop), the script was run under a different directory (/mnt/var/lib/hadoop/steps/{ unique-step-ID }). Since the script was looking for a file under /home/hadoop, it could not find it, and the failure looked like a permission issue.
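That diagnosis can be confirmed and worked around in a few lines: log the working directory at startup, then chdir to the intended directory so relative paths and globs behave the same as a manual run. A minimal sketch, assuming the script's files live under `$HOME` (falling back to `/home/hadoop` as in the original post):

```python
import os

# Print where the step actually launched us (on EMR this would show
# /mnt/var/lib/hadoop/steps/<unique-step-ID>, not /home/hadoop).
print('step started in:', os.getcwd())

# Switch to the directory the script assumes, so relative globs like
# '*.py' and '*.hql' resolve as they do when run manually.
target_dir = os.getenv('HOME', '/home/hadoop')
os.chdir(target_dir)
print('now working in:', os.getcwd())
```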
