简体   繁体   中英

Boto3/Jenkins client throwing an error while running the code

I am running a daily glue script in one of our AWS machines, which I scheduled it using jenkins.

I am getting the following from the last 15 days. (this daily job is running for almost 6 months and all of a sudden since the 15 days this is happening)

The jenkins console output looks like this

Started by timer
Building in workspace /var/lib/jenkins/workspace/build_name_xyz
[build_name_xyz] $ /bin/sh -xe /tmp/jenkins8188702635955396537.sh
+ /usr/bin/python3 /var/lib/jenkins/path_to_script/glue_crawler.py
Traceback (most recent call last):
  File "/var/lib/jenkins/path_to_script/glue_crawler.py", line 10, in <module>
    response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/usr/local/lib/python3.5/dist-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.InvalidInputException: An error occurred (InvalidInputException) when calling the UpdateCrawler operation: Cannot update Crawler while running. Please stop crawl or wait until it completes to update.
Build step 'Execute shell' marked build as failure
Finished: FAILURE

So, I went ahead and have seen the line 10 in this file

/var/lib/jenkins/path_to_script/glue_crawler.py

That looked something like this.

import boto3
import datetime

glue_client = boto3.client('glue', region_name='region_name')


crawler_name = 'xyz_abc'
today = (datetime.datetime.now()).strftime("%Y_%m_%d")
update_path = 's3://path-to-respective-aws-s3-bucket/%s' % (today)
response = glue_client.update_crawler(Name = crawler_name,Targets = {'S3Targets': [{'Path':update_path}]})
response_crawler = glue_client.start_crawler(
    Name=crawler_name
)
print(response_crawler)

The above throws an error at line 10. I am not understanding what exactly is going wrong on line 10 and hence the jenkins throws an error with the red ball, requesting for some help here. I tried googling on this, but I couldn't find anything.

Just, FYI......if I run the same build (by clicking 'Build Now') using the jenkins UI after sometime, the job runs absolutely fine.

Not sure what exactly is wrong here, any help is highly appreciated.

Thanks in advance!!

The error is self explanatory:

Cannot update Crawler while running. Please stop crawl or wait until it completes to update.

So somehow the crawler was started approximately at the same time and in Glue it's not allowed to update crawler properties when it's running. Please check if there is any other task that starts crawler with name xyz_abc too. Besides that in AWS Console make sure the crawler is configured to run on demand rather than on schedule.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM