Airflow Task Works in Bash, Fails When Scheduled

I'm running:

Ubuntu 16.04
airflow v1.8.1
python 3.5

Airflow is running in a docker container.

I've got an airflow dag that is a single task -- a BashOperator that runs a python script:

from airflow import DAG
from airflow.operators.bash_operator import BashOperator
from airflow.operators.python_operator import PythonOperator
from airflow.operators.sensors import ExternalTaskSensor
from airflow.operators import DummyOperator
from datetime import date, datetime, timedelta

start_date = date.today() - timedelta(1)

default_args = {
    'owner': 'airflow',
    'depends_on_past': False,
    'start_date': datetime(start_date.year, start_date.month, start_date.day),
    'retries': 0,
    'retry_delay': timedelta(minutes=5)
}

dag = DAG('$MY_DAG_NAME', default_args=default_args, max_active_runs=1, schedule_interval="35 */2 * * *")
dag.catchup = False

t1 = BashOperator(dag=dag,
                  task_id='$TASK_1',
                  bash_command='python /airflow/scripts/$MY_PYTHON_SCRIPT.py')


When I run python /airflow/scripts/$MY_PYTHON_SCRIPT.py at bash, it completes just fine. I monitor the memory usage with htop and never reach more than one third of total memory -- the script maxes out at about 10% MEM% usage.

About two thirds of the time that this runs on the airflow schedule, however, it fails with the following error, seemingly at random (the other third of the time it runs with no problem!):

[2018-08-22 07:36:33,979] {helpers.py:233} INFO - Terminating descendant processes of ['/opt/conda/envs/python35/bin/python', '/opt/conda/envs/python35/bin/airflow', 'run', '$MY_DAG_NAME', '$TASK_1', '2018-08-22T12:35:00', '--job_id', '650', '--raw', '-sd', 'DAGS_FOLDER/$MY_PYTHON_SCRIPT.py'] PID: 5200
[2018-08-22 07:36:33,979] {helpers.py:237} INFO - Terminating descendant process ['bash', '/tmp/airflowtmpwsq8ozwo/$TASK_122bzex5b'] PID: 5209
[2018-08-22 07:36:33,984] {helpers.py:195} ERROR - b''
[2018-08-22 07:36:33,984] {helpers.py:196} INFO - Killed process 5209 with signal 15
[2018-08-22 07:36:33,984] {helpers.py:237} INFO - Terminating descendant process ['python', '/airflow/scripts/$MY_PYTHON_SCRIPT.py'] PID: 5210
[2018-08-22 07:36:33,989] {helpers.py:195} ERROR - b''
[2018-08-22 07:36:33,989] {helpers.py:196} INFO - Killed process 5210 with signal 15
[2018-08-22 07:36:33,989] {helpers.py:242} INFO - Waiting up to 60s for processes to exit...
[2018-08-22 07:36:34,327] {base_task_runner.py:98} INFO - Subtask: [2018-08-22 07:36:34,326] {bash_operator.py:105} INFO - Command exited with return code -15
[2018-08-22 07:36:34,335] {models.py:1595} ERROR - Bash command failed
Traceback (most recent call last):
  File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/models.py", line 1493, in _run_raw_task
    result = task_copy.execute(context=context)
  File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/operators/bash_operator.py", line 109, in execute
    raise AirflowException("Bash command failed")
airflow.exceptions.AirflowException: Bash command failed
[2018-08-22 07:36:34,336] {models.py:1624} INFO - Marking task as FAILED.
[2018-08-22 07:36:34,352] {models.py:1644} ERROR - Bash command failed
[2018-08-22 07:36:34,353] {base_task_runner.py:98} INFO - Subtask: /opt/conda/envs/python35/lib/python3.5/site-packages/airflow/utils/helpers.py:351: DeprecationWarning: Importing DummyOperator directly from <module 'airflow.operators' from '/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/operators/__init__.py'> has been deprecated. Please import from '<module 'airflow.operators' from '/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/operators/__init__.py'>.[operator_module]' instead. Support for direct imports will be dropped entirely in Airflow 2.0.
[2018-08-22 07:36:34,353] {base_task_runner.py:98} INFO - Subtask:   DeprecationWarning)
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask: Traceback (most recent call last):
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask:   File "/opt/conda/envs/python35/bin/airflow", line 27, in <module>
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask:     args.func(args)
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask:   File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/bin/cli.py", line 392, in run
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask:     pool=args.pool,
[2018-08-22 07:36:34,354] {base_task_runner.py:98} INFO - Subtask:   File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/utils/db.py", line 50, in wrapper
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask:     result = func(*args, **kwargs)
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask:   File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/models.py", line 1493, in _run_raw_task
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask:     result = task_copy.execute(context=context)
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask:   File "/opt/conda/envs/python35/lib/python3.5/site-packages/airflow/operators/bash_operator.py", line 109, in execute
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask:     raise AirflowException("Bash command failed")
[2018-08-22 07:36:34,355] {base_task_runner.py:98} INFO - Subtask: airflow.exceptions.AirflowException: Bash command failed
[2018-08-22 07:36:34,363] {helpers.py:245} INFO - Done waiting

Anyone come across this before or have any debugging tips? It's driving me nuts.

Thanks!

Your python script is getting run, but it exits with a non-zero return code, which means something in that script is failing. (Note that a negative return code such as the -15 in your log means the process was terminated by a signal -- here signal 15, SIGTERM.)
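To see how a negative return code maps to a signal, here is a minimal sketch using only the standard library: the child process sends SIGTERM to itself, and `subprocess` reports that as a returncode of -15 (POSIX behaviour, matching the `return code -15` line in the log):

```python
import signal
import subprocess
import sys

# Spawn a child that kills itself with SIGTERM, the same signal the
# Airflow helper sent to the bash/python processes in the log above.
proc = subprocess.run(
    [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGTERM)"]
)

# On POSIX, a child killed by signal N gets returncode -N.
print(proc.returncode)  # -15
```

So "Command exited with return code -15" means the script did not raise and exit on its own; it was killed externally with SIGTERM.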

The ERROR - b'' suggests some kind of encoding problem with bytes.

I'd say the best way to debug this is to put some kind of logging in your python code, e.g. logging.info("at point a in code"), so you can see how far the script gets.
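A minimal sketch of that checkpoint-logging approach (the function and the placeholder workload are hypothetical; the point is the `logging.info` markers before and after each stage):

```python
import logging
import sys

# Log to stdout so the messages appear in the Airflow task log.
logging.basicConfig(stream=sys.stdout, level=logging.INFO,
                    format="%(asctime)s %(levelname)s %(message)s")

def run_job():
    logging.info("checkpoint A: starting")
    # ... the real work would go here; a placeholder computation for the sketch ...
    result = sum(range(10))
    logging.info("checkpoint B: finished, result=%s", result)
    return result

if __name__ == "__main__":
    run_job()
```

If the task is killed mid-run, the last checkpoint that made it into the log tells you roughly where the script was when it received SIGTERM.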

That said, you might get better debugging and stack traces if you use a PythonOperator and put your python code in there.
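A sketch of what that could look like -- the callable `process_data` and its body are hypothetical placeholders; the commented-out wiring reuses the `dag` and `task_id` from the question, with the Airflow 1.8-era import path:

```python
def process_data(**kwargs):
    # Any exception raised here surfaces as a full Python traceback in the
    # task log, rather than the opaque "Bash command failed" you get from
    # a BashOperator.
    records = [1, 2, 3]  # placeholder workload
    return sum(records)

# Wiring it into the existing DAG (sketch only, not executed here):
# from airflow.operators.python_operator import PythonOperator
# t1 = PythonOperator(dag=dag,
#                     task_id='$TASK_1',
#                     python_callable=process_data,
#                     provide_context=True)
```

The callable's return value is also pushed to XCom automatically, which can help confirm the task ran to completion.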

