简体   繁体   中英

Unable to run airflow scheduler

I have recently installed airflow on an AWS server by using this guide for ubuntu 16.04. After a painful and successful install started the webserver. I tried a sample dag as follows

from airflow.operators.python_operator import PythonOperator
from airflow.operators.dummy_operator import DummyOperator
from datetime import timedelta
from airflow import DAG
import airflow


# DEFAULT ARGS
default_args = {
'owner': 'airflow',
'start_date': airflow.utils.dates.days_ago(2),
'depends_on_past': False}


dag = DAG('init_run', default_args=default_args, description='DAG SAMPLE',
schedule_interval='@daily')


def print_something():
        print("HELLO AIRFLOW!")


with dag:
        task_1 = PythonOperator(task_id='do_it', python_callable=print_something)
        task_2 = DummyOperator(task_id='dummy')

        task_1 << task_2

But when i open the UI the tasks in the dag are still in "No Status" no matter how many times i trigger manually or refresh the page.

Later i found out that airflow scheduler is not running and shows the following error:

{celery_executor.py:228} ERROR - Error sending Celery task:No module named 'MySQLdb'
Celery Task ID: ('init_run', 'dummy', datetime.datetime(2019, 5, 30, 18, 0, 24, 902499, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 1)
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/site-packages/airflow/executors/celery_executor.py", line 118, in send_task_to_executor
    result = task.apply_async(args=[command], queue=queue)
  File "/usr/local/lib/python3.7/site-packages/celery/app/task.py", line 535, in apply_async
    **options
  File "/usr/local/lib/python3.7/site-packages/celery/app/base.py", line 728, in send_task
    amqp.send_task_message(P, name, message, **options)
  File "/usr/local/lib/python3.7/site-packages/celery/app/amqp.py", line 552, in send_task_message
    **properties
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 181, in publish
    exchange_name, declare,
  File "/usr/local/lib/python3.7/site-packages/kombu/connection.py", line 510, in _ensured
    return fun(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in _publish
    [maybe_declare(entity) for entity in declare]
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 194, in <listcomp>
    [maybe_declare(entity) for entity in declare]
  File "/usr/local/lib/python3.7/site-packages/kombu/messaging.py", line 102, in maybe_declare
    return maybe_declare(entity, self.channel, retry, **retry_policy)
  File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 121, in maybe_declare
    return _maybe_declare(entity, channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/common.py", line 145, in _maybe_declare
    entity.declare(channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 608, in declare
    self._create_queue(nowait=nowait, channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 617, in _create_queue
    self.queue_declare(nowait=nowait, passive=False, channel=channel)
  File "/usr/local/lib/python3.7/site-packages/kombu/entity.py", line 652, in queue_declare
    nowait=nowait,
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/virtual/base.py", line 531, in queue_declare
    self._new_queue(queue, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 82, in _new_queue
    self._get_or_create(queue)
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 70, in _get_or_create
    obj = self.session.query(self.queue_cls) \
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 65, in session
    _, Session = self._open()
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 56, in _open
    engine = self._engine_from_config()
  File "/usr/local/lib/python3.7/site-packages/kombu/transport/sqlalchemy/__init__.py", line 51, in _engine_from_config
    return create_engine(conninfo.hostname, **transport_options)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/__init__.py", line 443, in create_engine
    return strategy.create(*args, **kwargs)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/engine/strategies.py", line 87, in create
    dbapi = dialect_cls.dbapi(**dbapi_args)
  File "/usr/local/lib/python3.7/site-packages/sqlalchemy/dialects/mysql/mysqldb.py", line 104, in dbapi
    return __import__("MySQLdb")
ModuleNotFoundError: No module named 'MySQLdb'

Here is the setting in the config file (airflow.cfg):

sql_alchemy_conn = postgresql+psycopg2://airflow@localhost:5432/airflow
broker_url = sqla+mysql://airflow:airflow@localhost:3306/airflow
result_backend =  db+postgresql://airflow:airflow@localhost/airflow

I been struggling with this issue for two days now, Please help

In your airflow.cfg , there should also be a config option for celery_result_backend . Are you able to let us know what this value is set to? If it is not present in your config, set it to the same value as the result_backend

ie:

celery_result_backend =  db+postgresql://airflow:airflow@localhost/airflow

And then restart the airflow stack to ensure the configuration changes apply.

(I wanted to leave this as a comment but don't have enough rep to do so)

I think the example you are following didnt told you to install mysql and it seems you are using it in broker URL.

you can install mysql and than configure it. (for python 3.5+)

pip install mysqlclient

Alternatively , for a quick fix. You can also use rabbit MQ(Rabbitmq is a message broker, that you will require to rerun airflow dags with celery) guest user login

and then your broker_url will be

broker_url = amqp://guest:guest@localhost:5672//

if not already installed, Rabbitmq can be installed with following command.

sudo apt install rabbitmq-server

Change configuration NODE_IP_ADDRESS=0.0.0.0 in configuration file located at

/etc/rabbitmq/rabbitmq-env.conf

start RabbitMQ service

sudo service rabbitmq-server start

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM