Passing template variables to HiveOperator

I have a Jinja template that I plan to use for dynamic SQL generation in Hive. My template looks as follows:

USE {{ db }};

CREATE EXTERNAL TABLE IF NOT EXISTS foo (
    A int,
    B int
)
stored as parquet
location '….';

"db" is something that can be derived by making a function call. I decided to write an operator extending HiveExecOperator. In my environment the class hierarchy is:

BaseOperator <- BaseExecOperator <- HiveExecOperator

My TestHive operator looks like the following:

class TestHive(HiveExecOperator):
    def pre_execute(self, context):
        context['db'] = func1(…,,)
        return context['ti'].render_templates()

This one does not work: {{ db }} inside the template doesn't receive a value, so the Hive statement fails. I also tried overriding render_template in TestHive as follows:

class TestHive(HiveExecOperator):
    def render_template(self, attr, content, context):
        context['db'] = func1(..,)
        return super(TestHive, self).render_templates(attr, content, context)

This one fails because the parent class of TestHive doesn't have a render_templates method.

The method "render_templates" is only defined in BaseOperator.

Any help is appreciated.

Assuming you mean HiveOperator and not HiveExecOperator, and having a look at what you're describing, I don't believe you should need to derive any kind of operator here. Unless there's some extra missing info which I'm not seeing, you're simply asking how to pass the value of a function call as a parameter into a templated command.

The hql argument of HiveOperator is a template field. That means you can define your template as you already have and supply the value for it in the operator call. Remember to prefix the variable with "params." inside the template, because values passed through the params argument are exposed to the template under that name. See:

my_query = """
    USE {{ params.db }};

    CREATE EXTERNAL TABLE IF NOT EXISTS foo (
    A int,
    B int
    )
    stored as parquet
    location .......
    """

run_hive_query = HiveOperator(
    task_id="my_task",
    hql=my_query,
    params={ 'db': func1(...) },
    dag=dag
)
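
To see why the bare {{ db }} rendered as nothing while {{ params.db }} works, here is a minimal sketch that uses Jinja directly, outside Airflow. It assumes Jinja's default handling of undefined names, and the func1 below is a hypothetical stand-in with a made-up return value, since the real function isn't shown in the question:

```python
from jinja2 import Template


def func1():
    # Hypothetical stand-in for the asker's func1(...); the value is made up.
    return "analytics_db"


query = "USE {{ params.db }};"
bare_query = "USE {{ db }};"

# Airflow puts the operator's params dict into the template context
# under the name "params"; this dict mimics only that one entry.
context = {"params": {"db": func1()}}

rendered = Template(query).render(**context)
rendered_bare = Template(bare_query).render(**context)

print(rendered)       # USE analytics_db;
print(rendered_bare)  # USE ;  (an undefined name renders as an empty string)
```

Note that in the HiveOperator call above, func1(...) is evaluated when the params dict is built, i.e. when the DAG file is parsed, not when the task runs.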
