
Saving hive output through oozie using “>”

Is something like this possible in Oozie?

    hive -f hiveScript.hql > output.txt

I have the following Oozie Hive action for the command above:

    <!-- The enclosing action element and its name "hive-node" are placeholders; the original snippet omitted them. -->
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>hiveScript.hql</script>
        </hive>
        <ok to="end" />
        <error to="kill" />
    </action>

How can I tell the script where the output should go?

That is not possible with Oozie in the way that you want. This is because Oozie starts most of its workflow actions on nodes within the cluster, so a shell redirect would write to the local disk of whichever node happened to run the action.

You could use the Oozie Shell action to run hive -f hiveScript.hql > output.txt. However, this requires Hive to be installed on every node, hiveScript.hql to be available on every node, and so on. It also doesn't quite work because your output file would end up on whichever node was assigned to run the shell action. https://oozie.apache.org/docs/3.3.0/DG_ShellActionExtension.html
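
For illustration, a Shell action along these lines might look like the sketch below. Treat it as an assumption-laden example rather than a tested workflow: the action name hive-shell and the wrapper script run_hive.sh are hypothetical (the wrapper would contain the hive -f hiveScript.hql > output.txt line, since the exec element is not run through a shell and so cannot carry the redirect itself).

    <!-- Sketch only: "hive-shell" and run_hive.sh are hypothetical names. -->
    <action name="hive-shell">
        <shell xmlns="uri:oozie:shell-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <!-- The wrapper script holds the hive command and the redirect. -->
            <exec>run_hive.sh</exec>
            <!-- Ship the wrapper and the Hive script to the node that runs the action. -->
            <file>run_hive.sh#run_hive.sh</file>
            <file>hiveScript.hql#hiveScript.hql</file>
        </shell>
        <ok to="end" />
        <error to="kill" />
    </action>

Shipping the files with the file element should ease the "script must be everywhere" problem, but Hive still has to be installed on the node, and output.txt still lands on that node's local disk.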

I think your best bet would be to include INSERT OVERWRITE DIRECTORY '/tmp/hdfs_out' SELECT * FROM ... in your hiveScript.hql file and pull the results down from HDFS afterwards.
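
If you would rather not hard-code the directory, the Hive action's param element can pass it in. The following is a rough sketch, assuming a hypothetical parameter name OUTPUT_DIR and a script that reads INSERT OVERWRITE DIRECTORY '${OUTPUT_DIR}' SELECT * FROM ...; the action name is a placeholder as well.

    <!-- Sketch only: "hive-node" and OUTPUT_DIR are hypothetical names. -->
    <action name="hive-node">
        <hive xmlns="uri:oozie:hive-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>hiveScript.hql</script>
            <!-- Referenced inside hiveScript.hql as ${OUTPUT_DIR}. -->
            <param>OUTPUT_DIR=/tmp/hdfs_out</param>
        </hive>
        <ok to="end" />
        <error to="kill" />
    </action>

Once the workflow has finished, something like hdfs dfs -getmerge /tmp/hdfs_out output.txt, run from a machine with an HDFS client, should concatenate the result files into a single local file.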

Edit: Another option I just thought of would be to use the SSH Action. https://oozie.apache.org/docs/3.2.0-incubating/DG_SshActionExtension.html You could potentially have the SSH Action shell into your target machine and run hive -f hiveScript.hql > output.txt there.
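
A minimal sketch of such an SSH action follows. Everything specific in it is an assumption: the action name hive-ssh, the host user@edge-node.example.com, and the wrapper script /home/user/run_hive.sh, which would contain the hive command with its redirect. As far as I know, the SSH action also needs passwordless SSH set up from the Oozie server to the target host.

    <!-- Sketch only: the action name, host, and script path are hypothetical. -->
    <action name="hive-ssh">
        <ssh xmlns="uri:oozie:ssh-action:0.1">
            <!-- Machine where Hive is installed and output.txt should be written. -->
            <host>user@edge-node.example.com</host>
            <!-- Wrapper script on that machine running: hive -f hiveScript.hql > output.txt -->
            <command>/home/user/run_hive.sh</command>
        </ssh>
        <ok to="end" />
        <error to="kill" />
    </action>

Keeping the redirect inside a script on the target machine avoids relying on how the action passes special characters such as > through to the remote shell.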
