HDInsight Oozie: Hive Job Parameters

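I am automating Hive jobs with Oozie scripts.
In workflow.xml, I can read the values defined in the PowerShell script file (the Oozie job script).
In the hql file, however, I cannot get the values defined in that same PowerShell script file.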

PowerShell script file:

$hiveScript = "$storageUri/Oozie/input/useooziewf.hql"
#$hiveScript = "$storageUri/Oozie/input/"
$hiveTableName = "log4jlogs"
$hiveDataFolder = "$storageUri"
$hiveOutputFolder = "$storageUri/OozieOutput"
$passwd = ConvertTo-SecureString $clusterPassword -AsPlainText -Force
$creds = New-Object System.Management.Automation.PSCredential ($clusterUsername, $passwd)
Use-AzureHDInsightCluster $clusterName


$OoziePayload =  @"
<?xml version="1.0" encoding="UTF-8"?>
<configuration>

   <property>
       <name>nameNode</name>
       <value>$storageUri</value>
   </property>

   <property>
       <name>jobTracker</name>
       <value>jobtrackerhost:9010</value>
   </property>

   <property>
       <name>queueName</name>
       <value>default</value>
   </property>

   <property>
       <name>oozie.use.system.libpath</name>
       <value>true</value>
   </property>

   <property>
       <name>hiveScript</name>
       <value>$hiveScript</value>
   </property>

   <property>
       <name>hiveTableName</name>
       <value>$hiveTableName</value>
   </property>

   <property>
       <name>hiveDataFolder</name>
       <value>$hiveDataFolder</value>
   </property>

   <property>
       <name>hiveOutputFolder</name>
       <value>$hiveOutputFolder</value>
   </property>

   <property>
       <name>user.name</name>
       <value>admin</value>
   </property>

   <property>
       <name>oozie.wf.application.path</name>
       <value>$oozieWFPath</value>
   </property>

</configuration>
"@

The Oozie job is created and started like this:

# create Oozie job
Write-Host "Sending the following Payload to the cluster:" -ForegroundColor Green
Write-Host "`n--------`n$OoziePayload`n--------"
$clusterUriCreateJob = "https://$clusterName.azurehdinsight.net:443/oozie/v1/jobs"
$response = Invoke-RestMethod -Method Post -Uri $clusterUriCreateJob -Credential $creds -Body $OoziePayload -ContentType "application/xml" -OutVariable OozieJobName #-debug

$jsonResponse = ConvertFrom-Json (ConvertTo-Json -InputObject $response)
$oozieJobId = $jsonResponse[0].("id")
#Write-Host "Oozie job id is $oozieJobId..."

# start Oozie job
Write-Host "Starting the Oozie job $oozieJobId..." -ForegroundColor Green
$clusterUriStartJob = "https://$clusterName.azurehdinsight.net:443/oozie/v1/job/" + $oozieJobId + "?action=start"
$response = Invoke-RestMethod -Method Put -Uri $clusterUriStartJob -Credential $creds | Format-Table -HideTableHeaders #-debug

Hive job (hql file):

DROP TABLE ${hiveTableName};
CREATE EXTERNAL TABLE ${hiveTableName}(t1 string, t2 string, t3 string, t4 string, t5 string, t6 string, t7 string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' STORED AS TEXTFILE LOCATION '${hiveDataFolder}';

Assuming $oozieWFPath points to an existing workflow XML file, could you try adding the parameters to the Hive action:

<action name="myhiveaction">
    <hive xmlns="uri:oozie:hive-action:0.2">
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <script>${hiveScript}</script>
        <param>hiveTableName=${hiveTableName}</param>
        <param>hiveDataFolder=${hiveDataFolder}</param>
    </hive>
    ...
</action>

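The last two param nodes should pass the Oozie variables through to the Hive script. Properties in the job configuration are resolved inside workflow.xml, but the hql file only sees the variables that the Hive action hands over explicitly with param.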
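For context, here is a minimal sketch of how a complete workflow.xml around that action might look. It is an illustration under assumptions, not the asker's actual file: the workflow name useooziewf, the node names end and fail, the workflow schema version 0.2, and the queue configuration block are all hypothetical. Only the properties jobTracker, nameNode, queueName, hiveScript, hiveTableName and hiveDataFolder come from the payload shown in the question.

```xml
<!-- Hypothetical workflow.xml; names and schema versions are assumptions. -->
<workflow-app name="useooziewf" xmlns="uri:oozie:workflow:0.2">
    <start to="myhiveaction"/>

    <action name="myhiveaction">
        <hive xmlns="uri:oozie:hive-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <script>${hiveScript}</script>
            <!-- Each param makes one Oozie property visible to the hql
                 script under the name on the left of the equals sign. -->
            <param>hiveTableName=${hiveTableName}</param>
            <param>hiveDataFolder=${hiveDataFolder}</param>
        </hive>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Hive action failed: ${wf:errorMessage(wf:lastErrorNode())}</message>
    </kill>

    <end name="end"/>
</workflow-app>
```

With a layout like this, every variable the hql file references as ${...} needs a matching param line. A property such as hiveOutputFolder that appears only in the job configuration would not reach the script.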

You can find a sample in "Use Oozie with HDInsight" at http://azure.microsoft.com/en-us/documentation/articles/hdinsight-use-oozie/. More HDInsight articles are available at http://azure.microsoft.com/zh-cn/documentation/services/hdinsight/.
