[英]AWS Data Pipeline - Set Hive site values during EMR Creation
我們將數據管道版本從3.3.2升級到5.8,因此舊AMI版本上的那些引導操作已更改為使用configuration
來設置,並在分類/屬性定義下指定它們。
所以我的傑森看起來像下面
{
"enableDebugging": "true",
"taskInstanceBidPrice": "1",
"terminateAfter": "2 Hours",
"name": "ExportCluster",
"taskInstanceType": "m1.xlarge",
"schedule": {
"ref": "Default"
},
"emrLogUri": "s3://emr-script-logs/",
"coreInstanceType": "m1.xlarge",
"coreInstanceCount": "1",
"taskInstanceCount": "4",
"masterInstanceType": "m3.xlarge",
"keyPair": "XXXX",
"applications": ["hadoop","hive", "tez"],
"subnetId": "XXXXX",
"logUri": "s3://pipelinedata/XXX",
"releaseLabel": "emr-5.8.0",
"type": "EmrCluster",
"id": "EmrClusterWithNewEMRVersion",
"configuration": [
{ "ref": "configureEmrHiveSite" }
]
},
{
"myComment": "This object configures hive-site xml.",
"name": "HiveSite Configuration",
"type": "HiveSiteConfiguration",
"id": "configureEmrHiveSite",
"classification": "hive-site",
"property": [
{"ref": "hive-exec-compress-output" }
]
},
{
"myComment": "This object sets a hive-site configuration
property value.",
"name":"hive-exec-compress-output",
"type": "Property",
"id": "hive-exec-compress-output",
"key": "hive.exec.compress.output",
"value": "true"
}
],
"parameters": []
使用上面的Json文件,它被加載到Data Pipeline中,但是拋出錯誤消息
Object:HiveSite Configuration
ERROR: 'HiveSiteConfiguration'
Object:ExportCluster
ERROR: 'configuration' values must be of type 'null'. Found values of type 'null'
我不確定這的真正含義是什么,請您告訴我我是否正確指定了該信息,我認為我是根據http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure- apps.html
下面的塊應名為“ EMR Configuration”,然后才能被AWS Data Pipeline正確識別,並相應地設置Hive-site.xml。
{
"myComment": "This object configures hive-site xml.",
"name": "EMR Configuration",
"type": "EmrConfiguration",
"id": "configureEmrHiveSite",
"classification": "hive-site",
"property": [
{"ref": "hive-exec-compress-output" }
]
},
聲明:本站的技術帖子網頁,遵循CC BY-SA 4.0協議,如果您需要轉載,請注明本站網址或者原文地址。任何問題請咨詢:yoyou2525@163.com.