
AWS Data Pipeline - Set Hive site values during EMR Creation

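We are upgrading our Data Pipeline from version 3.3.2 to 5.8, so the bootstrap actions we used on the old AMI version now have to be set up through the configuration field, with the values specified under classification/property definitions.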

So my JSON looks like the following:

  {
            "enableDebugging": "true",
            "taskInstanceBidPrice": "1",
            "terminateAfter": "2 Hours",
            "name": "ExportCluster",
            "taskInstanceType": "m1.xlarge",
            "schedule": {
                "ref": "Default"
            },
            "emrLogUri": "s3://emr-script-logs/",
            "coreInstanceType": "m1.xlarge",
            "coreInstanceCount": "1",
            "taskInstanceCount": "4",
            "masterInstanceType": "m3.xlarge",
            "keyPair": "XXXX",
            "applications": ["hadoop","hive", "tez"],
            "subnetId": "XXXXX",
            "logUri": "s3://pipelinedata/XXX",
            "releaseLabel": "emr-5.8.0",
            "type": "EmrCluster",
            "id": "EmrClusterWithNewEMRVersion",
            "configuration": [
                { "ref": "configureEmrHiveSite" }
            ]
        },
        {
            "myComment": "This object configures hive-site xml.",
            "name": "HiveSite Configuration",
            "type": "HiveSiteConfiguration",
            "id": "configureEmrHiveSite",
            "classification": "hive-site",
            "property": [
                {"ref": "hive-exec-compress-output" }
            ]
        },
        {
            "myComment": "This object sets a hive-site configuration property value.",
            "name":"hive-exec-compress-output",
            "type": "Property",
            "id": "hive-exec-compress-output",
            "key": "hive.exec.compress.output",
            "value": "true"
        }
    ],
    "parameters": []

With the JSON file above, the definition loads into Data Pipeline, but it throws the following error messages:

Object:HiveSite Configuration
ERROR: 'HiveSiteConfiguration'
Object:ExportCluster
ERROR: 'configuration' values must be of type 'null'. Found values of type 'null'

I am not sure what this really means. Could you please let me know whether I am specifying this correctly? I think I am, according to http://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-configure-apps.html

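The block below must have the name "EMR Configuration" (note that its type is EmrConfiguration rather than HiveSiteConfiguration). Only then does AWS Data Pipeline recognize it correctly and set hive-site.xml accordingly.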

   {
        "myComment": "This object configures hive-site xml.",
        "name": "EMR Configuration",
        "type": "EmrConfiguration",
        "id": "configureEmrHiveSite",
        "classification": "hive-site",
        "property": [
            {"ref": "hive-exec-compress-output" }
        ]
    },
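A mistake like this can be caught locally before uploading the definition. The following is only a minimal sketch, not part of any AWS tooling: the function name find_ref_type_errors, the EXPECTED_REF_TYPES table, and the trimmed question_definition are all hypothetical, and the sketch assumes the pipeline objects sit under a top-level "objects" key. It checks only that each "configuration" reference points at an object of type EmrConfiguration and each "property" reference at an object of type Property.

```python
# Assumption: only these two reference fields are checked, each mapped to the
# object type it is expected to point at.
EXPECTED_REF_TYPES = {
    "configuration": "EmrConfiguration",
    "property": "Property",
}


def find_ref_type_errors(definition):
    """Return messages for refs that point at missing or wrongly typed objects."""
    objects = {obj["id"]: obj for obj in definition.get("objects", [])}
    errors = []
    for obj in objects.values():
        for field, expected in EXPECTED_REF_TYPES.items():
            refs = obj.get(field, [])
            if isinstance(refs, dict):
                # A single {"ref": ...} value is treated like a one-item list.
                refs = [refs]
            if not isinstance(refs, list):
                continue
            for ref in refs:
                if not isinstance(ref, dict) or "ref" not in ref:
                    continue
                target = objects.get(ref["ref"])
                if target is None:
                    errors.append(
                        f"{obj['id']}.{field}: no object with id {ref['ref']!r}"
                    )
                elif target.get("type") != expected:
                    errors.append(
                        f"{obj['id']}.{field}: {ref['ref']!r} has type "
                        f"{target.get('type')!r}, expected {expected!r}"
                    )
    return errors


# Trimmed copy of the definition from the question (most cluster fields omitted).
question_definition = {
    "objects": [
        {
            "id": "EmrClusterWithNewEMRVersion",
            "type": "EmrCluster",
            "releaseLabel": "emr-5.8.0",
            "configuration": [{"ref": "configureEmrHiveSite"}],
        },
        {
            "id": "configureEmrHiveSite",
            "type": "HiveSiteConfiguration",  # the type used in the question
            "classification": "hive-site",
            "property": [{"ref": "hive-exec-compress-output"}],
        },
        {
            "id": "hive-exec-compress-output",
            "type": "Property",
            "key": "hive.exec.compress.output",
            "value": "true",
        },
    ]
}

if __name__ == "__main__":
    for message in find_ref_type_errors(question_definition):
        print(message)
```

Run against the definition from the question, the check reports the configureEmrHiveSite reference; once that object's type is changed to EmrConfiguration, it reports nothing. It does not replace the validation that Data Pipeline itself performs on upload.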
