
AWS Data Pipeline activity with multiple inputs

As part of an AWS data pipeline, I have a Hive activity that uses two unstaged S3 data nodes as input. I want to set two script variables on the activity, each pointing to one of the input data nodes, but I can't get the syntax right. With a single input, the following works fine:

INPUT_FOO=#{input.directoryPath}

When I add the second input, I no longer know how to reference them, because the inputs are now an array, as the pipeline definition below shows. Essentially, I want to achieve the following, but can't figure out the correct syntax:

INPUT_FOO=#{input[1].directoryPath}
INPUT_BAR=#{input[2].directoryPath}

Here's the activity portion of the pipeline definition:

{
  "id": "ActivityId_7u1sR",
  "input": [
    {
      "ref": "DataNodeId_iYnxf"
    },
    {
      "ref": "DataNodeId_162Ka"
    }
  ],
  "schedule": {
    "ref": "DefaultSchedule"
  },
  "scriptUri": "#{myS3ScriptLocation}calculate-results.q",
  "name": "Perform Calculations",
  "runsOn": {
    "ref": "EmrClusterId_jHeiV"
  },
  "scriptVariable": [
    "INPUT_SOURCE1=#{input[1].directoryPath}",
    "OUTPUT=#{output.directoryPath}Results/",
    "INPUT_SOURCE2=#{input[2].directoryPath}"
  ],
  "output": {
    "ref": "DataNodeId_2jY6v"
  },
  "type": "HiveActivity",
  "stage": "false"
}

I plan to keep the tables unstaged and handle table creation in the Hive script, so that each Hive activity is easier to run in isolation as well as within the pipeline itself.

Here's the error I see when using array syntax:

Unable to resolve input[1].directoryPath for object ActivityId_7u1sR'

As things currently stand, this scenario is not supported, but a feature request has been filed to support it in the future.
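Until that is supported, one possible workaround is to avoid referencing the inputs from the activity at all. The fragment below is only a sketch and has not been verified against the question's pipeline: it assumes the two S3 paths are known when the pipeline is activated, and it passes each one as a pipeline parameter that both the data node and the script variable reference. The parameter names `myInputFooPath` and `myInputBarPath` are hypothetical, and the `S3DataNode` definitions are assumed, since the question does not show them. The activity's other fields (`schedule`, `runsOn`, `scriptUri`, `name`) would stay as in the question and are left out here.

```json
{
  "objects": [
    {
      "id": "DataNodeId_iYnxf",
      "type": "S3DataNode",
      "directoryPath": "#{myInputFooPath}"
    },
    {
      "id": "DataNodeId_162Ka",
      "type": "S3DataNode",
      "directoryPath": "#{myInputBarPath}"
    },
    {
      "id": "ActivityId_7u1sR",
      "type": "HiveActivity",
      "stage": "false",
      "input": [
        { "ref": "DataNodeId_iYnxf" },
        { "ref": "DataNodeId_162Ka" }
      ],
      "output": { "ref": "DataNodeId_2jY6v" },
      "scriptVariable": [
        "INPUT_SOURCE1=#{myInputFooPath}",
        "INPUT_SOURCE2=#{myInputBarPath}",
        "OUTPUT=#{output.directoryPath}Results/"
      ]
    }
  ],
  "parameters": [
    {
      "id": "myInputFooPath",
      "type": "AWS::S3::ObjectKey",
      "description": "Hypothetical parameter: S3 directory of the first input"
    },
    {
      "id": "myInputBarPath",
      "type": "AWS::S3::ObjectKey",
      "description": "Hypothetical parameter: S3 directory of the second input"
    }
  ]
}
```

The trade-off is that each path lives in a parameter rather than only in its data node, so the script variables no longer depend on how the `input` array is indexed.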
