AWS Data Pipeline datetime variable
I am using AWS Data Pipeline to save a text file from RDS to my S3 bucket. I would like the file name to include the date and the hour, like:
myfile-YYYYMMDD-HH.txt
myfile-20140813-12.txt
I have specified my S3DataNode FilePath as:
s3://mybucketname/out/myfile-#{format(myDateTime,'YYYY-MM-dd-HH')}.txt
When I try to save my pipeline, I get the following error:
ERROR: Unable to resolve myDateTime for object:DataNodeId_xOQxz
According to the AWS Data Pipeline documentation for date and time functions, this is the proper syntax for using the format function.
When I save the pipeline with a hard-coded date and time, I don't get this error, and the file appears in my S3 bucket and folder as expected.
My thinking is that I need to define "myDateTime" somewhere or use a NOW() function.
Can somebody tell me how to set "myDateTime" to the current time (e.g. NOW), or give a workaround so I can format the current time for use in my FilePath?
I am not aware of an exact equivalent of NOW() in Data Pipeline. I tried using makeDate with no arguments (just for fun) to see if that worked. It did not.
The closest are the runtime variables @scheduledStartTime, @actualStartTime, and @reportProgressTime.
http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-s3datanode.html
The following, for example, should work:
s3://mybucketname/out/myfile-#{format(@scheduledStartTime,'YYYY-MM-dd-HH')}.txt
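Note that the pattern 'YYYY-MM-dd-HH' yields a hyphenated stamp such as 2014-08-13-12, while the question asks for myfile-20140813-12.txt. Assuming the format function accepts Joda-Time style patterns, 'YYYYMMdd-HH' would be the closer match. The Python sketch below only illustrates the file name such an expression would resolve to for a given scheduled start time. The helper name s3_file_path is hypothetical, and strftime stands in for Data Pipeline's own evaluator.

```python
from datetime import datetime


def s3_file_path(scheduled_start: datetime) -> str:
    """Build the S3 path one scheduled run would write to.

    Local stand-in for #{format(@scheduledStartTime,'YYYYMMdd-HH')};
    this is an illustration, not Data Pipeline's expression evaluator.
    """
    stamp = scheduled_start.strftime("%Y%m%d-%H")  # e.g. 20140813-12
    return f"s3://mybucketname/out/myfile-{stamp}.txt"


# A run scheduled for 2014-08-13 at 12:00.
print(s3_file_path(datetime(2014, 8, 13, 12, 0)))
# prints s3://mybucketname/out/myfile-20140813-12.txt
```

Because the stamp comes from the scheduled start time rather than the wall clock, a rerun of the same scheduled instance would write to the same file name.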
Just for fun, here is some more info on Parameters.
At the end of your pipeline JSON (click List Pipelines, select one, click Edit Pipeline, then click Export), you need to add a Parameters and/or Values object.
I use a myStartDate parameter for backfill processes; once it is passed in for ad hoc runs, you can manipulate it. You can give it a static default, but you can't set it to a dynamic value, so it is of limited use for regularly scheduled tasks. For real-time/scheduled dates, you need to use @scheduledStartTime, etc., as suggested. Here is a sample of setting up some Parameters and/or Values. Both show up under Parameters in the UI. These values can be used throughout your pipeline activities (shell, hive, etc.) with the #{myVariableToUse} notation.
"parameters": [
  {
    "helpText": "Put help text here",
    "watermark": "This shows if no default or value set",
    "description": "Label/Desc",
    "id": "myVariableToUse",
    "type": "string"
  }
]
And for Values:
"values": {
  "myS3OutLocation": "s3://some-bucket/path",
  "myThreshold": "30000"
}
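As a rough illustration of how the two objects fit together, the Python sketch below builds a parameters/values fragment, serializes it with json.dumps (which cannot emit a trailing comma, a common cause of invalid hand-edited JSON), and mimics the #{name} lookup. The resolve helper and the myS3OutLocation parameter entry are assumptions made for this example; Data Pipeline performs the real substitution itself, and its rules may differ.

```python
import json
import re

# Hypothetical fragment modelled on the samples above.
definition = {
    "parameters": [
        {
            "id": "myS3OutLocation",
            "type": "string",
            "description": "Output location",
        }
    ],
    "values": {
        "myS3OutLocation": "s3://some-bucket/path",
        "myThreshold": "30000",
    },
}


def resolve(expression: str, values: dict) -> str:
    """Replace each #{name} found in values; leave anything else untouched."""

    def lookup(match: re.Match) -> str:
        return values.get(match.group(1), match.group(0))

    return re.sub(r"#\{(\w+)\}", lookup, expression)


# Round-trip through JSON to confirm the fragment is well-formed.
text = json.dumps(definition, indent=2)
assert json.loads(text) == definition

print(resolve("#{myS3OutLocation}/out/myfile.txt", definition["values"]))
# prints s3://some-bucket/path/out/myfile.txt
```

Expressions that call functions, such as #{format(@scheduledStartTime,'YYYY-MM-dd-HH')}, do not match the simple name pattern and pass through this sketch unchanged.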
You cannot add these directly in the UI (yet), but once they are there you can change and save the values.