AWS Data Pipeline datetime variable

I am using AWS Data Pipeline to save a text file from RDS to my S3 bucket. I would like the file name to include the date and the hour, like:

myfile-YYYYMMDD-HH.txt
myfile-20140813-12.txt

I have specified my S3DataNode FilePath as:

s3://mybucketname/out/myfile-#{format(myDateTime,'YYYY-MM-dd-HH')}.txt

When I try to save my pipeline I get the following error:

ERROR: Unable to resolve myDateTime for object:DataNodeId_xOQxz

According to the AWS Data Pipeline documentation for date and time functions, this is the proper syntax for using the format function.

When I save the pipeline with a hard-coded date and time, I don't get this error, and my file appears in my S3 bucket and folder as expected.

My thinking is that I need to define "myDateTime" somewhere or use something like NOW().

Can somebody tell me how to set "myDateTime" to the current time (e.g. NOW), or give a workaround so I can format the current time for use in my FilePath?

I am not aware of an exact equivalent of NOW() in Data Pipeline. I tried using makeDate with no arguments (just for fun) to see if that worked. It did not.

The closest are the runtime variables @scheduledStartTime, @actualStartTime, and @reportProgressTime.

http://docs.aws.amazon.com/datapipeline/latest/DeveloperGuide/dp-object-s3datanode.html

The following, for example, should work:

s3://mybucketname/out/myfile-#{format(@scheduledStartTime,'YYYY-MM-dd-HH')}.txt
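To check what file name such an expression should yield, here is a rough Python sketch. It does not call Data Pipeline at all: `expected_file_path` is a hypothetical helper, and the `strftime` patterns are Python stand-ins that I am assuming correspond to the Joda-style patterns passed to format() (`'YYYY-MM-dd-HH'` to `'%Y-%m-%d-%H'`).

```python
from datetime import datetime


def expected_file_path(scheduled_start: datetime, pattern: str = "%Y-%m-%d-%H") -> str:
    """Hypothetical helper: build the S3 path the pipeline expression should produce.

    `pattern` is a Python strftime pattern standing in for the Joda-style
    pattern used inside #{format(@scheduledStartTime, ...)}.
    """
    return f"s3://mybucketname/out/myfile-{scheduled_start.strftime(pattern)}.txt"


# Example scheduled start time taken from the file name in the question.
run_time = datetime(2014, 8, 13, 12, 0)

print(expected_file_path(run_time))
# s3://mybucketname/out/myfile-2014-08-13-12.txt

print(expected_file_path(run_time, "%Y%m%d-%H"))
# s3://mybucketname/out/myfile-20140813-12.txt
```

Note that the second call matches the myfile-YYYYMMDD-HH.txt layout from the question; the corresponding pipeline pattern would presumably be 'YYYYMMdd-HH' rather than 'YYYY-MM-dd-HH'.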

Just for fun, here is some more info on Parameters.

At the end of your pipeline JSON (click List Pipelines, select one, click Edit Pipeline, then click Export), you need to add a Parameters and/or Values object.

I use a myStartDate parameter for backfill processes, which you can manipulate once it is passed in for ad hoc runs. You can give this a static default, but you can't set it to a dynamic value, so it is of limited use for regularly scheduled tasks. For real-time/scheduled dates, you need to use @scheduledStartTime, etc., as suggested.

Here is a sample of setting up some Parameters and/or Values. Both show up under Parameters in the UI. These values can be used throughout your pipeline activities (shell, hive, etc.) with the #{myVariableToUse} notation.

"parameters": [
{
  "helpText": "Put help text here",
  "watermark": "This shows if no default or value set",
  "description": "Label/Desc",
  "id": "myVariableToUse",
  "type": "string"
}
]

And for Values:

"values": {
  "myS3OutLocation": "s3://some-bucket/path",
  "myThreshold": "30000"
}

You cannot add these directly in the UI (yet), but once they are there you can change and save the values.
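Since these have to be added to the exported JSON by hand, a small script can do the merge. This is only a sketch: `add_parameters` is a hypothetical helper, and it assumes the exported definition is a JSON object with top-level "objects", "parameters", and "values" keys. The `exported` dict below is a stand-in for the real export.

```python
import json


def add_parameters(definition: dict, parameters: list, values: dict) -> dict:
    """Return a copy of the definition with extra parameters and values merged in."""
    merged = dict(definition)
    # Index existing parameters by id so a new entry with the same id replaces it.
    by_id = {p["id"]: p for p in definition.get("parameters", [])}
    for param in parameters:
        by_id[param["id"]] = param
    merged["parameters"] = list(by_id.values())
    # New values override existing ones with the same key.
    merged["values"] = {**definition.get("values", {}), **values}
    return merged


# Stand-in for the exported JSON; a real export would list pipeline objects here.
exported = {"objects": []}

updated = add_parameters(
    exported,
    parameters=[
        {
            "id": "myVariableToUse",
            "type": "string",
            "description": "Label/Desc",
            "helpText": "Put help text here",
        }
    ],
    values={"myS3OutLocation": "s3://some-bucket/path", "myThreshold": "30000"},
)

print(json.dumps(updated, indent=2))
```

In practice you would load the exported file with json.load, call the helper, write the result back out, and then import it into the pipeline again.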


 