
How to crawl an S3 bucket for new files

How can I crawl an S3 bucket to check whether any files or objects have been added or deleted?

Now you can get a message when there is a change. This was just announced (long overdue):
http://aws.amazon.com/blogs/aws/s3-event-notification/

It is very simple to implement, so it is time to throw out all the ugly cron jobs and list-loops.
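As a rough illustration of what a consumer of those messages might do: each S3 event record carries an event name such as "ObjectCreated:Put" or "ObjectRemoved:Delete". The sketch below only classifies that string. The class name `S3EventNames` and the `Change` enum are made up for this example, and it assumes the event name has already been extracted from the notification JSON by whatever parser you use.

```java
/** Hypothetical helper: maps an S3 event name to the kind of change it reports. */
final class S3EventNames {
    enum Change { ADDED, REMOVED, OTHER }

    // Assumes eventName was already read from the notification record,
    // e.g. "ObjectCreated:Put" or "ObjectRemoved:Delete".
    static Change classify(String eventName) {
        if (eventName == null) {
            return Change.OTHER;
        }
        if (eventName.startsWith("ObjectCreated:")) {
            return Change.ADDED;
        }
        if (eventName.startsWith("ObjectRemoved:")) {
            return Change.REMOVED;
        }
        return Change.OTHER; // any event type this sketch does not handle
    }

    public static void main(String[] args) {
        System.out.println(classify("ObjectCreated:Put"));
        System.out.println(classify("ObjectRemoved:Delete"));
    }
}
```

Which event types a bucket actually emits depends on how its notification configuration is set up, so treat the two prefixes here as the minimum to check rather than a complete list.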

Using the AWS SDK for Java, you can connect to an S3 bucket from Java and crawl it.

Crawling is nothing more than listing the objects in the bucket and comparing that list with a previous one to identify the new objects.

An example can be found at http://aws.amazon.com/sdk-for-java/
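The compare step could look something like the sketch below. It is not taken from the SDK examples: the class name `BucketSnapshotDiff` is invented here, and it assumes you have already turned each bucket listing into a map of object key to ETag (for instance from the SDK's object-listing call), so it runs with plain Java collections and no AWS dependency.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper: compares two listings of a bucket (object key -> ETag). */
final class BucketSnapshotDiff {
    final Set<String> added = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> changed = new TreeSet<>();

    static BucketSnapshotDiff between(Map<String, String> previous,
                                      Map<String, String> current) {
        BucketSnapshotDiff diff = new BucketSnapshotDiff();
        // Keys only in the current listing are new; same key with a
        // different ETag means the object was overwritten.
        for (Map.Entry<String, String> entry : current.entrySet()) {
            String oldTag = previous.get(entry.getKey());
            if (oldTag == null) {
                diff.added.add(entry.getKey());
            } else if (!oldTag.equals(entry.getValue())) {
                diff.changed.add(entry.getKey());
            }
        }
        // Keys only in the previous listing have been deleted.
        for (String key : previous.keySet()) {
            if (!current.containsKey(key)) {
                diff.removed.add(key);
            }
        }
        return diff;
    }

    public static void main(String[] args) {
        // Made-up listings standing in for two crawls of the same bucket.
        Map<String, String> previous = Map.of("a.txt", "etag1", "b.txt", "etag2");
        Map<String, String> current = Map.of("b.txt", "etag9", "c.txt", "etag3");
        BucketSnapshotDiff diff = between(previous, current);
        System.out.println("added=" + diff.added
                + " removed=" + diff.removed
                + " changed=" + diff.changed);
    }
}
```

You would persist the current map after each crawl and pass it in as `previous` on the next run. For large buckets the listing is paginated, so the map has to be built from all pages before comparing.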

