
Doing a remote grep/count on a file stored on Amazon S3

We have a cloud-based application which has been storing user projects on the local disk of our EC2 server. I am in the process of moving our project storage to S3, but I have run into a tough challenge. When a project is modified, we sometimes need to run some analysis of the XML files stored in the project. Previously we would do this with a grep and a count looking for certain XML tags, something like this:

grep -o "<tag" "$path" | wc -l

Now that the files are stored on S3, I am at a loss for how to do similar analysis (without downloading the whole project, which would largely defeat the purpose of switching to S3). Is there any way to do this?

Unfortunately, S3 doesn't provide that functionality. You have to download the file(s) before grep can be applied (even if you use third-party tools like s3cmd, they download the files behind the scenes).
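
You can at least avoid writing the download to disk by streaming the object straight through grep. A minimal sketch with the AWS CLI, using a hypothetical bucket and key (passing - as the destination of aws s3 cp streams the object to stdout):

aws s3 cp s3://my-bucket/projects/project1/data.xml - | grep -o "<tag" | wc -l

The object is still transferred in full; the stream just never touches the local filesystem.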

If there aren't too many patterns, you can grep the files before you upload them and keep the results on the local machine. That way you don't have to hit S3 every time. Yes, you may end up with stale data, but the alternative is expensive.
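
One way to keep those precomputed results attached to the data itself is to store the count as user-defined object metadata at upload time, then read it back later with a HEAD request instead of re-downloading the object. A sketch, again with hypothetical bucket/key names and a tag-count metadata key of my own choosing:

# At upload time: count occurrences locally, attach the result as metadata
count=$(grep -o "<tag" project1/data.xml | wc -l)
aws s3 cp project1/data.xml s3://my-bucket/projects/project1/data.xml --metadata tag-count=$count

# Later: read the stored count without downloading the object
aws s3api head-object --bucket my-bucket --key projects/project1/data.xml --query 'Metadata."tag-count"'

Note that S3 object metadata can only be set when the object is written, so the count has to be refreshed on every upload of a modified file.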
