
Doing a remote grep/count on a file stored on Amazon S3

We have a cloud-based application which has been storing user projects on the local disk of our EC2 server. I am in the process of moving our project storage to S3, but I have run into a tough challenge. When a project is modified, we sometimes need to run some analysis of the XML files stored in the project. Previously we would do this with a grep and a count looking for certain XML tags, something like this:

grep -o "<tag" "$path" | wc -l

Now that the files are stored on S3, I am at a loss for how to do similar analysis (without downloading the whole project, which would largely defeat the purpose of switching to S3). Is there any way to do this?

Unfortunately, S3 doesn't provide that functionality. You have to download the file(s) before grep can be applied (even if you use third-party tools like s3cmd, they download the files behind the scenes).
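
You can at least avoid writing the download to disk by streaming the object straight through grep. A minimal sketch with the AWS CLI, using a hypothetical bucket and key (passing - as the destination of aws s3 cp streams the object to stdout):

aws s3 cp s3://my-bucket/projects/project1/data.xml - | grep -o "<tag" | wc -l

The object is still transferred in full; the stream just never touches the local filesystem.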

If there aren't too many patterns, you can grep the files before you upload them and keep the results on the local machine. That way you don't have to hit S3 every time. Yes, you may end up with stale data, but the alternative is expensive.
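
One way to keep those precomputed results attached to the data itself is to store the count as user-defined object metadata at upload time, then read it back later with a HEAD request instead of re-downloading the object. A sketch, again with hypothetical bucket/key names and a tag-count metadata key of my own choosing:

# At upload time: count occurrences locally, attach the result as metadata
count=$(grep -o "<tag" project1/data.xml | wc -l)
aws s3 cp project1/data.xml s3://my-bucket/projects/project1/data.xml --metadata tag-count=$count

# Later: read the stored count without downloading the object
aws s3api head-object --bucket my-bucket --key projects/project1/data.xml --query 'Metadata."tag-count"'

Note that S3 object metadata can only be set when the object is written, so the count has to be refreshed on every upload of a modified file.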
