
Is there a way to add the S3 bucket name to the recursive list of a bucket using aws s3 ls cli?

I prefer using the aws cli for listing S3 contents; it is handy for generating a file of object details that I can sort, grep, and otherwise manipulate later.

Unfortunately, by default it doesn't put the S3 bucket name in the object name. For example, if I want to list a bucket called example, I run this and get:

% aws s3 ls s3://example
2021-12-23 15:31:17     8572 object_name
2021-12-22 08:45:23       11 another_object_name

Is there a way to get the aws cli to put the bucket name on each line? Then I can grep across a file or files covering multiple buckets and see which bucket each object is in.

Like this:

% aws s3 ls s3://example
2021-12-23 15:31:17     8572 s3://example/object_name
2021-12-22 08:45:23       11 s3://example/another_object_name

I don't see an option in the AWS cli docs to do it, but perhaps someone knows an undocumented flag or something.

The AWS CLI offers two sub-commands to interact with S3. There is the high-level s3 sub-command you're using. This command allows very straightforward access to the most common actions on S3 buckets, but is limited in its functionality and doesn't expose all features of the underlying API.

The other sub-command is s3api, which offers direct access to the S3 API. With s3api you're quite flexible regarding the formatting of the output, as you can apply a JMESPath expression before returning it.

Here is an example which comes close to your desired output. It's not a perfect representation (note the difference in the date format and the alignment of the object sizes), but should be close enough:

$ BUCKET_NAME=example
$ aws s3api list-objects-v2 --bucket "$BUCKET_NAME" \
    --query 'Contents[].[LastModified, Size, join(`/`, [`s3://'$BUCKET_NAME'`, Key])]' \
    --output text
2021-12-23T15:31:17.000Z        8572 s3://example/object_name
2021-12-22T08:45:23.000Z        11   s3://example/another_object_name
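
If the remaining differences in date format and alignment matter, the tab-separated text output can be post-processed. The following is only a sketch: format_like_s3_ls is a hypothetical helper name, and the column widths are an approximation of the aws s3 ls layout, not something taken from the AWS documentation. Note also that the timestamps stay in UTC as returned by the API, so they may not match what aws s3 ls prints on a machine in another time zone.

```shell
# Hypothetical helper: reads lines of the form
#   LastModified<TAB>Size<TAB>URI
# (the --output text result of the s3api query above) and prints them
# in a layout resembling `aws s3 ls`.
format_like_s3_ls() {
    awk -F '\t' '{
        # "2021-12-23T15:31:17.000Z" -> "2021-12-23 15:31:17"
        stamp = substr($1, 1, 10) " " substr($1, 12, 8)
        # Rejoin any further fields in case the key itself contains tabs.
        uri = $3
        for (i = 4; i <= NF; i++) uri = uri "\t" $i
        # Right-align the size in 10 columns (assumed width).
        printf "%s %10s %s\n", stamp, $2, uri
    }'
}

# Usage (not run here; requires AWS credentials):
#   aws s3api list-objects-v2 --bucket "$BUCKET_NAME" \
#       --query '...same query as above...' --output text | format_like_s3_ls
```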

I decided it would be easiest to do this in the shell.

That means I used this:

aws s3 ls | sed s/....................//|sed 's#.*#BN=&; aws s3 ls s3://& --recursive | sed "s,  , ,g;s,  , ,g;s,  , ,g;s,  , ,g"|sed "s, , s3://$BN/,3"#' | sh -x > filename
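
The same idea can be spelled out as a small script that is easier to read and does not collapse runs of spaces inside object keys. This is a sketch and not the exact pipeline above: prefix_bucket and list_all_buckets are hypothetical names, and it assumes the usual "date time size key" layout of aws s3 ls --recursive output.

```shell
# Hypothetical helper: reads `aws s3 ls s3://BUCKET --recursive` output on
# stdin and inserts "s3://BUCKET/" in front of each object key.
# $1 is the bucket name. Lines that do not match "date time size key"
# are passed through unchanged.
prefix_bucket() {
    sed -E "s|^([0-9-]+ +[0-9:]+ +[0-9]+) (.*)\$|\1 s3://$1/\2|"
}

# Hypothetical wrapper: lists every bucket visible to the current
# credentials and prints all objects with their bucket prefix.
list_all_buckets() {
    # `aws s3 ls` prints "date time bucket-name"; keep the third column.
    aws s3 ls | awk '{print $3}' | while read -r bucket; do
        aws s3 ls "s3://$bucket" --recursive < /dev/null | prefix_bucket "$bucket"
    done
}

# Usage (not run here; requires AWS credentials):
#   list_all_buckets > filename
```

Because the bucket name is substituted by the shell into the sed expression, this relies on S3 bucket names being limited to lowercase letters, digits, dots, and hyphens.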

