List all objects in S3 with a given prefix in Scala
I am trying to list all objects in an AWS S3 bucket, given a bucket name and a filter prefix, using the following code.
import scala.collection.JavaConverters._
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.ListObjectsV2Request
val bucket_name = "Mybucket"
val fiter_prefix = "Test/a/"
def list_objects(str: String): mutable.Buffer[String] = {
  val request : ListObjectsV2Request = new ListObjectsV2Request().withBucketName(bucket_name).withPrefix(str)
  var result: ListObjectsV2Result = new ListObjectsV2Result()
  do {
    result = s3_client.listObjectsV2(request)
    val token = result.getNextContinuationToken
    System.out.println("Next Continuation Token: " + token)
    request.setContinuationToken(token)
  }while(result.isTruncated)
  result.getObjectSummaries.asScala.map(_.getKey).size
}
list_objects(fiter_prefix)
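I have applied the continuation-token approach, but I only get the last list of objects back. For example, the prefix contains 2210 objects and I only get 210 of them.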
Regards, Mahi
Here is the code that worked for me.
import scala.collection.JavaConverters._
import com.amazonaws.services.s3.AmazonS3Client
import com.amazonaws.services.s3.model.{ListObjectsV2Request, ListObjectsV2Result}
val bucket_name = "Mybucket"
val fiter_prefix = "Test/a/"
def list_objects(str: String): List[String] = {
  val s3_client = new AmazonS3Client
  var final_list: List[String] = List()
  var list: List[String] = List()
  val request: ListObjectsV2Request = new ListObjectsV2Request().withBucketName(bucket_name).withPrefix(str)
  var result: ListObjectsV2Result = new ListObjectsV2Result()
  do {
    result = s3_client.listObjectsV2(request)
    val token = result.getNextContinuationToken
    System.out.println("Next Continuation Token: " + token)
    // Point the request at the next page for the following iteration.
    request.setContinuationToken(token)
    // Collect the keys of the current page before the next request replaces `result`.
    list = (result.getObjectSummaries.asScala.map(_.getKey)).toList
    println(list.size)
    final_list = final_list ::: list
    println(final_list)
  } while (result.isTruncated)
  println(s"size: ${final_list.size}")
  final_list
}
list_objects(fiter_prefix)
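The difference between the question's code and this version comes down to where the keys are read: after the loop (only the final page is still held in `result`) or inside it (every page is appended). The sketch below reproduces both behaviours without touching AWS. `Page`, `fetchPage`, the key names and the page size of 1000 are all hypothetical stand-ins for the listing call, chosen so that the numbers match the 2210 / 210 example above.

```scala
object PaginationDemo {
  // Hypothetical stand-in for one listing response: a page of keys plus an optional token.
  final case class Page(keys: List[String], nextToken: Option[String])

  // Simulated listing of 2210 keys, served in pages of at most 1000 (assumed page size).
  val allKeys: List[String] = (1 to 2210).map(i => s"Test/a/file-$i").toList
  val pageSize = 1000

  // The token is simply the offset of the next page, encoded as a string.
  def fetchPage(token: Option[String]): Page = {
    val start = token.map(_.toInt).getOrElse(0)
    val end = math.min(start + pageSize, allKeys.size)
    Page(allKeys.slice(start, end), if (end < allKeys.size) Some(end.toString) else None)
  }

  // Mirrors the question: pages are fetched in a loop, but keys are read only after it ends.
  def lastPageOnly(): List[String] = {
    var page = fetchPage(None)
    while (page.nextToken.isDefined) page = fetchPage(page.nextToken)
    page.keys
  }

  // Mirrors the answer: every page is appended to an accumulator before the next request.
  def allPages(): List[String] = {
    var page = fetchPage(None)
    val acc = List.newBuilder[String]
    acc ++= page.keys
    while (page.nextToken.isDefined) {
      page = fetchPage(page.nextToken)
      acc ++= page.keys
    }
    acc.result()
  }

  def main(args: Array[String]): Unit = {
    println(lastPageOnly().size) // 210
    println(allPages().size)     // 2210
  }
}
```

With three simulated pages of 1000, 1000 and 210 keys, reading only the last response yields 210 keys, which is the symptom described in the question.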
A solution in vanilla Scala that avoids vars by using tail recursion:
import software.amazon.awssdk.regions.Region
import software.amazon.awssdk.services.s3.S3Client
import software.amazon.awssdk.services.s3.model.{ListObjectsV2Request, ListObjectsV2Response}
import scala.annotation.tailrec
import scala.collection.JavaConverters.asScalaBufferConverter
import scala.collection.mutable
import scala.collection.mutable.ListBuffer
val sourceBucket = "yourbucket"
val sourceKey = "yourKey"
val subFolderPrefix = "yourprefix"
def getAllPaths(s3Client: S3Client, initReq: ListObjectsV2Request): List[String] = {
  @tailrec
  def listAllObjectsV2(
      s3Client: S3Client,
      req: ListObjectsV2Request,
      tokenOpt: Option[String],
      isFirstTime: Boolean,
      initList: ListBuffer[String]
  ): ListBuffer[String] = {
    println(s"IsFirstTime = ${isFirstTime}, continuationToken = ${tokenOpt}")
    (isFirstTime, tokenOpt) match {
      case (true, Some(x)) =>
        // this combo is not possible..
        initList
      case (false, None) =>
        // end
        initList
      case (_, _) =>
        // possible scenarios are :
        // true, None : First iteration
        // false, Some(x): Second iteration onwards
        val response =
          s3Client.listObjectsV2(tokenOpt.fold(req)(token => req.toBuilder.continuationToken(token).build()))
        val keys: Seq[String] = response.contents().asScala.toList.map(_.key())
        val nextTokenOpt = Option(response.nextContinuationToken())
        listAllObjectsV2(s3Client, req, nextTokenOpt, isFirstTime = false, keys ++: initList)
    }
  }
  listAllObjectsV2(s3Client, initReq, None, true, mutable.ListBuffer.empty[String]).toList
}
val s3Client = S3Client.builder().region(Region.US_WEST_2).build()
val request: ListObjectsV2Request =
  ListObjectsV2Request.builder
    .bucket(sourceBucket)
    .prefix(sourceKey + "/" + subFolderPrefix)
    .build
val listofAllKeys: List[String] = getAllPaths(s3Client, request)
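The `isFirstTime` flag above exists only to tell "no token yet" apart from "no more pages". One possible simplification, sketched here under assumptions rather than taken from the answer, is to fetch a page first and decide afterwards whether to recurse. The `Page` type and the `fetch` parameter are hypothetical; with the SDK, `fetch` would wrap the `s3Client.listObjectsV2(...)` call and build a `Page` from `response.contents()` and `Option(response.nextContinuationToken())`. This sketch also appends each page, so keys stay in listing order.

```scala
import scala.annotation.tailrec

object TailRecPaging {
  // Hypothetical page type: the keys of one response plus the token for the next page, if any.
  final case class Page(keys: List[String], nextToken: Option[String])

  // Follows continuation tokens until a page arrives without one, accumulating keys in order.
  def collectAll(fetch: Option[String] => Page): List[String] = {
    @tailrec
    def loop(token: Option[String], acc: Vector[String]): Vector[String] = {
      val page = fetch(token)
      val next = acc ++ page.keys
      page.nextToken match {
        case None => next             // last page: nothing left to request
        case some => loop(some, next) // more pages: recurse with the new token
      }
    }
    loop(None, Vector.empty).toList
  }
}
```

Because the first call passes `None` and every later call passes the token from the previous page, no separate first-iteration flag is needed.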