How to parse the AWS S3 Path (s3://<bucket name>/<key>) using the AWSSDK.S3 in C# in order to get the bucket name & key

I have an S3 path of the form s3://[bucket name]/[key], for example:

s3://bn-complete-dev-test/1234567890/renders/Irradiance_A.png 

and I need to get the bucket name and the key separately:

var s3PathParsed = parseS3Path("s3://bn-complete-dev-test/1234567890/renders/Irradiance_A.png");

s3PathParsed.BucketName == "bn-complete-dev-test"
s3PathParsed.Key == "1234567890/renders/Irradiance_A.png"

How can I parse this correctly using the AWS SDK?

1) I am currently parsing it manually with a regular expression. It works, but I am not comfortable with this approach:

public class S3Path : IS3Path
{
    // [sS] matches "s" or "S"; the original [s|S] also matched a literal '|'.
    private const string _s3PathRegex = @"[sS]3:\/\/(?<bucket>[^\/]*)\/(?<key>.*)";

    public S3Path(string s3Path)
    {
        Path = s3Path;

        var rx = new Regex(_s3PathRegex).Match(s3Path);

        if (!rx.Success || rx.Groups.Count != 3)
            throw new Exception($"the S3 Path '{s3Path}' is wrong.");

        BucketName = rx.Groups[1].Value;
        Key = rx.Groups[2].Value;
    }

    public string Path { get; }

    public string BucketName { get; }

    public string Key { get; }
}

2) I tried AmazonS3Uri from AWSSDK.S3:

string GetBucketNameFromS3Uri(string s3Uri)
{
    return new AmazonS3Uri(s3Uri).Bucket;            
}

I called the method:

GetBucketNameFromS3Uri("s3://sunsite-complete-dev-test/1234567890/renders/Irradiance_A.png");

and I got the following error:

System.ArgumentException: 'Invalid S3 URI - hostname does not appear to be a valid S3 endpoint'

3) I also tried:

string GetBucketNameFromS3Uri(string s3Uri)
{
    return new AmazonS3Uri(new Uri(s3Uri)).Bucket;            
}

with the same error.

I created a new thread in AWS Forum with this issue: https://forums.aws.amazon.com/thread.jspa?threadID=304401

In Java, we can do something like:

AmazonS3URI s3URI = new AmazonS3URI("s3://bucket/folder/object.csv");
S3Object s3Object = s3Client.getObject(s3URI.getBucket(), s3URI.getKey());

If you have an object URL (https://bn-complete-dev-test.s3.eu-west-2.amazonaws.com/1234567890/renders/Irradiance_A.png), you can use AmazonS3Uri:

// using Amazon.S3.Util

var uri = new AmazonS3Uri(urlString); 

var bucketName = uri.Bucket;
var key = uri.Key;

If you have an S3 URI (s3://bn-complete-dev-test/1234567890/renders/Irradiance_A.png), it is a bit more involved:

using System;

public static class S3
{
    public static Tuple<string, string> TryParseS3Uri(string x)
    {
        try
        {
            var uri = new Uri(x);

            if (uri.Scheme == "s3")
            {
                var bucket = uri.Host;
                var key = uri.LocalPath.Substring(1);

                return new Tuple<string, string>(bucket, key);
            }

            return null;
        }
        catch (UriFormatException)
        {
            // Not a well-formed URI: report failure the same way as a non-s3 scheme.
            // Any other exception propagates unchanged, with its stack trace intact.
            return null;
        }
    }
}

Here's an F# version:

open System

let tryParseS3Uri (x : string) =
  try
    let uri = Uri x

    if uri.Scheme = "s3"
    then
      let bucket = uri.Host
      let key = uri.LocalPath.Substring 1

      Some (bucket, key)
    else
      None

  with
    // Exceptions that do not match are rethrown automatically.
    | :? UriFormatException -> None

Here is a Scala version that uses a regex (note that this example matches the s3a:// scheme):

val regex = "s3a://([^/]*)/(.*)".r
val regex(bucketName, key) = "s3a://my-bucket-name/myrootpath/mychildpath/file.json"

println(bucketName) // my-bucket-name
println(key)        // myrootpath/mychildpath/file.json

I believe that this regex will give you what you want:

s3:\/\/(?<bucket>[^\/]*)\/(?<key>.*)

The bucket name is the first part of the S3 path, and the key is everything after the first forward slash.
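
The same rule (bucket up to the first forward slash, key after it) can also be applied without a regex. The following is only a minimal sketch: S3PathSplitter and TrySplit are hypothetical names, not part of AWSSDK.S3, and it assumes that an empty bucket or an empty key should count as invalid.

```csharp
using System;

// Hypothetical helper, not part of AWSSDK.S3.
public static class S3PathSplitter
{
    // Splits "s3://bucket/key" at the first '/' after the scheme.
    // Returns false (with null outputs) when the input is not in that shape.
    public static bool TrySplit(string s3Path, out string bucket, out string key)
    {
        bucket = null;
        key = null;

        const string prefix = "s3://";
        if (s3Path == null || !s3Path.StartsWith(prefix, StringComparison.OrdinalIgnoreCase))
            return false;

        var rest = s3Path.Substring(prefix.Length);
        var slash = rest.IndexOf('/');

        // Assumption: require a non-empty bucket and a non-empty key.
        if (slash <= 0 || slash == rest.Length - 1)
            return false;

        bucket = rest.Substring(0, slash);
        key = rest.Substring(slash + 1);
        return true;
    }
}
```

Calling S3PathSplitter.TrySplit("s3://bn-complete-dev-test/1234567890/renders/Irradiance_A.png", out var bucket, out var key) returns true, with bucket set to "bn-complete-dev-test" and key set to "1234567890/renders/Irradiance_A.png". The key is taken verbatim, so characters such as '?' or '#' in it are not interpreted.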

AWSSDK.S3 does not have a path parser, so we need to parse manually. You could use the following class, which works fine:

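using System;
using System.Text.RegularExpressions;

public class S3Path
{
    // [sS] matches "s" or "S"; the original [s|S] also matched a literal '|'.
    private const string _s3PathRegex = @"[sS]3:\/\/(?<bucket>[^\/]+)\/(?<key>.+)";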

    public S3Path(string s3Path)
    {
        Path = s3Path;

        var rx = new Regex(_s3PathRegex).Match(s3Path);

        if (!rx.Success)
            throw new Exception($"the S3 Path '{s3Path}' is wrong.");

        BucketName = rx.Groups["bucket"].Value;
        Key = rx.Groups["key"].Value;
    }

    public string Path { get; }

    public string BucketName { get; }

    public string Key { get; }
}

I created a thread in AWS Forum to report the missing functionality.

For a JavaScript version, you can use the amazon-s3-uri package:

const AmazonS3URI = require('amazon-s3-uri')

// Declared outside the try block so it is still in scope inside catch.
const uri = 'https://bucket.s3-aws-region.amazonaws.com/key'

try {
  const { region, bucket, key } = AmazonS3URI(uri)
} catch (err) {
  console.warn(`${uri} is not a valid S3 uri`) // should not happen because `uri` is valid in that example
}

The AWSSDK.S3 NuGet library has a utility method:

if (!Amazon.S3.Util.AmazonS3Uri.TryParseAmazonS3Uri(s3Url, out AmazonS3Uri amazonS3Uri))
  throw new ArgumentOutOfRangeException();
var bucket = amazonS3Uri.Bucket;
var key = amazonS3Uri.Key;
var region = amazonS3Uri.Region;
