Properly handling escaped characters in Boto3

Question

I have an S3 bucket streaming logs to a Lambda function that tags files based on some logic.

I have worked around this issue in the past, and I understand that some characters need special handling. What I'm wondering is whether there is a safe way to handle this through some API, or whether it is something I need to handle on my own.

For example, I have a Lambda function like so:

import boto3

def lambda_handler(event, context):
    s3 = boto3.client("s3")

    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        objectName = record["s3"]["object"]["key"]

        tags = []
        
        if "Pizza" in objectName:
            tags.append({"Key" : "Project", "Value" : "Great"})
        if "Hamburger" in objectName:
            tags.append({"Key" : "Project", "Value" : "Good"})
        if "Liver" in objectName:
            tags.append({"Key" : "Project", "Value" : "Yuck"})

        s3.put_object_tagging(
            Bucket=bucket,
            Key=objectName,
            Tagging={
                "TagSet" : tags
            }
        )

    
    return {
        'statusCode': 200,
    }

This code works great. I upload a file to S3 called Pizza-Is-Better-Than-Liver.txt, then the function runs and tags the file with both Great and Yuck (sorry for the strained example).

However, if I upload the file Pizza Is+AmazeBalls.txt, things go sideways:

Looking at the event in CloudWatch, the object key shows as: Pizza+Is%2BAmazeBalls.txt.

Obviously the space is escaped to a + and the + to a %2B. When I pass that key to put_object_tagging(), it fails with a NoSuchKey error.

My question: is there a defined way to deal with escaped characters in boto3 or some other SDK, or do I just need to do it myself? I really don't want to add any modules to the function, and I could just do a contains / replace(), but it's odd that I would get something back that I can't immediately use without some transformation.

I'm not uploading the files and can't mandate what they call things (i-have-tried-but-it-fails). If it's a valid Windows or Mac filename it should work (I get that this is a whole other issue, but I can deal with that).

Answer 1

Since there are no other answers, I guess I'll post my band-aid:

def format_path(path):
    path = path.replace("+", " ")
    path = path.replace("%21", "!")
    path = path.replace("%24", "$")
    path = path.replace("%26", "&")
    path = path.replace("%27", "'")
    path = path.replace("%28", "(")
    path = path.replace("%29", ")")
    path = path.replace("%2B", "+")
    path = path.replace("%40", "@")
    path = path.replace("%3A", ":")
    path = path.replace("%3B", ";")
    path = path.replace("%2C", ",")
    path = path.replace("%3D", "=")
    path = path.replace("%3F", "?")
    return path

I'm sure there is a simpler, more complete way to do this, but this seems to work... for now.
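One candidate for that simpler way: object keys in S3 event notifications are URL-encoded, and Python's standard library can undo that with urllib.parse.unquote_plus, so no extra module needs to be packaged with the function. The sketch below is a minimal illustration rather than a definitive fix. The helper names decode_s3_key and iter_decoded_objects, the bucket name, and the sample event are hypothetical, and the event is trimmed to the fields the question's handler reads.

```python
from urllib.parse import unquote_plus


def decode_s3_key(raw_key):
    """Decode an object key as it appears in an S3 event notification.

    Hypothetical helper name; unquote_plus itself is standard library.
    """
    # unquote_plus turns "+" into a space and then resolves every %XX escape,
    # so it is not limited to a hand-maintained list of characters.
    return unquote_plus(raw_key, encoding="utf-8")


def iter_decoded_objects(event):
    """Yield (bucket, key) pairs with decoded keys from an S3 event."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        key = decode_s3_key(record["s3"]["object"]["key"])
        yield bucket, key


# Hypothetical, trimmed-down event using the key from the question.
sample_event = {
    "Records": [
        {
            "s3": {
                "bucket": {"name": "example-bucket"},
                "object": {"key": "Pizza+Is%2BAmazeBalls.txt"},
            }
        }
    ]
}

decoded = list(iter_decoded_objects(sample_event))
print(decoded)
```

In the handler from the question, the decoded key would be the value passed as Key to put_object_tagging(). Compared with the chain of replace() calls, this also covers escapes such as %25 that the list above does not mention.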
