
Properly handling escape characters in Boto3

I have an S3 bucket streaming logs to a Lambda function that tags files based on some logic.

While I have worked around this issue in the past, and I understand that some characters need special handling, I'm wondering whether there is a safe way to handle this through some API or whether it is something I need to handle on my own.

For example, I have a Lambda function like so:

import boto3

def lambda_handler(event, context):
    s3 = boto3.client("s3")

    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        objectName = record["s3"]["object"]["key"]

        tags = []
        
        if "Pizza" in objectName:
            tags.append({"Key" : "Project", "Value" : "Great"})
        if "Hamburger" in objectName:
            tags.append({"Key" : "Project", "Value" : "Good"})
        if "Liver" in objectName:
            tags.append({"Key" : "Project", "Value" : "Yuck"})

        s3.put_object_tagging(
            Bucket=bucket,
            Key=objectName,
            Tagging={
                "TagSet" : tags
            }
        )

    
    return {
        'statusCode': 200,
    }

This code works great. I upload a file to S3 called Pizza-Is-Better-Than-Liver.txt, then the function runs and tags the file with both Great and Yuck (sorry for the strained example).

However, if I upload the file Pizza Is+AmazeBalls.txt, things go sideways:

Looking at the event in CloudWatch, the object key shows as: Pizza+Is%2BAmazeBalls.txt

Obviously the space is escaped to a + and the + to a %2B. When I pass that key to put_object_tagging(), it fails with a NoSuchKey error.

My question: is there a defined way to deal with escaped characters in boto3 or some other SDK, or do I just need to do it myself? I really don't want to add any modules to the function, and I could just do a contains / replace(), but it's odd that I would get something back that I can't immediately use without some transformation.

I'm not uploading the files and can't mandate what they call things (i-have-tried-but-it-fails). If it's a valid Windows or Mac filename it should work (I get that this is a whole other issue, but I can deal with that).

Since there are no other answers, I guess I'll post my band-aid:

def format_path(path):
    # Band-aid: manually undo the URL encoding seen in the event's object key.
    # "+" (an encoded space) must be handled first, before "%2B" is turned
    # back into a literal "+"; otherwise real plus signs would become spaces.
    path = path.replace("+", " ")
    path = path.replace("%21", "!")
    path = path.replace("%24", "$")
    path = path.replace("%26", "&")
    path = path.replace("%27", "'")
    path = path.replace("%28", "(")
    path = path.replace("%29", ")")
    path = path.replace("%2B", "+")
    path = path.replace("%40", "@")
    path = path.replace("%3A", ":")
    path = path.replace("%3B", ";")
    path = path.replace("%2C", ",")
    path = path.replace("%3D", "=")
    path = path.replace("%3F", "?")
    return path

I'm sure there is a simpler, more complete way to do this, but this seems to work... for now.
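One possibly simpler route, sketched here as a suggestion and not as the original poster's method: the key in the event looks like ordinary URL form encoding (space as +, other characters as %XX), and Python's standard library can reverse that with urllib.parse.unquote_plus, so no extra module has to be packaged with the function. The helper names decode_s3_key and iter_objects are hypothetical, made up for this illustration.

```python
from urllib.parse import unquote_plus


def decode_s3_key(raw_key):
    """Decode an object key as it appears in an S3 event notification.

    Hypothetical helper for illustration; it assumes the key is
    URL-encoded with spaces represented as "+".
    """
    # unquote_plus converts "+" to a space first and then resolves the
    # %XX escapes, so an encoded "%2B" comes back as a literal "+".
    return unquote_plus(raw_key, encoding="utf-8")


def iter_objects(event):
    """Yield (bucket, decoded_key) pairs from an S3 event payload."""
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        yield bucket, decode_s3_key(record["s3"]["object"]["key"])
```

In the handler above, the loop could then read `for bucket, objectName in iter_objects(event):` and pass objectName straight to put_object_tagging(). Unlike the hand-written replace() chain, this covers every percent-escape, including %25 and non-ASCII characters encoded as UTF-8.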
