Properly handling Escape Characters in Boto3
I have an S3 bucket streaming logs to a Lambda function that tags files based on some logic.
While I have worked around this issue in the past, and I understand there are some characters that need to be handled, I'm wondering whether there is a safe way to handle this with some API or whether it is something I need to handle on my own.
For example, I have a Lambda function like so:
import boto3

def lambda_handler(event, context):
    s3 = boto3.client("s3")
    for record in event["Records"]:
        bucket = record["s3"]["bucket"]["name"]
        objectName = record["s3"]["object"]["key"]
        tags = []
        if "Pizza" in objectName:
            tags.append({"Key" : "Project", "Value" : "Great"})
        if "Hamburger" in objectName:
            tags.append({"Key" : "Project", "Value" : "Good"})
        if "Liver" in objectName:
            tags.append({"Key" : "Project", "Value" : "Yuck"})
        s3.put_object_tagging(
            Bucket=bucket,
            Key=objectName,
            Tagging={
                "TagSet" : tags
            }
        )
    return {
        'statusCode': 200,
    }
This code works great. I upload a file to S3 called Pizza-Is-Better-Than-Liver.txt, then the function runs and tags the file with both Great and Yuck (sorry for the strained example).
However, if I upload the file Pizza Is+AmazeBalls.txt, things go sideways:
Looking at the event in CloudWatch, the object key shows as: Pizza+Is%2BAmazeBalls.txt.
Obviously the space is escaped to a + and the + to a %2B. When I pass that key to put_object_tagging(), it fails with a NoSuchKey error.
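The escaping seen in the event looks like ordinary form-style URL encoding. As a minimal sketch, assuming the key in the event is encoded the same way Python's standard-library urllib.parse.quote_plus encodes a string, the observed value can be reproduced and reversed like this (the variable names are illustrative only):

```python
from urllib.parse import quote_plus, unquote_plus

# The name the file was uploaded with.
uploaded_name = "Pizza Is+AmazeBalls.txt"

# Form-style URL encoding: space becomes "+", a literal "+" becomes "%2B".
encoded = quote_plus(uploaded_name)

# Reversing it gives back the name S3 actually stores the object under.
decoded = unquote_plus(encoded)

print(encoded)
print(decoded)
```

If that assumption holds, NoSuchKey is expected: the encoded string is simply not the name of any object in the bucket.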
My question: is there a defined way to deal with escaped characters in boto3 or some other SDK, or do I just need to do it myself? I really don't want to add any modules to the function, and I could just do a contains / replace(), but it's odd that I would get something back that I can't immediately use without some transformation.
I'm not uploading the files and can't mandate what they call things (i-have-tried-but-it-fails); if it's a valid Windows or Mac filename it should work (I get that that is a whole other issue, but I can deal with that).
Since there are no other answers, I guess I'll post my band-aid:
def format_path(path):
    path = path.replace("+", " ")
    path = path.replace("%21", "!")
    path = path.replace("%24", "$")
    path = path.replace("%26", "&")
    path = path.replace("%27", "'")
    path = path.replace("%28", "(")
    path = path.replace("%29", ")")
    path = path.replace("%2B", "+")
    path = path.replace("%40", "@")
    path = path.replace("%3A", ":")
    path = path.replace("%3B", ";")
    path = path.replace("%2C", ",")
    path = path.replace("%3D", "=")
    path = path.replace("%3F", "?")
    return path
I'm sure there is a simpler, more complete way to do this but this seems to work... for now.
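One candidate for that simpler way is urllib.parse.unquote_plus from the Python standard library, so no extra module has to be packaged with the function. The sketch below assumes the key in the S3 event is URL-encoded as shown in the question; decode_object_key and sample_record are hypothetical names made up for illustration, and the record is trimmed down to the fields used here.

```python
from urllib.parse import unquote_plus


def decode_object_key(record):
    """Return the object key from one S3 event record, URL-decoded."""
    raw_key = record["s3"]["object"]["key"]
    # unquote_plus turns "+" into a space and then resolves every %XX
    # escape, so it also covers characters the hand-written table misses.
    return unquote_plus(raw_key, encoding="utf-8")


# Hypothetical, trimmed-down event record for illustration only.
sample_record = {
    "s3": {
        "bucket": {"name": "example-bucket"},
        "object": {"key": "Pizza+Is%2BAmazeBalls.txt"},
    }
}

decoded_key = decode_object_key(sample_record)
print(decoded_key)
```

In the handler, this would mean replacing objectName = record["s3"]["object"]["key"] with objectName = decode_object_key(record) before calling put_object_tagging(). Unlike a chain of replace() calls, it does not depend on the order in which the substitutions run.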