TextSizeLimitExceededException when calling the DetectPiiEntities operation

2021-08-10 19:36:04 | Tags: amazon-web-services, nlp, entity, amazon-comprehend

I am using AWS Comprehend for PII redaction. The idea is to detect the entities and then redact the PII from the text.

The problem is that this API has an input text size limit. How can I increase the limit, perhaps to 1 MB? Or is there another way to detect entities in large text?

ERROR: botocore.errorfactory.TextSizeLimitExceededException: An error occurred (TextSizeLimitExceededException) when calling the DetectPiiEntities operation: Input text size exceeds limit. Max length of request text allowed is 5000 bytes while in this request the text size is 7776 bytes

1 Answer

There is no way to increase this limit. For input text larger than 5000 bytes, you can split the text into chunks of at most 5000 bytes each, call the API on each chunk, and then aggregate the results. Keep some overlap between consecutive chunks so that each chunk carries over some context from the previous one.
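The chunk-and-aggregate approach described above could look roughly like the sketch below. It is an illustration under stated assumptions, not the Comprehend team's implementation: the helper names `split_with_overlap`, `detect_pii`, and `redact` are hypothetical, the 200-character overlap is an arbitrary starting value, and `client` is assumed to be a boto3 Comprehend client (anything with a compatible `detect_pii_entities` method works). The 5000-byte limit is taken from the error message in the question.

```python
from typing import Dict, Iterator, List, Tuple

MAX_BYTES = 5000      # limit quoted in the error message
OVERLAP_CHARS = 200   # assumed overlap between chunks; tune for your data


def split_with_overlap(text: str, max_bytes: int = MAX_BYTES,
                       overlap: int = OVERLAP_CHARS) -> Iterator[Tuple[int, str]]:
    """Yield (start_offset, chunk) pairs; each chunk is <= max_bytes in UTF-8."""
    if max_bytes < 4:
        raise ValueError("max_bytes must fit at least one UTF-8 character")
    start, n = 0, len(text)
    while start < n:
        size, end = 0, start
        # Grow the chunk one character at a time until the byte budget is used.
        while end < n:
            char_size = len(text[end].encode("utf-8"))
            if size + char_size > max_bytes:
                break
            size += char_size
            end += 1
        yield start, text[start:end]
        if end >= n:
            break
        # Step back by `overlap` characters, but always make forward progress.
        start = max(end - overlap, start + 1)


def detect_pii(text: str, client, language_code: str = "en") -> List[Dict]:
    """Run detect_pii_entities per chunk and map offsets back to the full text."""
    seen, entities = set(), []
    for offset, chunk in split_with_overlap(text):
        response = client.detect_pii_entities(Text=chunk, LanguageCode=language_code)
        for ent in response["Entities"]:
            begin = ent["BeginOffset"] + offset
            end = ent["EndOffset"] + offset
            key = (ent["Type"], begin, end)
            if key not in seen:  # drop exact duplicates found in the overlap
                seen.add(key)
                entities.append({**ent, "BeginOffset": begin, "EndOffset": end})
    return sorted(entities, key=lambda e: e["BeginOffset"])


def redact(text: str, entities: List[Dict], mask: str = "*") -> str:
    """Replace every character covered by an entity with `mask`."""
    chars = list(text)
    for ent in entities:
        for i in range(ent["BeginOffset"], ent["EndOffset"]):
            chars[i] = mask
    return "".join(chars)
```

With boto3 installed and credentials configured, usage would be along the lines of `client = boto3.client("comprehend")` followed by `redact(text, detect_pii(text, client))`. Only exact duplicates from the overlap region are dropped here; an entity that is cut by a chunk boundary may still be reported partially in one chunk and fully in the next, which the character-level masking in `redact` tolerates.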

For reference, the Comprehend team itself has published a similar solution: https://github.com/aws-samples/amazon-comprehend-s3-object-lambda-functions/blob/main/src/processors.py#L172