
Textract in R (paws) without S3Object

When using Textract from the paws package in R, the start_document_analysis call requires an S3Object (bucket and object name) in DocumentLocation.

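textract <- paws::textract()

# bucket and file hold the S3 bucket name and object key
textract$start_document_analysis(
    DocumentLocation = list(
      S3Object = list(Bucket = bucket, Name = file)
    ),
    # FeatureTypes is a required parameter of this call
    FeatureTypes = list("TABLES", "FORMS")
  )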

Is it possible to use DocumentLocation without an S3Object? I would prefer to just provide the path to a local PDF.

The start_document_analysis API only accepts an S3 object as input. Unlike the analyze_document API, it cannot take the document content directly (base64-encoded on the CLI, raw bytes in paws). See also the CLI docs at https://docs.aws.amazon.com/cli/latest/reference/textract/start-document-analysis.html
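For comparison, here is a minimal sketch of the analyze_document route with a local file. The helper names read_document_bytes and analyze_local_document are made up for illustration, and the feature types are just an example. It assumes paws credentials are already configured. Note that the synchronous API has stricter limits on document size and page count, so a multi-page PDF may still need the asynchronous route.

```r
# Read a local file into a raw vector, the form paws expects for `Bytes`.
# (Hypothetical helper name, not part of paws.)
read_document_bytes <- function(path) {
  stopifnot(file.exists(path))
  readBin(path, what = "raw", n = file.size(path))
}

# Hypothetical helper: send the local document to the synchronous API.
analyze_local_document <- function(path, features = list("TABLES", "FORMS")) {
  textract <- paws::textract()
  textract$analyze_document(
    Document = list(Bytes = read_document_bytes(path)),
    FeatureTypes = features
  )
}
```

Calling analyze_local_document("scan.png") would return the parsed response as an R list. There is no equivalent Bytes field in DocumentLocation for start_document_analysis.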

So unfortunately you have to use S3 as a place to (temporarily) store your document. Of course, you can write your own logic to upload it first :). A good tutorial on that can be found at https://www.gormanalysis.com/blog/connecting-to-aws-s3-with-r/ Since you have already set up credentials etc., you can skip most of the early steps and start at step 3, for example.
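A rough sketch of that upload-then-analyze logic using paws for both S3 and Textract. The function names s3_document_location and analyze_pdf_via_s3, the default key, the feature types and the polling interval are all assumptions for illustration. It also assumes the bucket already exists and your credentials allow put/delete on it.

```r
# Build the DocumentLocation structure for an object in S3.
# (Hypothetical helper name.)
s3_document_location <- function(bucket, key) {
  list(S3Object = list(Bucket = bucket, Name = key))
}

# Hypothetical helper: upload a local PDF, run the async analysis,
# wait for it to finish, and remove the temporary S3 object afterwards.
analyze_pdf_via_s3 <- function(path, bucket, key = basename(path),
                               features = list("TABLES", "FORMS"),
                               poll_seconds = 5) {
  s3 <- paws::s3()
  textract <- paws::textract()

  # Upload the local file as raw bytes.
  s3$put_object(
    Bucket = bucket,
    Key = key,
    Body = readBin(path, what = "raw", n = file.size(path))
  )
  # Clean up the temporary object when the function exits.
  on.exit(s3$delete_object(Bucket = bucket, Key = key), add = TRUE)

  job <- textract$start_document_analysis(
    DocumentLocation = s3_document_location(bucket, key),
    FeatureTypes = features
  )

  # Poll until the job leaves the IN_PROGRESS state.
  repeat {
    result <- textract$get_document_analysis(JobId = job$JobId)
    if (result$JobStatus != "IN_PROGRESS") break
    Sys.sleep(poll_seconds)
  }
  result
}
```

The returned list holds only the first page of results. For larger documents you would keep calling get_document_analysis with the returned NextToken until it is empty.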
