How to read data from an Amazon S3 bucket and call an AWS service

Question

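I am able to call AWS Textract to read an image from my local path. How can I integrate this Textract code with the S3 bucket code so that it reads images uploaded to the S3 bucket I created?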

Working Textract code that extracts text from an image at a local path:

package aws.cloud.work;

import java.io.File;
import java.io.FileInputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.io.InputStream;

import org.json.simple.JSONArray;
import org.json.simple.JSONObject;

import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.regions.Regions;
import com.amazonaws.services.textract.AmazonTextract;
import com.amazonaws.services.textract.AmazonTextractClientBuilder;
import com.amazonaws.services.textract.model.DetectDocumentTextRequest;
import com.amazonaws.services.textract.model.DetectDocumentTextResult;
import com.amazonaws.services.textract.model.Document;
import com.amazonaws.util.IOUtils;

public class TextractDemo {

    static AmazonTextractClientBuilder clientBuilder = AmazonTextractClientBuilder.standard()
            .withRegion(Regions.US_EAST_1);

    private static FileWriter file;

    public static void main(String[] args) throws IOException {

//AWS Credentials to access AWS Textract services

        clientBuilder.setCredentials(new AWSStaticCredentialsProvider(
                new BasicAWSCredentials("Access Key", "Secret key")));

// Path of the image to analyze. This is the part that should read from S3 instead.

        String document = "C:\\Users\\image-local-path\\sampleTT.jpg";
        ByteBuffer imageBytes;

//Code to use AWS Textract services

        try (InputStream inputStream = new FileInputStream(new File(document))) {
            imageBytes = ByteBuffer.wrap(IOUtils.toByteArray(inputStream));
        }
        AmazonTextract client = clientBuilder.build();
        DetectDocumentTextRequest request = new DetectDocumentTextRequest()
                .withDocument(new Document().withBytes(imageBytes));

        /*
         * DetectDocumentTextResult result = client.detectDocumentText(request);
         * System.out.println(result); result.getBlocks().forEach(block ->{
         * if(block.getBlockType().equals("LINE")) System.out.println("text is "+
         * block.getText() + " confidence is "+ block.getConfidence());
         */ 

//      
        DetectDocumentTextResult result = client.detectDocumentText(request);
        System.out.println(result);
        JSONObject obj = new JSONObject();
        result.getBlocks().forEach(block -> {
            if ("LINE".equals(block.getBlockType())) {
                System.out.println("text is " + block.getText() + " confidence is " + block.getConfidence());
            }
            // PAGE blocks carry no text; skip them so the JSON gets no "null" entry
            if (block.getText() != null) {
                JSONArray fields = new JSONArray();
                fields.add(block.getText() + " , " + block.getConfidence());
                obj.put(block.getText(), fields);
            }
        });

        // Write the results as a JSON string to sample.txt.
        // try-with-resources closes the writer even if write() fails, and avoids
        // a NullPointerException when the file cannot be opened.
        try (FileWriter writer = new FileWriter("/Users/output-path/sample.txt")) {
            writer.write(obj.toJSONString());
        } catch (IOException e) {
            e.printStackTrace();
        }

    }
}

This is a sample of the console output, which returns each "text" value and its corresponding "confidence score".

The S3 bucket integration code I managed to find in the documentation:

        String document = "sampleTT.jpg";
        String bucket = "textract-images";

        AmazonS3 s3client = AmazonS3ClientBuilder.standard()
                .withEndpointConfiguration( 
                        new EndpointConfiguration("https://s3.amazonaws.com","us-east-1"))
                .build();
        
               
        // Get the document from S3
        com.amazonaws.services.s3.model.S3Object s3object = s3client.getObject(bucket, document);
        S3ObjectInputStream inputStream = s3object.getObjectContent();
        BufferedImage image = ImageIO.read(inputStream);

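(Edited) Thanks to @smac2020, I now have working Rekognition code that reads from the S3 bucket in my AWS console and runs the Rekognition service I referenced. However, I have not been able to modify it and merge it with the Textract source code.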

package com.amazonaws.samples;

import com.amazonaws.auth.AWSCredentialsProvider;
import com.amazonaws.auth.AWSStaticCredentialsProvider;
import com.amazonaws.auth.BasicAWSCredentials;
import com.amazonaws.services.rekognition.AmazonRekognition;
import com.amazonaws.services.rekognition.AmazonRekognitionClientBuilder;
import com.amazonaws.services.rekognition.model.AmazonRekognitionException;
import com.amazonaws.services.rekognition.model.DetectLabelsRequest;
import com.amazonaws.services.rekognition.model.DetectLabelsResult;
import com.amazonaws.services.rekognition.model.Image;
import com.amazonaws.services.rekognition.model.Label;
import com.amazonaws.services.rekognition.model.S3Object;
import java.util.List;

public class DetectLabels {

 public static void main(String[] args) throws Exception {

    String photo = "sampleTT.jpg";
    String bucket = "Textract-bucket";

    
    
//    AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard().withRegion("ap-southeast-1").build();

    AWSCredentialsProvider credentialsProvider = new AWSStaticCredentialsProvider (new BasicAWSCredentials("Access Key", "Secret Key"));
    AmazonRekognition rekognitionClient = AmazonRekognitionClientBuilder.standard().withCredentials(credentialsProvider).withRegion("ap-southeast-1").build();

    
    DetectLabelsRequest request = new DetectLabelsRequest()
         .withImage(new Image()
         .withS3Object(new S3Object()
         .withName(photo).withBucket(bucket)))
         .withMaxLabels(10)
         .withMinConfidence(75F);

    try {
       DetectLabelsResult result = rekognitionClient.detectLabels(request);
       List <Label> labels = result.getLabels();

       System.out.println("Detected labels for " + photo);
       for (Label label: labels) {
          System.out.println(label.getName() + ": " + label.getConfidence().toString());
       }
    } catch(AmazonRekognitionException e) {
       e.printStackTrace();
    }
 }
}

Answer 1

It looks like you are trying to read an Amazon S3 object from a Spring Boot application and then pass that byte array to DetectDocumentTextRequest.
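
As a minimal sketch of that step for the V1 SDK used in the question: `S3ObjectInputStream` is an ordinary `java.io.InputStream`, so its content can be drained into a `ByteBuffer` and passed to `new Document().withBytes(...)` just like the local file. The helper below uses only the JDK; the names `StreamBytes` and `toByteBuffer` are made up for illustration and are not part of the AWS SDK.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;

// Hypothetical helper (not an AWS SDK class): drains any InputStream into a ByteBuffer.
class StreamBytes {

    // Reads every remaining byte from the stream and wraps the result in a ByteBuffer.
    // The caller stays responsible for closing the stream.
    static ByteBuffer toByteBuffer(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] chunk = new byte[8192];
        int read;
        try {
            while ((read = in.read(chunk)) != -1) {
                out.write(chunk, 0, read);
            }
        } catch (IOException e) {
            // Rethrow unchecked so callers inside lambdas do not need a throws clause
            throw new UncheckedIOException(e);
        }
        return ByteBuffer.wrap(out.toByteArray());
    }
}
```

In the question's code this would stand in for the `FileInputStream` block, for example `imageBytes = StreamBytes.toByteBuffer(s3object.getObjectContent());`, assuming `s3object` comes from `s3client.getObject(bucket, document)`.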

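There is a tutorial that shows a very similar use case, in which a Spring Boot application reads the bytes of an Amazon S3 object and passes them to the Amazon Rekognition service (rather than Textract).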

The Java code is:

// Get the byte[] from this AWS S3 object.
public byte[] getObjectBytes (String bucketName, String keyName) {

    s3 = getClient();

    try {
        GetObjectRequest objectRequest = GetObjectRequest
                .builder()
                .key(keyName)
                .bucket(bucketName)
                .build();
        
        ResponseBytes<GetObjectResponse> objectBytes = s3.getObjectAsBytes(objectRequest);
        byte[] data = objectBytes.asByteArray();
        return data;

    } catch (S3Exception e) {
        System.err.println(e.awsErrorDetails().errorMessage());
        System.exit(1);
    }
    return null;
}
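
Once you have the byte[], handing it to Textract could look roughly like the sketch below. It assumes the `textract` module of the AWS SDK for Java V2 is on the classpath; the class `TextractFromBytes` and its two methods are hypothetical helpers, not code from the tutorial.

```java
import software.amazon.awssdk.core.SdkBytes;
import software.amazon.awssdk.services.textract.TextractClient;
import software.amazon.awssdk.services.textract.model.Block;
import software.amazon.awssdk.services.textract.model.BlockType;
import software.amazon.awssdk.services.textract.model.DetectDocumentTextRequest;
import software.amazon.awssdk.services.textract.model.DetectDocumentTextResponse;
import software.amazon.awssdk.services.textract.model.Document;

// Hypothetical helper class for illustration only.
class TextractFromBytes {

    // Wraps the raw image bytes in a Textract request. Makes no network call.
    static DetectDocumentTextRequest buildRequest(byte[] data) {
        Document document = Document.builder()
                .bytes(SdkBytes.fromByteArray(data))
                .build();
        return DetectDocumentTextRequest.builder()
                .document(document)
                .build();
    }

    // Sends the request and prints each detected line with its confidence score.
    static void printLines(TextractClient textract, byte[] data) {
        DetectDocumentTextResponse response = textract.detectDocumentText(buildRequest(data));
        for (Block block : response.blocks()) {
            if (block.blockType() == BlockType.LINE) {
                System.out.println("text is " + block.text() + " confidence is " + block.confidence());
            }
        }
    }
}
```

You would call it with the result of the method above, for example `TextractFromBytes.printLines(textractClient, getObjectBytes(bucketName, keyName))`, where `textractClient` is a `TextractClient` you build for your Region.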

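See this AWS development article to learn how to build a Spring Boot application with this functionality:

Creating an example AWS photo analyzer application using the AWS SDK for Java

This example uses the AWS SDK for Java V2. If you are not familiar with the latest SDK version, I recommend that you start here:

Get started with the AWS SDK for Java 2.x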
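
If you would rather stay on the V1 SDK from the question, Textract's `Document` also accepts an S3 reference, the same way the Rekognition `Image` does in your edit, so nothing has to be downloaded first. A hedged sketch follows; `TextractFromS3`, `buildRequest`, and `printLines` are hypothetical names, and it assumes the bucket lives in the same Region as the Textract client.

```java
import com.amazonaws.services.textract.AmazonTextract;
import com.amazonaws.services.textract.model.DetectDocumentTextRequest;
import com.amazonaws.services.textract.model.DetectDocumentTextResult;
import com.amazonaws.services.textract.model.Document;
// Note: this is Textract's S3Object, not the Rekognition or S3 class of the same name.
import com.amazonaws.services.textract.model.S3Object;

// Hypothetical helper class for illustration only.
class TextractFromS3 {

    // Builds a request that points at the S3 object instead of embedding bytes.
    // Makes no network call.
    static DetectDocumentTextRequest buildRequest(String bucket, String key) {
        return new DetectDocumentTextRequest()
                .withDocument(new Document()
                        .withS3Object(new S3Object()
                                .withBucket(bucket)
                                .withName(key)));
    }

    // Runs text detection on the S3 object and prints each detected line.
    static void printLines(AmazonTextract client, String bucket, String key) {
        DetectDocumentTextResult result = client.detectDocumentText(buildRequest(bucket, key));
        result.getBlocks().forEach(block -> {
            if ("LINE".equals(block.getBlockType())) {
                System.out.println("text is " + block.getText() + " confidence is " + block.getConfidence());
            }
        });
    }
}
```

From the question's `main`, this could be called as `TextractFromS3.printLines(clientBuilder.build(), "textract-images", "sampleTT.jpg")`. The credentials used by the client need permission to read the object as well as to call Textract.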
