
AWS Textract: extract the metadata and confidence score

Hi all, I have extracted the document metadata from an AWS Textract asynchronous call using the Java SDK, but the metadata is split into multiple blocks and it's huge.

How can I extract the confidence score, the value, and its field name separately using Java code? I want a result something like the one below:


[{
  "Field" : "FirstName",
  "Value" : "XXXXX",
  "confidence Score" : "98.88"
},
{
  "Field" : "LastName",
  "Value" : "XXXXX",
  "confidence Score" : "65.98"
}]

Could anyone please suggest how to extract the field, value, and its confidence score from the AWS Textract document metadata?

Does anyone have any idea about this?

AWS provides an example of mapping key-value pairs in Python. You can use that code to understand the logic and then write your own version in Java.

Source: https://docs.aws.amazon.com/textract/latest/dg/examples-extract-kvp.html
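For anyone who wants the logic of that Python example in Java, here is a minimal sketch. It assumes the usual Textract layout: KEY_VALUE_SET blocks of entity type KEY point to their VALUE block through a VALUE relationship, and both point to their WORD blocks through CHILD relationships. `Blk` and `Rel` are hypothetical stand-ins that only mirror the shape of the SDK's `Block` and `Relationship` so the example runs without the AWS SDK. The sample data is made up, and selection elements (checkboxes) are ignored.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Stand-in types that only mirror the shape of Textract's Block; NOT the SDK classes.
class KvpLogicSketch {

    static final class Rel {
        final String type;        // "VALUE" or "CHILD"
        final List<String> ids;   // Ids of the related blocks
        Rel(String type, String... ids) { this.type = type; this.ids = Arrays.asList(ids); }
    }

    static final class Blk {
        final String id, blockType, entityType, text;  // the real API has a list of entity types
        final float confidence;
        final List<Rel> rels;
        Blk(String id, String blockType, String entityType, String text, float confidence, Rel... rels) {
            this.id = id; this.blockType = blockType; this.entityType = entityType;
            this.text = text; this.confidence = confidence; this.rels = Arrays.asList(rels);
        }
    }

    /** Index every block by Id so relationships can be resolved. */
    static Map<String, Blk> indexById(List<Blk> blocks) {
        Map<String, Blk> byId = new HashMap<>();
        for (Blk b : blocks) byId.put(b.id, b);
        return byId;
    }

    /** Follow the VALUE relationship of a KEY block; returns null if there is none. */
    static Blk findValueBlock(Blk key, Map<String, Blk> byId) {
        for (Rel r : key.rels) {
            if (!"VALUE".equals(r.type)) continue;
            for (String id : r.ids) {
                Blk candidate = byId.get(id);
                if (candidate != null) return candidate;
            }
        }
        return null;
    }

    /** Join the text of the WORD children of a KEY or VALUE block. */
    static String getText(Blk block, Map<String, Blk> byId) {
        StringBuilder sb = new StringBuilder();
        for (Rel r : block.rels) {
            if (!"CHILD".equals(r.type)) continue;
            for (String id : r.ids) {
                Blk child = byId.get(id);
                if (child != null && "WORD".equals(child.blockType)) {
                    if (sb.length() > 0) sb.append(' ');
                    sb.append(child.text);
                }
            }
        }
        return sb.toString();
    }

    /** One {field, value, confidence} row per KEY block that has a VALUE block. */
    static List<String[]> extractPairs(List<Blk> blocks) {
        Map<String, Blk> byId = indexById(blocks);
        List<String[]> rows = new ArrayList<>();
        for (Blk b : blocks) {
            boolean isKey = "KEY_VALUE_SET".equals(b.blockType) && "KEY".equals(b.entityType);
            if (!isKey) continue;
            Blk value = findValueBlock(b, byId);
            if (value == null) continue;  // key without a value: skip it
            rows.add(new String[] {
                getText(b, byId), getText(value, byId), String.valueOf(value.confidence) });
        }
        return rows;
    }

    /** Made-up sample: one key "FirstName" whose value is the two words "Mary Ann". */
    static List<Blk> sampleBlocks() {
        return Arrays.asList(
            new Blk("k1", "KEY_VALUE_SET", "KEY", null, 97.5f, new Rel("VALUE", "v1"), new Rel("CHILD", "w1")),
            new Blk("v1", "KEY_VALUE_SET", "VALUE", null, 98.88f, new Rel("CHILD", "w2", "w3")),
            new Blk("w1", "WORD", null, "FirstName", 99.0f),
            new Blk("w2", "WORD", null, "Mary", 99.0f),
            new Blk("w3", "WORD", null, "Ann", 99.0f));
    }

    public static void main(String[] args) {
        for (String[] row : extractPairs(sampleBlocks())) {
            System.out.println(String.join(" | ", row));
        }
    }
}
```

With the real SDK you would read the same information from the `Block` objects returned by the asynchronous result call. The confidence here is taken from the VALUE block; whether you want that or the KEY block's confidence is your choice.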

I have just started using AWS Textract from Java too, and wow, what a great tool! I have included code in my answer at this link if you would like to take a look :)

It extracts the keys and values. I suggest you create a model with the key, value, and confidence score, and then create an object for each key-value pair:

    // Needs java.util.ArrayList, java.util.List, java.util.Optional and the Textract Block model class.
    // KVPair, Property, findValueBlock and getText are my own helpers and are not shown here.
    public static ArrayList<KVPair> getKVObjects(List<Block> keyMap, List<Block> valueMap, List<Block> blockMap) {
        ArrayList<KVPair> labelValues = new ArrayList<>();

        for (Block keyBlock : keyMap) {
            Block valueBlock = findValueBlock(keyBlock, valueMap);
            if (valueBlock == null) {
                // Guard in case no value block is found for this key; skipping avoids a NullPointerException.
                continue;
            }

            String key = getText(keyBlock, blockMap);
            Float top = valueBlock.getGeometry().getBoundingBox().getTop();
            Float left = valueBlock.getGeometry().getBoundingBox().getLeft();
            Float confidenceScore = valueBlock.getConfidence();

            // Collect the value text, its position on the page and its confidence score.
            Property property = new Property();
            property.setValue(getText(valueBlock, blockMap));
            property.setLocationLeft(left);
            property.setLocationTop(top);
            property.setConfidenceScore(confidenceScore);

            // Reuse the existing entry if this label was already seen, otherwise create a new one.
            Optional<KVPair> label = labelValues.stream()
                    .filter(x -> x.getLabel().equals(key))
                    .findFirst();
            if (label.isPresent()) {
                label.get().setProperties(property);
            } else {
                KVPair kvPair = new KVPair();
                kvPair.setLabel(key);
                kvPair.setProperties(property);
                labelValues.add(kvPair);
            }
        }

        return labelValues;
    }
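The snippet above uses `KVPair` and `Property` without showing them. As a rough sketch of what such a model could look like, and how to flatten it into the JSON shape asked for in the question, something like the following should work. Everything here is an assumption rather than the original classes: the field names, the idea that `setProperties` appends to a list so repeated labels keep all their values, and the hand-rolled `toJson` helper, which only escapes backslashes and quotes (a JSON library such as Jackson or Gson would be the safer choice in real code). The position fields are left out for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical model classes; the answer uses KVPair and Property but does not show them.
class KvModelSketch {

    static final class Property {
        private String value;
        private Float confidenceScore;
        String getValue() { return value; }
        void setValue(String value) { this.value = value; }
        Float getConfidenceScore() { return confidenceScore; }
        void setConfidenceScore(Float confidenceScore) { this.confidenceScore = confidenceScore; }
    }

    static final class KVPair {
        private String label;
        private final List<Property> properties = new ArrayList<>();
        String getLabel() { return label; }
        void setLabel(String label) { this.label = label; }
        // Assumption: "setProperties" appends, so a repeated label keeps every value.
        void setProperties(Property property) { properties.add(property); }
        List<Property> getProperties() { return properties; }
    }

    /** Convenience factory for the examples below. */
    static KVPair pair(String label, String value, float confidence) {
        Property p = new Property();
        p.setValue(value);
        p.setConfidenceScore(confidence);
        KVPair kv = new KVPair();
        kv.setLabel(label);
        kv.setProperties(p);
        return kv;
    }

    /** Minimal escaping for backslashes and double quotes only. */
    static String escape(String s) {
        return s == null ? "" : s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** One JSON object per (label, property) combination, in the shape from the question. */
    static String toJson(List<KVPair> pairs) {
        StringBuilder sb = new StringBuilder("[");
        boolean first = true;
        for (KVPair pair : pairs) {
            for (Property p : pair.getProperties()) {
                if (!first) sb.append(",");
                first = false;
                sb.append("{\"Field\":\"").append(escape(pair.getLabel()))
                  .append("\",\"Value\":\"").append(escape(p.getValue()))
                  .append("\",\"confidence Score\":\"")
                  .append(String.format(Locale.ROOT, "%.2f", p.getConfidenceScore()))
                  .append("\"}");
            }
        }
        return sb.append("]").toString();
    }

    public static void main(String[] args) {
        List<KVPair> pairs = new ArrayList<>();
        pairs.add(pair("FirstName", "XXXXX", 98.88f));
        pairs.add(pair("LastName", "XXXXX", 65.98f));
        System.out.println(toJson(pairs));
    }
}
```

In practice you would pass the list returned by `getKVObjects` to `toJson` instead of building the pairs by hand.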

AWS-Textract-Key-Value-Pair Java - thread "main" java.lang.NullPointerException
