
DynamoDB advanced scan - JAVA

I need to scan a table by a field nested inside a JSON document that is stored in a single attribute. Unfortunately I can't find a Java example anywhere and don't know whether this is even possible.

Here is my data. The JSON below represents one object, i.e. one item (row) in DynamoDB. It maps to three Java classes: a main class that holds a list of city objects plus a few scalar fields, a city class that holds a list of road objects, and the road class itself.

Is it possible to scan the table and find the items with mainName = "xyz" that also have a city entry named "Rockingham"?

{
  "Id": "9",
  "mainName": "xyz",
  "floatVariable": 228.3,
  "city": [
    {
      "name": "Rockingham",
      "road": [
        {
          "roadName": "Traci",
          "roadLength": 118
        },
        {
          "roadName": "Watkins",
          "roadLength": 30
        }
      ]
    }
  ],
  "house": { "huseName": "Wendy Carson" }
}

I have something like this, which works, but it is not enough to select the right data:

        Table table = dynamoDB.getTable(tableName);

        Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
        expressionAttributeValues.put(":pr", 300);

        ItemCollection<ScanOutcome> items = table.scan(
                "floatVariable < :pr", //FilterExpression
                "Id, mainName, floatVariable, city" , //ProjectionExpression
                null, //ExpressionAttributeNames - not used in this example
                expressionAttributeValues);

        System.out.println("Scan of " + tableName + " for items with a price less than 300.");
        Iterator<Item> iterator = items.iterator();
        while (iterator.hasNext()) {
            System.out.println(iterator.next().toJSONPretty());
        }

I saw a PHP example along these lines, but unfortunately the equivalent does not work in Java:

    ItemCollection<ScanOutcome> items = table.scan(
            " cites.name = :dupa  ", //FilterExpression
            "Id, mainName, floatVariable, city", //ProjectionExpression
            null, //ExpressionAttributeNames - not used in this example
            expressionAttributeValues);

Is the city attribute a list of varying length? If you want to use server side filtering, you'll need to enumerate each of the elements of the list you want to check.
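If you go the enumeration route, the expression can be generated instead of typed by hand. This is only a minimal sketch under a few assumptions: the class `CityFilterExpression`, the method `build` and the placeholders `:mn`, `:cn` and `#n` are names I made up for illustration, and you need to know an upper bound on how many cities one item can hold. `#n` stands in for `name`, which is on DynamoDB's reserved-word list and so should not appear literally in an expression.

```java
import java.util.StringJoiner;

class CityFilterExpression {

    // Builds: mainName = :mn AND (city[0].#n = :cn OR city[1].#n = :cn OR ...)
    // maxCities is an assumed upper bound on the length of the "city" list.
    static String build(int maxCities) {
        if (maxCities < 1) {
            throw new IllegalArgumentException("maxCities must be at least 1");
        }
        StringJoiner anyCity = new StringJoiner(" OR ", "(", ")");
        for (int i = 0; i < maxCities; i++) {
            // One comparison per list index; #n is mapped to "name" by the caller.
            anyCity.add("city[" + i + "].#n = :cn");
        }
        return "mainName = :mn AND " + anyCity;
    }

    public static void main(String[] args) {
        System.out.println(build(3));
    }
}
```

The resulting string would go in as the FilterExpression argument of the `table.scan(...)` call from the question, with the ExpressionAttributeNames map set to `#n` -> `name` instead of null, and `:mn` -> "xyz" and `:cn` -> "Rockingham" in the values map. As far as I know, items with fewer cities than `maxCities` just fail the comparisons on the missing indexes, but check that against your own data. Keep in mind this is still a Scan, so the whole table is read before the filter is applied.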

Alternatively, you can maintain a separate list of city names and use the "contains" operator on that attribute.
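For the separate-list approach, something has to fill that attribute whenever an item is written. Here is a rough sketch of deriving it from the item's existing structure, working on a plain `Map<String, Object>`. The attribute name `cityNames`, the class `CityNameIndex` and the placeholders `:mn` and `:cn` are my own assumptions, not anything from your code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CityNameIndex {

    // Filter to use once every item carries the extra (hypothetical) cityNames attribute.
    static final String FILTER = "mainName = :mn AND contains(cityNames, :cn)";

    // Collects the "name" of every element of the item's "city" list.
    // Entries that are not maps, or that have no string "name", are skipped.
    static List<String> cityNames(Map<String, Object> item) {
        List<String> names = new ArrayList<>();
        Object cities = item.get("city");
        if (cities instanceof List<?>) {
            for (Object city : (List<?>) cities) {
                if (city instanceof Map<?, ?>) {
                    Object name = ((Map<?, ?>) city).get("name");
                    if (name instanceof String) {
                        names.add((String) name);
                    }
                }
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Object> city = new HashMap<>();
        city.put("name", "Rockingham");
        Map<String, Object> item = new HashMap<>();
        item.put("mainName", "xyz");
        item.put("city", Arrays.asList(city));
        // Store the derived list next to the nested data before writing the item.
        item.put("cityNames", cityNames(item));
        System.out.println(item.get("cityNames"));
    }
}
```

With `cityNames` stored on each item, the filter no longer depends on how long the city list is, at the cost of keeping the two attributes in sync on every write.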

If you are querying by city.name your data model has to take that into consideration. I would suggest having one city per table item:

{
  "Id": "9",
  "mainName": "xyz",
  "cityName": "Rockingham",
  "floatVariable": 228.3,
  "road": [
    {
      "roadName": "Traci",
      "roadLength": 118
    },
    {
      "roadName": "Watkins",
      "roadLength": 30
    }
  ]
}

The Hash Key would be the cityName attribute, and the Range Key any other attribute that makes the primary key (hash + range key) unique, for instance Id.

QuerySpec querySpec = new QuerySpec()
                        .withHashKey("cityName", "Rockingham")
                        .withProjectionExpression("Id, mainName, floatVariable, road");

ItemCollection<QueryOutcome> items = table.query(querySpec);
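Moving to this layout means turning each existing item into one item per city. A rough sketch of that transformation on plain Java maps follows. The class `CityItemSplitter` and its `split` method are hypothetical names, and I am assuming the item is available as a `Map<String, Object>` with the same nesting as the JSON in the question.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class CityItemSplitter {

    // Returns one flattened copy of the item per element of its "city" list.
    // Each copy drops "city" and gains "cityName" (plus "road", when present).
    // The input map is not modified.
    static List<Map<String, Object>> split(Map<String, Object> item) {
        List<Map<String, Object>> result = new ArrayList<>();
        Object cities = item.get("city");
        if (!(cities instanceof List<?>)) {
            return result; // nothing to split
        }
        for (Object entry : (List<?>) cities) {
            if (!(entry instanceof Map<?, ?>)) {
                continue; // skip malformed entries
            }
            Map<?, ?> city = (Map<?, ?>) entry;
            Map<String, Object> flat = new LinkedHashMap<>(item);
            flat.remove("city");
            flat.put("cityName", city.get("name"));
            Object road = city.get("road");
            if (road != null) {
                flat.put("road", road);
            }
            result.add(flat);
        }
        return result;
    }
}
```

Every copy keeps the original Id, so an item with several cities yields several items whose (cityName, Id) pairs are still distinct. Unlike the JSON above, this sketch also carries over any other top-level attributes such as house; drop them if you don't want them duplicated.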

As a second option you could define two tables:

Table A

Primary Key Type: Hash Key + Range Key

Hash Key: cityName

Range Key: Id (reference to the Table B item)

{
    "cityName": "Rockingham",
    "Id": "9",
    "road": [
        {
          "roadName": "Traci",
          "roadLength": 118
        },
        {
          "roadName": "Watkins",
          "roadLength": 30
        }
    ]
}

Table B

Primary Key Type: Hash Key

Hash Key : Id

{
    "Id": "9",
    "mainName": "xyz",
    "floatVariable": 228.3
}

After retrieving the city items, you would look up the matching Table B items by Id via Query, GetItem or BatchGetItem.

Both options allow you to use the Query operation instead of Scan, enabling simpler queries with better performance and lower costs:

A Scan operation always scans the entire table or secondary index, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set. Avoid using a Scan operation on a large table or index with a filter that removes many results, if possible. Also, as a table or index grows, the Scan operation slows. The Scan operation examines every item for the requested values, and can use up the provisioned throughput for a large table or index in a single operation. For faster response times, design your tables and indexes so that your applications can use Query instead of Scan. (For tables, you can also consider using the GetItem and BatchGetItem APIs.).

Source: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Scan.html
