DynamoDB advanced scan - Java

I need to scan a table by a subfield of a JSON document that is stored in a single column. Unfortunately, I can't find a Java example anywhere and don't know whether this is possible.

This is my data; the JSON below represents one object, i.e. one row in DynamoDB. It maps to 3 Java classes:

- the main class, which contains the city class and some string fields
- city, which contains the road class

Is it possible to scan the database and find the records with mainName = "xyz" that have a city record named "Rockingham"?

{
"Id": "9",
"mainName": "xyz",
"floatVariable": 228.3,
"city": [
{
  "name": "Rockingham",
  "road": [
    {
      "roadName": "Traci",
      "roadLength": 118
    },
    {
      "roadName": "Watkins",
      "roadLength": 30
    }
  ]
 }
],

"house": { "huseName": "Wendy Carson" }
}

I have something like this, and it works, but it is not enough to query the correct data.

        Table table = dynamoDB.getTable(tableName);

        Map<String, Object> expressionAttributeValues = new HashMap<String, Object>();
        expressionAttributeValues.put(":pr", 300);

        ItemCollection<ScanOutcome> items = table.scan(
                "floatVariable < :pr", //FilterExpression
                "Id, mainName, floatVariable, city" , //ProjectionExpression
                null, //ExpressionAttributeNames - not used in this example
                expressionAttributeValues);

        System.out.println("Scan of " + tableName + " for items with a price less than 300.");
        Iterator<Item> iterator = items.iterator();
        while (iterator.hasNext()) {
            System.out.println(iterator.next().toJSONPretty());
        }

I saw an example in PHP that looked something like this, but unfortunately it does not work in Java.

    ItemCollection<ScanOutcome> items = table.scan(
            " cites.name = :dupa  ", //FilterExpression
            "Id, mainName, floatVariable, city", //ProjectionExpression
            null, //ExpressionAttributeNames - not used in this example
            expressionAttributeValues);

Is the city attribute a list of varying length? If you want to use server-side filtering, you'll need to enumerate each element of the list that you want to check.
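To illustrate the enumeration idea, here is a minimal sketch that builds such a filter for the first few list positions. It assumes you know an upper bound on the length of the city list; the class and method names (CityFilterSketch, filterExpression, names, values) are made up for this example. Since name is, as far as I know, a DynamoDB reserved word, the sketch aliases it as #n through ExpressionAttributeNames.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the AWS SDK: builds a scan filter that
// checks the first maxCities entries of the "city" list for a city name.
final class CityFilterSketch {

    // For maxCities = 2 this returns:
    // mainName = :main AND (city[0].#n = :city OR city[1].#n = :city)
    static String filterExpression(int maxCities) {
        if (maxCities < 1) {
            throw new IllegalArgumentException("maxCities must be >= 1");
        }
        StringJoiner cityChecks = new StringJoiner(" OR ", "(", ")");
        for (int i = 0; i < maxCities; i++) {
            cityChecks.add("city[" + i + "].#n = :city");
        }
        return "mainName = :main AND " + cityChecks;
    }

    // ExpressionAttributeNames: maps the #n placeholder to the real attribute.
    static Map<String, String> names() {
        Map<String, String> names = new HashMap<>();
        names.put("#n", "name");
        return names;
    }

    // ExpressionAttributeValues for the two placeholders used above.
    static Map<String, Object> values(String mainName, String cityName) {
        Map<String, Object> values = new HashMap<>();
        values.put(":main", mainName);
        values.put(":city", cityName);
        return values;
    }

    public static void main(String[] args) {
        System.out.println(filterExpression(3));
        // With the Document API call from the question, these would be passed as:
        // table.scan(filterExpression(3), "Id, mainName, floatVariable, city",
        //            names(), values("xyz", "Rockingham"));
    }
}
```

This is still a Scan, so every item is read before the filter is applied; it only moves the comparison to the server.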

Alternatively, you can maintain a separate list of city names and use the "contains" operator on that attribute.
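A rough sketch of that approach follows. It assumes you add a top-level attribute, here called cityNames (a name chosen for this example), and keep it in sync whenever the item is written. The helper class CityNamesSketch is hypothetical as well.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: derives the denormalized "cityNames" attribute
// from the nested "city" list before the item is written.
final class CityNamesSketch {

    // Filter to use once every item carries a cityNames list.
    static final String FILTER = "mainName = :main AND contains(cityNames, :city)";

    // Collects the "name" of every entry in the item's "city" list.
    static List<String> cityNames(List<Map<String, Object>> cities) {
        List<String> names = new ArrayList<>();
        for (Map<String, Object> city : cities) {
            Object name = city.get("name");
            if (name != null) {
                names.add(name.toString());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        Map<String, Object> rockingham = new HashMap<>();
        rockingham.put("name", "Rockingham");
        List<Map<String, Object>> cities = new ArrayList<>();
        cities.add(rockingham);

        System.out.println(cityNames(cities)); // prints [Rockingham]
        // Store the result as the item's cityNames attribute, then scan with:
        // table.scan(FILTER, "Id, mainName, floatVariable, city", null, values);
        // where values maps ":main" to "xyz" and ":city" to "Rockingham".
    }
}
```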

If you are querying by city.name, your data model has to take that into consideration. I would suggest having one city per table item:

{
"Id": "9",
"mainName": "xyz",
"cityName": "Rockingham",
"floatVariable": 228.3,
"road": [
    {
      "roadName": "Traci",
      "roadLength": 118
    },
    {
      "roadName": "Watkins",
      "roadLength": 30
    }
  ]
}

The Hash Key would be the cityName attribute, and the Range Key any other attribute that makes the primary key (hash + range key) unique, for instance Id.

QuerySpec querySpec = new QuerySpec()
                        .withHashKey("cityName", "Rockingham")
                        .withProjectionExpression("Id, mainName, floatVariable, road");

ItemCollection<QueryOutcome> items = table.query(querySpec);
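Getting to that model means splitting each existing item into one item per city. The following is only a sketch of that transformation on plain Java maps, assuming the item shape from the question; PerCityItemsSketch and split are names invented for the example, and writing the results back to DynamoDB is left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: turns one item with a nested "city" list into
// one flat item per city, keyed by cityName.
final class PerCityItemsSketch {

    static List<Map<String, Object>> split(Map<String, Object> item) {
        List<Map<String, Object>> result = new ArrayList<>();
        Object cityAttr = item.get("city");
        if (!(cityAttr instanceof List)) {
            return result; // nothing to split
        }
        for (Object entry : (List<?>) cityAttr) {
            if (!(entry instanceof Map)) {
                continue; // skip malformed entries
            }
            Map<?, ?> city = (Map<?, ?>) entry;
            // Copy the top-level attributes, then replace the nested list
            // with this city's name and roads.
            Map<String, Object> flat = new LinkedHashMap<>(item);
            flat.remove("city");
            flat.put("cityName", city.get("name"));
            flat.put("road", city.get("road"));
            result.add(flat);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> city = new LinkedHashMap<>();
        city.put("name", "Rockingham");
        city.put("road", new ArrayList<Object>());
        List<Object> cities = new ArrayList<>();
        cities.add(city);

        Map<String, Object> item = new LinkedHashMap<>();
        item.put("Id", "9");
        item.put("mainName", "xyz");
        item.put("city", cities);

        System.out.println(split(item).size()); // prints 1
    }
}
```

To also restrict the results to mainName = "xyz", a filter expression on mainName could be added to the QuerySpec; the Query would still read only the items under the given cityName.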

As a second option, you could define two tables:

Table A

Primary Key Type: Hash Key + Range Key

Hash Key: cityName

Range Key: Id (reference to the Table B item)

{
    "cityName": "Rockingham",
    "Id": "9",
    "road": [
        {
          "roadName": "Traci",
          "roadLength": 118
        },
        {
          "roadName": "Watkins",
          "roadLength": 30
        }
    ]
}

Table B

Primary Key Type: Hash Key

Hash Key: Id

{
    "Id": "9",
    "mainName": "xyz",
    "floatVariable": 228.3
}

After retrieving the city items, you would look up the matching items in Table B by Id via Query, GetItem or BatchGetItem.
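If you go the BatchGetItem route, keep in mind that a single request accepts at most 100 keys, so the Ids collected from Table A need to be split into batches first. A minimal sketch of that step, with BatchChunkSketch and chunk as made-up names and the actual SDK call left as a comment:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: splits the Ids returned by the Table A query into
// groups small enough for one BatchGetItem request each.
final class BatchChunkSketch {

    // Documented BatchGetItem limit on keys per request.
    static final int MAX_KEYS_PER_BATCH = 100;

    static <T> List<List<T>> chunk(List<T> ids, int size) {
        if (size < 1) {
            throw new IllegalArgumentException("size must be >= 1");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < ids.size(); start += size) {
            int end = Math.min(start + size, ids.size());
            batches.add(new ArrayList<>(ids.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> ids = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            ids.add(String.valueOf(i));
        }
        List<List<String>> batches = chunk(ids, MAX_KEYS_PER_BATCH);
        System.out.println(batches.size()); // prints 3
        // Each batch would then go into one BatchGetItem request against
        // Table B; any unprocessed keys in the response need to be retried.
    }
}
```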

Both options allow you to use the Query operation instead of Scan, enabling simpler queries with better performance and lower cost:

A Scan operation always scans the entire table or secondary index, then filters out values to provide the desired result, essentially adding the extra step of removing data from the result set. Avoid using a Scan operation on a large table or index with a filter that removes many results, if possible. Also, as a table or index grows, the Scan operation slows. The Scan operation examines every item for the requested values, and can use up the provisioned throughput for a large table or index in a single operation. For faster response times, design your tables and indexes so that your applications can use Query instead of Scan. (For tables, you can also consider using the GetItem and BatchGetItem APIs.)

Source: http://docs.aws.amazon.com/amazondynamodb/latest/APIReference/API_Scan.html
