
Which performs better when querying 50 GB of data: a MySQL SELECT with a condition or a DynamoDB Scan with a FilterExpression?

I'm retrieving traffic data for a website using the Scan operation in DynamoDB, with a FilterExpression to narrow down the results. The scan will run against a large table holding more than 20 GB of data.

I found that DynamoDB scans through the entire table and only then filters the results. The documentation says a single call returns at most 1 MB of data, and I then have to loop to fetch the rest. That seems like a bad way to make this work. I got the reference from here: Dynamodb filter expression not returning all results
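For reference, the loop described in the documentation could be sketched roughly as follows. This is a minimal sketch, not the SDK's own helper: `scanAllPages` is a hypothetical name, and `$client` is assumed to be an `Aws\DynamoDb\DynamoDbClient` or anything else exposing a compatible `scan()` method. Each call is resumed from the `LastEvaluatedKey` of the previous response until DynamoDB stops returning one.

```php
<?php
// Hypothetical helper: yields every item of a Scan, one page at a time.
// $client is assumed to expose scan(array $params) like the AWS SDK client.
function scanAllPages($client, array $params): \Generator
{
    do {
        $result = $client->scan($params);

        // The 1 MB limit applies to data read before the filter runs,
        // so a page may contain few items or none at all.
        foreach ($result['Items'] ?? [] as $item) {
            yield $item;
        }

        // LastEvaluatedKey is present only while more data remains.
        $lastKey = $result['LastEvaluatedKey'] ?? null;
        if ($lastKey !== null) {
            $params['ExclusiveStartKey'] = $lastKey;
        }
    } while ($lastKey !== null);
}
```

Every page still consumes read capacity for all the data scanned, whether or not the filter keeps it, which is why this gets expensive on a table of this size. The AWS SDK for PHP also provides `getPaginator('Scan', $params)`, which performs the same paging.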

For a small table that should be fine.

MySQL does the same, I guess. I'm not sure.

Which is faster to read on a large data set: a MySQL SELECT or a DynamoDB Scan?

Is there any other alternative? What are your thoughts and suggestions?

I'm trying to migrate the traffic data into a DynamoDB table and then query it. That seems like a bad idea to me now.

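$params = [
    'TableName' => $tableName,
    'FilterExpression' => $this->filter . ' = :' . $this->filter . ' AND #dy > :since AND #dy < :now',
    // "day" is a DynamoDB reserved word, so it is aliased as #dy
    'ExpressionAttributeNames' => ['#dy' => 'day'],
    'ExpressionAttributeValues' => $eav,
];

var_dump($params);

try {
    $result = $dynamodb->scan($params);
} catch (\Aws\DynamoDb\Exception\DynamoDbException $e) {
    // Minimal handling; adapt as needed
    echo 'Scan failed: ' . $e->getMessage();
}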

After considering the suggestion, this is what worked for me:

$params = [
    'TableName' => $tableName,
    'IndexName' => self::GLOBAL_SECONDARY_INDEX_NAME,
    'ProjectionExpression' => '#dy, t_counter, traffic_type_id',
    // Key condition on the index: country as partition key, day as sort key
    'KeyConditionExpression' => 'country = :country AND #dy BETWEEN :since AND :to',
    // Applied after the key condition, on a non-key attribute
    'FilterExpression' => 'traffic_type_id = :traffic_type_id',
    'ExpressionAttributeNames' => ['#dy' => 'day'],
    'ExpressionAttributeValues' => $eav,
];
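The post does not show how `$eav` is built, so here is a minimal sketch of how the parameters above might be assembled for the low-level client. The helper name `buildTrafficQueryParams` is hypothetical, and the attribute types are assumptions: `country` and `day` stored as strings (with `day` in a sortable form such as `YYYY-MM-DD`), and `traffic_type_id` stored as a number. The low-level API expects each value wrapped in a type descriptor, with numbers sent as strings.

```php
<?php
// Hypothetical helper: builds the Query parameters for the index lookup.
// Assumed types (not confirmed by the post): country and day are strings,
// traffic_type_id is a number, and the index is keyed on (country, day).
function buildTrafficQueryParams(
    string $tableName,
    string $indexName,
    string $country,
    string $since,
    string $to,
    int $trafficTypeId
): array {
    return [
        'TableName' => $tableName,
        'IndexName' => $indexName,
        'ProjectionExpression' => '#dy, t_counter, traffic_type_id',
        'KeyConditionExpression' => 'country = :country AND #dy BETWEEN :since AND :to',
        'FilterExpression' => 'traffic_type_id = :traffic_type_id',
        'ExpressionAttributeNames' => ['#dy' => 'day'],
        'ExpressionAttributeValues' => [
            ':country' => ['S' => $country],
            ':since' => ['S' => $since],
            ':to' => ['S' => $to],
            // DynamoDB numbers travel as strings in the low-level format
            ':traffic_type_id' => ['N' => (string) $trafficTypeId],
        ],
    ];
}
```

The resulting array would be passed to `$dynamodb->query(...)`. Unlike a Scan, the key condition limits the data read to one country and date range, and the filter is applied only to that slice. A Query response is also capped at 1 MB per call, so the `LastEvaluatedKey` loop still applies to large ranges.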

If your data is shaped like key-value pairs and you query on a fixed set of fields, use DynamoDB: you can create indexes on the fields you want to query, and it will work well.

If you require complex querying across multiple indexes, any RDBMS is a good fit.

If you need to query on just about anything, consider Elasticsearch.

If your queries are very simple but each one retrieves a large amount of data, consider S3. You could index the metadata in DynamoDB and keep the actual data in S3.
