I have 10 million CSV records and I want to import them into DynamoDB. Could anyone please help with this? Also, is it possible to import tab-separated values as well?
Thanks in advance.
Convert your CSV rows into JSON-style items and use the DynamoDB BatchWriteItem API.
Make sure each item includes the table's primary key attributes.
import csv
import boto3

def convert_csv_to_json_list(file):
    items = []
    with open(file) as csvfile:
        reader = csv.DictReader(csvfile)
        for row in reader:
            data = {}
            data['temp'] = row['temp']
            # populate remaining fields here
            # ...
            items.append(data)
    return items

def batch_write(items):
    dynamodb = boto3.resource('dynamodb')
    db = dynamodb.Table('table-name')
    with db.batch_writer() as batch:
        for item in items:
            batch.put_item(Item=item)

if __name__ == '__main__':
    json_data = convert_csv_to_json_list('file')
    batch_write(json_data)
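To answer the tab-separated part of the question: the csv module handles TSV files too, the only change is passing delimiter='\t' to csv.DictReader. A minimal sketch (the function name and file argument are placeholders, not from the answer above):

```python
import csv

def convert_tsv_to_json_list(file):
    """Same idea as the CSV version, but fields are split on tabs."""
    items = []
    with open(file, newline='') as tsvfile:
        reader = csv.DictReader(tsvfile, delimiter='\t')
        for row in reader:
            items.append(dict(row))  # take every column from the header as-is
    return items
```

The resulting list of dicts can be passed to the same batch_write function unchanged.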
Use the DynamoDB BatchWriteItem API to perform batch inserts.
Iterate over the file contents and insert the rows in batches.
import boto3

dynamodb = boto3.resource('dynamodb')
table = dynamodb.Table('sample_table')

with table.batch_writer() as batch:
    for i in range(50):
        batch.put_item(
            Item={
                'ORDERNO': 'dummy',
                'DIRECTION': 'dummy',
                'LATITUDE': 'dummy',
                'LONGITUDE': 'dummy'
            }
        )
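One caution for 10 million rows: building the full item list in memory first is wasteful. A generator lets batch_writer stream rows straight from disk, and batch_writer already buffers items into 25-item BatchWriteItem calls and retries unprocessed items for you. A sketch under assumed names (the file path and table name are placeholders; the CSV header is assumed to contain the table's key attributes):

```python
import csv

def iter_csv_rows(path):
    """Yield one dict per CSV row without loading the whole file into memory."""
    with open(path, newline='') as csvfile:
        for row in csv.DictReader(csvfile):
            yield dict(row)

def stream_to_dynamodb(path, table_name):
    # boto3 imported here so iter_csv_rows stays usable without AWS credentials
    import boto3
    table = boto3.resource('dynamodb').Table(table_name)
    with table.batch_writer() as batch:
        for item in iter_csv_rows(path):
            batch.put_item(Item=item)
```

With provisioned capacity, also make sure the table's write capacity (or on-demand mode) can absorb a sustained bulk load of this size.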
Not a great approach, but it requires no coding: AWS Data Pipeline has templates for migrating data between AWS services. For DynamoDB, however, its import template can only load DynamoDB backup data, not CSV, so it is possible but not a straightforward option here.