
Query execution time with small batches vs entire input set

I'm using ArangoDB 3.9.2 for a search task. The dataset contains 100,000 items. When I pass the entire dataset as one input list to the engine, the execution time is around ~10 seconds, which is pretty quick. But if I pass the dataset in small batches of 100 items each, the execution time grows rapidly: processing the full dataset takes about ~2 minutes. Could you please explain why this happens? The dataset is the same.

I'm using the ArangoClient Python driver from the python-arango library, version 0.2.1.

PS: I had a similar problem with Neo4j, which I solved by committing transactions via the HTTP API. Does ArangoDB have something similar?
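
To make it concrete, the two call patterns I'm comparing look roughly like this. This is only a simplified sketch: the collection name, query, and keys are placeholders, and it uses the current python-arango ArangoClient API rather than my exact code.

```python
from arango import ArangoClient  # python-arango

client = ArangoClient(hosts="http://localhost:8529")
db = client.db("_system", username="root", password="")

items = [f"key_{i}" for i in range(100_000)]  # stand-in for the real dataset

QUERY = (
    "FOR item IN @items "
    "  FOR doc IN my_collection "
    "    FILTER doc.key == item "
    "    RETURN doc"
)

# Variant 1: pass the whole dataset in a single call (~10 s for me)
list(db.aql.execute(QUERY, bind_vars={"items": items}))

# Variant 2: the same data, 100 items per call (~2 min total for me)
for i in range(0, len(items), 100):
    list(db.aql.execute(QUERY, bind_vars={"items": items[i:i + 100]}))
```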

Every time you make a call to a remote system (Neo4j, ArangoDB, or any other database), there is overhead in making the connection, sending the data, and then, after executing your command, tearing down the connection.

What you're doing is trying to find the 'sweet spot' for your implementation: the most efficient batch size for the type of data you are sending, the complexity of your query, the performance of your hardware, and so on.

What I recommend doing is writing a test script that sends the data in varying batch sizes to help you determine the optimal settings for your use case.

I have taken this approach with many systems I've designed, and the optimal batch size has been unique to each implementation. It depends entirely on what you are doing.

See what results you get for the overall load time if you use batch sizes of 100, 1000, 2000, 5000, and 10000.
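
For example, a minimal timing harness along these lines could look like the sketch below. It assumes the current python-arango ArangoClient API; the query, collection name, dataset, and connection details are placeholders you would replace with your own.

```python
import time

from arango import ArangoClient  # python-arango

client = ArangoClient(hosts="http://localhost:8529")
db = client.db("_system", username="root", password="")

items = [f"key_{i}" for i in range(100_000)]  # stand-in for the real dataset

QUERY = (
    "FOR item IN @items "
    "  FOR doc IN my_collection "
    "    FILTER doc.key == item "
    "    RETURN doc"
)

for batch_size in (100, 1000, 2000, 5000, 10000):
    start = time.perf_counter()
    for i in range(0, len(items), batch_size):
        # Each call pays the round-trip and serialization overhead once,
        # so fewer, larger batches amortize that cost over more items.
        cursor = db.aql.execute(QUERY, bind_vars={"items": items[i:i + batch_size]})
        list(cursor)  # drain the cursor so the full round trip is measured
    elapsed = time.perf_counter() - start
    print(f"batch_size={batch_size}: {elapsed:.1f} s total")
```

Run it against a realistic copy of your data; the output shows directly how much per-call overhead you are paying at each batch size.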

This way you'll work out the best answer for you.
