What is the fastest way to get all documents of a collection?

I want to get all documents of a collection that contains about 1 million documents. What is the fastest way to do this: with a cursor or with .all? And are there any recommendations for the batch_size?

cursor

from arango import ArangoClient

# Initialize the ArangoDB client.
client = ArangoClient()

# Connect to the database as the given user.
db = client.db(<db>, username=<username>, password=<password>)

cursor = db.aql.execute('FOR doc IN <Collection> RETURN doc', stream=True, ttl=3600, batch_size=<batchSize>)
collection =  [doc for doc in cursor]

.all - with custom HTTP client

from arango import ArangoClient
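from arango.http import DefaultHTTPClient

# HTTPClient is an abstract base class, so subclass DefaultHTTPClient to change only the timeout.
class MyCustomHTTPClient(DefaultHTTPClient):
    # Timeout in seconds; newer python-arango releases may expect this as a constructor argument instead.
    REQUEST_TIMEOUT = 1000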

# Initialize the ArangoDB client.
client = ArangoClient(
    http_client=MyCustomHTTPClient())

# Connect to the database as the given user.
db = client.db(<db>, username=<username>, password=<password>)

collec = db.collection('<Collection>')
collection = collec.all()

If you want all documents in memory, then .all will be the fastest, because it uses the library's own optimized method for retrieving all results.

If you can process each document as it comes in, then the cursor is the best way to do it, because it avoids the memory overhead of holding the whole collection at once.
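As a minimal sketch of processing documents as they arrive, the function below walks any iterable of documents and keeps only a running total, so memory use does not grow with the collection. The `size` field and the `fake_cursor` generator are hypothetical stand-ins so the example runs without a server.

```python
from typing import Iterable, Iterator


def total_size(docs: Iterable[dict]) -> int:
    """Sum a hypothetical 'size' field while holding one document at a time."""
    total = 0
    for doc in docs:
        # Each document can be discarded as soon as it has been handled.
        total += doc.get("size", 0)
    return total


def fake_cursor(n: int) -> Iterator[dict]:
    """Hypothetical stand-in for a database cursor: yields n small documents."""
    for i in range(n):
        yield {"_key": str(i), "size": i}


if __name__ == "__main__":
    print(total_size(fake_cursor(5)))  # 0 + 1 + 2 + 3 + 4 = 10
```

With a real connection, the cursor returned by db.aql.execute(...) in the question could presumably be passed in place of fake_cursor(...), since it is iterated the same way.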

But the best way to decide is to run tests and measure the timing, because many factors can affect the speed, such as the connection type and speed to the DB, the amount of memory in your computer, etc. The examples you gave look simple enough that such measurements can be done quickly.
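A rough way to take such a measurement is sketched below. `time_fetch` is a hypothetical helper, not part of python-arango: it calls a function that returns an iterable of documents, consumes every document, and reports the count and elapsed time. `fake_fetch` is a stand-in so the sketch runs without a database.

```python
import time
from typing import Callable, Iterable, Tuple


def time_fetch(fetch: Callable[[], Iterable[dict]]) -> Tuple[int, float]:
    """Call fetch(), consume every document, and return (count, seconds)."""
    start = time.perf_counter()
    count = sum(1 for _ in fetch())
    elapsed = time.perf_counter() - start
    return count, elapsed


def fake_fetch() -> Iterable[dict]:
    # Hypothetical stand-in for a database call.
    return ({"_key": str(i)} for i in range(10_000))


if __name__ == "__main__":
    count, seconds = time_fetch(fake_fetch)
    print(f"fetched {count} documents in {seconds:.4f}s")
```

To compare the two approaches from the question, one could pass something like lambda: db.aql.execute(...) and lambda: collec.all() as fetch, repeat each run several times, and try a few batch_size values, since a single run can be skewed by caching or network noise.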
