
Does search latency increase with the document size?

Does the search latency increase when data keeps growing within a document type? Since we don't directly manage shard configuration in Vespa, how does Vespa manage it?

Is creating multiple document types a good practice for handling scaling requirements?

Vespa distributes documents evenly (using the CRUSH algorithm) over the available nodes in the content cluster. If you add (or remove) nodes in a cluster, Vespa will automatically redistribute the documents in the background.
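For reference, document distribution is configured implicitly by the content cluster definition in services.xml: you only list the nodes, and Vespa handles bucket placement and redistribution itself. Below is a minimal sketch; the cluster id, document type name, and host aliases are placeholder values, not taken from the question:

    <?xml version="1.0" encoding="utf-8"?>
    <services version="1.0">
      <container id="default" version="1.0">
        <search/>
        <document-api/>
        <nodes>
          <node hostalias="node1"/>
        </nodes>
      </container>
      <!-- One content cluster holding the document type; Vespa decides
           which documents (buckets) live on which node. -->
      <content id="mycluster" version="1.0">
        <redundancy>2</redundancy>
        <documents>
          <document type="mydoc" mode="index"/>
        </documents>
        <nodes>
          <node hostalias="node1" distribution-key="0"/>
          <node hostalias="node2" distribution-key="1"/>
        </nodes>
      </content>
    </services>

Scaling out is then just adding another node element (e.g. node3 with distribution-key 2) and redeploying; Vespa migrates buckets to the new node in the background while serving queries.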

Typically, latency is proportional to the number of documents per content node, so adding more content nodes reduces latency. You can do that at any point while in production.
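As a rough illustration of that proportionality (the numbers are made up for the example): with 30 million documents on 3 content nodes, each node searches about 10 million documents per query; growing to 6 nodes brings that down to about 5 million per node, roughly halving the document-volume-dependent part of query latency. The actual effect depends on the query mix, ranking cost, and fixed per-query overhead; see the sizing guide linked below.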

As you can see from this, you never want to add more search definitions (schemas) just to scale.

See https://docs.vespa.ai/documentation/performance/sizing-search.html. Yes, generally, if your queries are text queries, latency increases with document volume for a fixed number of nodes. Vespa allows live redistribution of data, so adding new nodes will rebalance the latency.
