Slow performance when querying remote db - is this normal?

I am loading some data from an Oracle db, using ODP.NET.

In its current implementation the code does something along the lines of:

query entityIds to load based on criteria  
    foreach entityId
        load attributes 
        query geometries that exist
        foreach geometry that exists
            load geometry
        next
    next

When the DB is on the local network, a set of criteria that matches 133 entities takes a couple of seconds to load all 133 entities.

When the db is a remote db hosted on a VM in a data center on the other side of the world, the same load takes about 3.5 minutes.

The particularly slow bit seems to be querying the geometry. In initial testing (in TOAD, not in the service loading code) it seems to take about 2 secs to load the geometry for a single entity from the remote machine. If we change the query to load all the geometries in a single go, it still seems to take 2 secs. This sort of implies that it isn't the network overhead (the query which returns all the geometries returns much more data, but the time is the same).

Is this sort of performance overhead for a remote db vs a local one expected? Why does doing each query separately take so much longer than doing them all in one go? Is there anything we can do to mitigate this (apart from doing all the queries in one go)?

You're probably running into the distinction between bandwidth and latency.

Latency is the time taken for a single round-trip, whereas bandwidth is the amount of data that can flow through over a given time period (e.g. 1 second).

If you're running 200 queries (from client-side code, not from a stored proc), then no matter how much data goes in each query, you will get 200 round-trips.

Normal latency for the other end of the world is around half a second, I believe, so for 200 entities retrieved separately that comes to about 100 seconds.

Those numbers don't quite match yours, so there may be even higher latency (it depends on all sorts of network factors). I would normally look for query/lookup overhead on the database server (an indexing issue, say), but you've already mentioned that locally there is no significant overhead (presumably with the same data?).
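As a rough sanity check on those figures, here is a back-of-the-envelope sketch in Python. It assumes each query in the question's loop costs exactly one round-trip and that every entity has exactly one geometry; neither is stated in the question, and the function names are made up for illustration.

```python
def loop_round_trips(entities: int, geometries_per_entity: int) -> int:
    """Round-trips for the per-entity loop, assuming one round-trip per query."""
    id_query = 1  # "query entityIds to load based on criteria"
    # Per entity: load attributes + query which geometries exist + one load per geometry.
    per_entity = 1 + 1 + geometries_per_entity
    return id_query + entities * per_entity


def implied_round_trip_seconds(total_seconds: float, round_trips: int) -> float:
    """Average time per round-trip if latency were the only cost."""
    return total_seconds / round_trips


# 133 entities and 3.5 minutes come from the question; 1 geometry per entity is assumed.
trips = loop_round_trips(entities=133, geometries_per_entity=1)
per_trip = implied_round_trip_seconds(total_seconds=3.5 * 60, round_trips=trips)
print(trips, round(per_trip, 3))  # prints: 400 0.525
```

Under those assumptions the loop makes about 400 round-trips, which would put each one at roughly half a second. More geometries per entity, or more than one round-trip per query, would lower the implied per-trip latency.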

You're noticing network latency.

There needs to be at least one round-trip with the server each time you make a query.
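Since the cost is per query, the usual way to cut it is to ask for many entities in one statement. The Python sketch below only builds the SQL text and bind values and does not call any database driver. The table and column names (`entity_geometry`, `entity_id`, `geometry`) are hypothetical, and the chunk size of 1000 is an assumption based on the IN-list limit Oracle has traditionally enforced (ORA-01795); check the limit for your version.

```python
from typing import Dict, Iterator, Sequence, Tuple

MAX_IN_LIST = 1000  # assumed cap on expressions in one IN list


def batched_geometry_queries(
    entity_ids: Sequence[int], chunk_size: int = MAX_IN_LIST
) -> Iterator[Tuple[str, Dict[str, int]]]:
    """Yield (sql, binds) pairs, one per chunk of ids, using named bind variables."""
    for start in range(0, len(entity_ids), chunk_size):
        chunk = entity_ids[start:start + chunk_size]
        # One named bind per id: {"id0": 1, "id1": 2, ...}
        binds = {f"id{i}": entity_id for i, entity_id in enumerate(chunk)}
        placeholders = ", ".join(f":{name}" for name in binds)
        sql = (
            "SELECT entity_id, geometry "  # hypothetical columns
            "FROM entity_geometry "        # hypothetical table
            f"WHERE entity_id IN ({placeholders})"
        )
        yield sql, binds


# 133 ids fit in a single statement, so one query replaces 133.
queries = list(batched_geometry_queries(list(range(1, 134))))
print(len(queries))  # prints: 1
```

The same idea carries over to ODP.NET by adding one parameter per id to the command. The rows then need to be grouped by `entity_id` on the client.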

If the server is "far", i.e. 500ms ping, that means at minimum one second of latency per query. This is incompressible: even if the query returns no rows, that 1s hit will happen.

The bandwidth of the network is an unrelated characteristic. If your bandwidth is high, you won't notice a big difference between transferring a large dataset and a small one. But both will still suffer the latency hit in exactly the same way.
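A toy cost model makes the point concrete. This is only a sketch: the 1 s per-query latency is the figure used above, while the 10 MB/s bandwidth and 50 KB per geometry are invented numbers chosen for illustration.

```python
def query_time(payload_bytes: int, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Time for one query: a fixed latency hit plus the transfer time for the payload."""
    return latency_s + payload_bytes / bandwidth_bytes_per_s


LATENCY_S = 1.0          # per-query latency, as in the 500ms ping example
BANDWIDTH = 10_000_000   # assumed: 10 MB/s
GEOMETRY_BYTES = 50_000  # assumed: 50 KB per geometry
ENTITIES = 133           # from the question

# One query per entity: the latency hit is paid 133 times.
one_by_one = sum(
    query_time(GEOMETRY_BYTES, LATENCY_S, BANDWIDTH) for _ in range(ENTITIES)
)

# One query for everything: 133 times the data, but a single latency hit.
batched = query_time(ENTITIES * GEOMETRY_BYTES, LATENCY_S, BANDWIDTH)

print(round(one_by_one, 3), round(batched, 3))
```

With these numbers the per-entity loop takes over two minutes while the single large query takes under two seconds, even though both move the same amount of data. That matches the observation in the question that fetching one geometry and fetching all of them took about the same time.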

I found this article has interesting (if dated) information: It's the Latency, Stupid.
