Using pre-downsampled data when plotting large time series in PyQtGraph

Question

I need to plot a large time series in PyQtGraph (millions of points). Plotting it as is is practically impossible, and even with the optimization options turned on (downsampling via setDownsampling and clipping via setClipToView) it is still barely usable when zoomed out; it only becomes fast when zoomed in, thanks to clipping.

I have an idea, though. Since my data are static, I could pre-downsample them. Then I could use the cached downsampled data when zoomed out and the raw data when zoomed in.

How can I achieve that?

Answer 1

I've done something like this in a project I work on called runviewer. The general idea is to resample the data whenever the x-range of the plot changes. The approximate method we use is:

Connect a method to the sigXRangeChanged signal of the PlotWidget that sets a boolean flag indicating the data needs to be resampled.
Start a thread that polls the boolean flag every x seconds (we chose 0.5 seconds) to see whether the data needs resampling. If it does, the data is resampled using an algorithm of your choice (we wrote our own in C). The result is then posted back to the main thread (e.g. use a QThread and emit a signal back to the main thread), where a call to pyqtgraph updates the data in the plot (note: you can only call pyqtgraph methods from the main thread!).

We use the boolean flag to decouple the x-range change events from the resampling. You don't want to resample every time the x-range changes: the signal fires many, many times when you zoom with the mouse, and because resampling is slow, even in C, you don't want to build up a queue of resample calls!

You also need to make sure that your resample thread sets the boolean flag to False as soon as it detects it to be True, and only then runs the resampling algorithm. This way, any x-range change events that arrive during the current resampling trigger a subsequent resampling.
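The flag-and-poll scheme described above can be sketched with the standard library alone. This is a minimal illustration, not the runviewer code: the class name PollingResampler and the resample/publish callables are hypothetical, and in a Qt application publish would be something that emits a signal back to the main thread rather than touching the plot directly.

```python
import threading


class PollingResampler:
    """Dirty-flag resampler sketch; every name here is hypothetical."""

    def __init__(self, resample, publish, poll_interval=0.5):
        self._resample = resample        # slow function, runs off the UI thread
        self._publish = publish          # hands the result back to the UI thread
        self._poll_interval = poll_interval
        self._dirty = False
        self._lock = threading.Lock()
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def stop(self):
        self._stop.set()
        self._thread.join()

    def mark_dirty(self, *args):
        # Cheap slot for the range-changed signal: it only records that
        # work is pending and ignores whatever arguments the signal sends.
        with self._lock:
            self._dirty = True

    def _run(self):
        # Event.wait doubles as an interruptible sleep between polls.
        while not self._stop.wait(self._poll_interval):
            with self._lock:
                pending = self._dirty
                # Clear the flag BEFORE resampling so that range changes
                # arriving during the resample trigger another pass.
                self._dirty = False
            if pending:
                self._publish(self._resample())
```

However many times mark_dirty is called while a resample is running, only one further resample follows, which is the decoupling the answer is after.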

You could probably also improve this by using some sort of threading Event/Condition instead of polling a flag.
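One way that Event-based variant might look is sketched below, again using only the standard library. The names EventResampler, request, resample and publish are made up for illustration; the worker sleeps until a request arrives instead of waking on a timer.

```python
import threading


class EventResampler:
    """Event-driven resampler sketch; every name here is hypothetical."""

    def __init__(self, resample, publish):
        self._resample = resample
        self._publish = publish
        self._pending = threading.Event()
        self._stopping = False
        self._thread = threading.Thread(target=self._run, daemon=True)

    def start(self):
        self._thread.start()

    def request(self, *args):
        # Setting an Event that is already set changes nothing, so a
        # burst of range-changed signals collapses into one request.
        self._pending.set()

    def stop(self):
        self._stopping = True
        self._pending.set()              # wake the worker so it can exit
        self._thread.join()

    def _run(self):
        while True:
            self._pending.wait()         # blocks until a request arrives
            if self._stopping:
                return
            # Clear before resampling, for the same reason as with the flag:
            # requests made during the resample must cause another pass.
            self._pending.clear()
            self._publish(self._resample())
```

Unlike the polling version, this one has no built-in delay between passes, so the rate of resampling is limited only by how long each resample takes.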

Note that resampling in Python is really, really slow, which is why we chose to write the resampling algorithm in C and call that from Python. numpy is mostly written in C, so it will be fast. However, I don't think it had a feature-preserving resampling algorithm. Most resampling people do is just standard downsampling, where you take every Nth point, but we wanted to still be able to see the presence of features smaller than the sampling size when zoomed out.
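To illustrate what "feature preserving" can mean, here is a sketch of one common approach: keep the minimum and maximum of every bucket rather than every Nth point. This is not the runviewer algorithm (which is in C and not shown in this answer); the function name minmax_downsample is hypothetical, and the plain-Python loop is for readability only, so it will be slow on millions of points.

```python
def minmax_downsample(xs, ys, bucket_size):
    """Keep the lowest and highest sample of each bucket, in x order."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be at least 1")
    out_x, out_y = [], []
    for start in range(0, len(ys), bucket_size):
        stop = min(start + bucket_size, len(ys))
        bucket = range(start, stop)
        lo = min(bucket, key=lambda i: ys[i])   # index of the bucket minimum
        hi = max(bucket, key=lambda i: ys[i])   # index of the bucket maximum
        # Emit the two extremes in their original order; a flat bucket
        # (lo == hi) contributes a single point.
        for i in sorted({lo, hi}):
            out_x.append(xs[i])
            out_y.append(ys[i])
    return out_x, out_y
```

With a bucket size of 50, a lone spike at index 37 in otherwise flat data is kept, whereas taking every 50th point would drop it.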

Additional comments on performance

I suspect that part of the performance problem with pyqtgraph's built-in method is that the downsampling is done in the main thread, so it has to complete before the graph becomes responsive to user input again. Our method avoids that. Our approach also limits how often the downsampling occurs: at most once every (the length of time it takes to downsample + the poll delay) seconds. So with the delay we use, we only downsample every 0.5-1 second while keeping the main thread (and thus the UI) responsive. It does mean that the user might see coarsely sampled data if they zoom in quickly, but that is corrected in at most 2 iterations of resampling (so at most 1-2 seconds delay). Also, because the correction takes only a short time, the update/redraw with the newly sampled data often happens after the user has finished interacting with the UI, so they don't notice any unresponsiveness during the redraw.

Obviously, the times I'm quoting depend entirely on the speed of the resampling and the poll delay!

Answer 2

The answer by @three_pineapples describes a really nice improvement over the default downsampling in PyQtGraph, but it still requires performing the downsampling on the fly, which in my case is problematic.

Therefore, I decided to implement a different strategy: pre-downsample the data, and then select either the already downsampled data or the original data depending on the "zoom level".

I combine that approach with the default auto-downsample strategy employed natively by PyQtGraph to yield further speed improvements (which could be improved further with @three_pineapples' suggestions).

This way, PyQtGraph always starts from a much smaller dataset, which makes zooming and panning instantaneous even with a really large number of samples.

My approach is summarized in this code, which monkey-patches the getData method of PlotDataItem.

import types

from pyqtgraph import PlotDataItem

# Pre-downsample the static data once. `data` is an (N, 2) array of evenly
# spaced x values and their y values; downsample() is not shown here.
downsampled_data = downsample(data, 100)

# Replacement for the default getData function
def getData(obj):
    # Fall back to the downsampled data until the visible range is known
    use_downsampled = True
    # Calculate the visible range, expressed in raw sample indices
    view_rect = obj.viewRect()
    if view_rect is not None:
        dx = float(data[-1, 0] - data[0, 0]) / (data.shape[0] - 1)
        x0 = (view_rect.left() - data[0, 0]) / dx
        x1 = (view_rect.right() - data[0, 0]) / dx
        use_downsampled = (x1 - x0) > 20000
    # Decide whether to use downsampled or original data
    if use_downsampled:
        obj.xData = downsampled_data[:, 0]
        obj.yData = downsampled_data[:, 1]
    else:
        obj.xData = data[:, 0]
        obj.yData = data[:, 1]
    # Run the original getData of PlotDataItem
    return PlotDataItem.getData(obj)

# Replace the original getData with our getData
plot_data_item.getData = types.MethodType(getData, plot_data_item)
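The switching rule in getData boils down to a bit of arithmetic: convert the visible x-range into a number of raw samples and compare it with a threshold. The sketch below isolates that calculation so it can be checked with concrete numbers. The function names are hypothetical, and it assumes evenly spaced x values, just as the code above does.

```python
def visible_sample_count(view_left, view_right, x_first, x_last, n_samples):
    """Number of raw samples spanned by the visible x-range.

    Assumes the x values are evenly spaced between x_first and x_last.
    """
    dx = (x_last - x_first) / (n_samples - 1)    # spacing between samples
    return (view_right - view_left) / dx


def use_downsampled(view_left, view_right, x_first, x_last, n_samples,
                    threshold=20000):
    """True when the view covers more raw samples than the threshold."""
    count = visible_sample_count(view_left, view_right,
                                 x_first, x_last, n_samples)
    return count > threshold
```

For example, with 1,000,001 samples spanning x = 0 to 1000 (a spacing of 0.001), a view from 0 to 100 covers about 100,000 samples and would use the downsampled data, while a view from 10 to 15 covers about 5,000 and would use the raw data.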

2 answers

Answer 1 (3): 2015-05-28 23:46:05

Answer 2 (1, accepted): 2015-05-29 20:36:06
