Using pre-downsampled data when plotting large time series in PyQtGraph

Question

I need to plot a large time series (millions of points) in PyQtGraph. Plotting it as is is practically impossible, and even with the optimization options turned on (downsampling via setDownsampling and clipping via setClipToView) it is barely usable when zoomed out; it only becomes fast when zoomed in, thanks to clipping.

I have an idea, though. Since my data is static, I could pre-downsample it once, then use the cached downsampled data when zoomed out and the raw data when zoomed in.

How can I achieve that?

Answer 1

I've done something like this in a project I work on called runviewer. The general idea is to resample the data whenever the x-range of the plot changes. The approximate method we use is:

Connect a method to the sigXRangeChanged signal of the PlotWidget which sets a boolean flag indicating that the data needs to be resampled.
Start a thread which polls the boolean flag at a fixed interval (we chose 0.5 seconds) to see if the data needs to be resampled. If it does, the data is resampled using an algorithm of your choice (we wrote our own in C). The result is then posted back to the main thread (e.g. use a QThread and emit a signal back to the main thread), where a call to pyqtgraph is made to update the data in the plot (note that you can only call pyqtgraph methods from the main thread!).

We use the boolean flag to decouple the x-range change events from the resampling. You don't want to resample every time the x-range changes: the signal is fired many, many times when you zoom with the mouse, and you don't want to build up a queue of resample calls because resampling is slow, even in C!

You also need to make sure your resample thread immediately sets the boolean flag to False if it detects it to be True, and then runs the resampling algorithm. This is so that subsequent x-range change events during the current resampling result in a subsequent resampling.

You could also probably improve this by not polling a flag, but using some sort of threading Event/Condition.
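As a rough illustration of the Event-based variant, here is a minimal sketch using only the standard library. It is not the runviewer code: the ResampleWorker class and the resample and on_result callables are hypothetical names, and in a Qt application on_result would emit a signal so that the plot is updated from the main thread.

```python
import threading
import time


class ResampleWorker:
    """Runs resample(x_range) in a background thread, coalescing bursts of requests."""

    def __init__(self, resample, on_result):
        self._resample = resample    # slow function: x_range -> resampled data
        self._on_result = on_result  # receives the result (in Qt: emit a signal here)
        self._dirty = threading.Event()
        self._stop = threading.Event()
        self._lock = threading.Lock()
        self._x_range = None
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def request(self, x_range):
        """Call this from the range-changed handler; it is cheap and never blocks for long."""
        with self._lock:
            self._x_range = x_range  # only the most recent range matters
        self._dirty.set()

    def stop(self):
        self._stop.set()
        self._dirty.set()            # wake the worker so it can exit
        self._thread.join()

    def _run(self):
        while True:
            self._dirty.wait()       # sleep until a request arrives (no polling)
            if self._stop.is_set():
                return
            # Clear the flag *before* resampling, so that requests arriving
            # while we are busy trigger another pass afterwards.
            self._dirty.clear()
            with self._lock:
                x_range = self._x_range
            self._on_result(self._resample(x_range))
```

Calling request() a hundred times in quick succession results in only a handful of resample calls, and the last one always uses the most recent range.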

Note that resampling in pure Python is really, really slow, which is why we chose to write the resampling algorithm in C and call it from Python. numpy is mostly written in C, so it would be fast too, but I don't think it had a feature-preserving resampling algorithm. Most resampling people do is just standard downsampling, where you take every Nth point, but we wanted to still be able to see the presence of features smaller than the sampling interval when zoomed out.
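For illustration only, one common way to keep small features visible is to keep the minimum and maximum of each bin instead of every Nth point. The sketch below does that with numpy. It is not the C algorithm mentioned above; the function name minmax_downsample and the (N, 2) layout of (x, y) rows are assumptions.

```python
import numpy as np


def minmax_downsample(data, factor):
    """Reduce an (N, 2) array of (x, y) rows by keeping each bin's min and max y.

    Every `factor` consecutive samples collapse to two points, so a narrow
    spike still shows up in the reduced trace. Trailing samples that do not
    fill a whole bin are dropped for simplicity.
    """
    n_bins = data.shape[0] // factor
    x = data[: n_bins * factor, 0].reshape(n_bins, factor)
    y = data[: n_bins * factor, 1].reshape(n_bins, factor)

    # Positions of the two extremes inside each bin, sorted so x stays non-decreasing
    idx = np.sort(np.stack([y.argmin(axis=1), y.argmax(axis=1)], axis=1), axis=1)
    rows = np.arange(n_bins)[:, None]
    return np.stack([x[rows, idx].ravel(), y[rows, idx].ravel()], axis=1)
```

With a factor of 100, a single-sample spike survives the reduction, whereas taking every 100th point would usually miss it.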

Additional comments on performance

I suspect that part of the performance problem with pyqtgraph's built-in method is that the downsampling is done in the main thread, so it has to complete before the graph becomes responsive to user input again. Our method avoids that. Our approach also limits how often downsampling occurs: at most once per (time taken to downsample + poll delay). So with the delay we use, we only downsample every 0.5-1 second while keeping the main thread (and thus the UI) responsive. It does mean that the user might see coarsely sampled data if they zoom in quickly, but that is corrected within at most 2 iterations of resampling (so at most a 1-2 second delay). Also, because the correction takes only a short time, the redraw with the newly sampled data often happens after the user has finished interacting with the UI, so they don't notice any unresponsiveness during the redraw.

Obviously, the times I'm quoting depend entirely on the speed of the resampling and the poll delay!

Answer 2

The answer by @three_pineapples describes a really nice improvement over the default downsampling in PyQtGraph, but it still requires performing the downsampling on the fly, which in my case is problematic.

Therefore, I decided to implement a different strategy, namely, pre-downsample the data and then select either the already downsampled data or the original data depending on the "zoom level".

I combine that approach with the default auto-downsample strategy employed natively by PyQtGraph to yield further speed improvements (which could be improved further with @three_pineapples' suggestions).

This way, PyQtGraph always starts with data of much lower dimensionality, which makes zooming and panning instantaneous even with a really large amount of samples.

My approach is summarized in this code, which monkey patches the getData method of PlotDataItem.

import types

from pyqtgraph import PlotDataItem

# The code assumes `data` is an (N, 2) numpy array of (x, y) rows sorted by x.
# `downsample` and `plot_data_item` are defined elsewhere (not shown here).

# Downsample the data once, up front
downsampled_data = downsample(data, 100)

# Replacement for the default getData function
def getData(obj):
    # Fall back to the downsampled data until the visible range is known
    use_downsampled = True
    # Calculate the visible range, expressed in samples of the original data
    view = obj.viewRect()
    if view is not None:
        dx = float(data[-1, 0] - data[0, 0]) / (data.shape[0] - 1)
        x0 = (view.left() - data[0, 0]) / dx
        x1 = (view.right() - data[0, 0]) / dx
        use_downsampled = (x1 - x0) > 20000
    # Decide whether to use downsampled or original data
    source = downsampled_data if use_downsampled else data
    obj.xData = source[:, 0]
    obj.yData = source[:, 1]
    # Run the original getData of PlotDataItem
    return PlotDataItem.getData(obj)

# Replace the original getData with our getData
plot_data_item.getData = types.MethodType(getData, plot_data_item)
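The same selection logic can be written as a plain function and, if two levels turn out not to be enough, extended to several pre-computed levels. The following is only a sketch under assumptions: build_levels and pick_level are hypothetical helpers, plain striding stands in for whatever downsampler is actually used, and the 20000-point budget simply mirrors the threshold in the code above.

```python
import numpy as np


def build_levels(data, factors=(1, 10, 100, 1000)):
    """Precompute one (N, 2) array per reduction factor; factor 1 is the raw data.

    Plain striding keeps the sketch short; any downsampler could be used instead.
    """
    return {f: data[::f] for f in factors}


def pick_level(levels, data, x_min, x_max, max_points=20000):
    """Return the finest level that shows at most max_points in [x_min, x_max]."""
    # Average sample spacing of the raw data along x
    dx = float(data[-1, 0] - data[0, 0]) / (data.shape[0] - 1)
    visible = (x_max - x_min) / dx  # visible width, in raw samples
    for factor in sorted(levels):
        if visible / factor <= max_points:
            return levels[factor]
    # Even the coarsest level exceeds the budget; use it anyway
    return levels[max(levels)]
```

Inside a patched getData, the array returned by pick_level would take the place of the two-way choice between downsampled_data and data.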

2 answers

solution1
3 2015-05-28 23:46:05

solution2
1 ACCEPTED 2015-05-29 20:36:06