
plotly: huge number of datapoints

I am trying to plot something with a huge number of data points (2-3 million) using plotly.

When I run

py.iplot(fig, filename='test plot')

I get the following error:

Woah there! Look at all those points! Due to browser limitations, the Plotly SVG drawing functions have a hard time graphing more than 500k data points for line charts, or 40k points for other types of charts. Here are some suggestions:
(1) Use the `plotly.graph_objs.Scattergl` trace object to generate a WebGl graph.
(2) Trying using the image API to return an image instead of a graph URL
(3) Use matplotlib
(4) See if you can create your visualization with fewer data points

If the visualization you're using aggregates points (e.g., box plot, histogram, etc.) you can disregard this warning.

So then I try to save it with this:

py.image.save_as(fig, 'my_plot.png')

But then I get this error:

PlotlyRequestError: Unknown Image Server Error

How do I do this properly? I don't care if it's a still image or an interactive display within my notebook.

One option would be down-sampling your data, though I'm not sure that suits your use case: https://github.com/devoxi/lttb-py
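To illustrate what such down-sampling might look like, here is a minimal NumPy sketch of the Largest-Triangle-Three-Buckets idea. It is not the linked library's code or API; the function name `lttb_downsample` and its parameters are made up for this example, and it assumes `x` is sorted in increasing order.

```python
import numpy as np


def lttb_downsample(x, y, n_out):
    """Pick n_out points from (x, y) with a Largest-Triangle-Three-Buckets pass.

    Hypothetical helper for illustration; assumes x is sorted.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    if n_out < 3:
        raise ValueError("n_out must be at least 3")
    if n_out >= n:
        return x, y  # already small enough

    # The first and last points are always kept; the interior points are
    # split into n_out - 2 buckets delimited by these edges.
    edges = np.linspace(1, n - 1, n_out - 1).astype(int)
    keep = np.empty(n_out, dtype=int)
    keep[0], keep[-1] = 0, n - 1

    a = 0  # index of the most recently selected point
    for i in range(n_out - 2):
        lo, hi = edges[i], edges[i + 1]
        if i + 2 < len(edges):
            # Third triangle corner: the average of the next bucket.
            nlo, nhi = edges[i + 1], edges[i + 2]
            cx, cy = x[nlo:nhi].mean(), y[nlo:nhi].mean()
        else:
            # The last bucket is followed only by the final point.
            cx, cy = x[-1], y[-1]
        # Twice the triangle area for every candidate in the current bucket.
        area = np.abs((x[a] - cx) * (y[lo:hi] - y[a])
                      - (x[a] - x[lo:hi]) * (cy - y[a]))
        a = lo + int(np.argmax(area))
        keep[i + 1] = a
    return x[keep], y[keep]


# Example: reduce 2 million noisy samples to 5,000 points before plotting.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 100.0, 2_000_000)
y = np.sin(x) + 0.1 * rng.standard_normal(x.size)
xs, ys = lttb_downsample(x, y, 5_000)
```

The reduced `xs` and `ys` can then be passed to plotly in place of the full arrays. This approach only makes sense for line-like data, where the order of the points matters.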

I also have problems with plotly in the browser with large datasets - if anyone has solutions, please write! Thank you!

Plotly seems to handle this poorly. I am trying to create a box plot with 5 million points, which is no problem for R's basic "boxplot" function, but plotly keeps computing endlessly.

Improving this should be a high priority. Not all of the data has to be stored (and shown) in the plotly object. I suspect that is the main problem.
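One possible workaround along these lines is to compute the box statistics yourself and hand plotly only the summary values. The sketch below assumes a plotly version recent enough that `go.Box` accepts precomputed `q1`, `median`, `q3`, `lowerfence` and `upperfence`; the helper names `box_stats` and `make_box_figure` are made up for this example, and outliers are deliberately not drawn.

```python
import numpy as np


def box_stats(values):
    """Summarise a large sample as the five numbers a box plot needs."""
    values = np.asarray(values, dtype=float)
    q1, median, q3 = np.percentile(values, [25, 50, 75])
    iqr = q3 - q1
    # Whiskers end at the most extreme data points within 1.5 * IQR of the box.
    inside = values[(values >= q1 - 1.5 * iqr) & (values <= q3 + 1.5 * iqr)]
    return dict(
        q1=[float(q1)],
        median=[float(median)],
        q3=[float(q3)],
        lowerfence=[float(inside.min())],
        upperfence=[float(inside.max())],
    )


def make_box_figure(stats, name="5M points"):
    """Build a box trace from precomputed statistics only (no raw data)."""
    # Imported here so the numeric part above runs without plotly installed.
    import plotly.graph_objects as go

    return go.Figure(go.Box(name=name, **stats))


# Example: 5 million points are reduced to five numbers before plotting.
rng = np.random.default_rng(0)
data = rng.standard_normal(5_000_000)
stats = box_stats(data)
```

Calling `make_box_figure(stats).show()` should then render quickly, because the figure holds five numbers instead of 5 million samples.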

You can try the render_mode argument. Example:

import plotly.express as px
import pandas as pd
import numpy as np

N = int(1e6) # Number of points

df = pd.DataFrame(dict(x=np.random.randn(N),
                       y=np.random.randn(N)))

fig = px.scatter(df, x="x", y="y", render_mode='webgl')
fig.update_traces(marker_line=dict(width=1, color='DarkSlateGray'))
fig.show()

On my computer, N=1e6 takes about 5 seconds until the plot is visible, and the interactivity is still very good. With N=10e6 (10 million points) it takes about 1 minute and the plot is no longer responsive (i.e., zooming, panning, and other interactions are very slow).


 