I have a large dataset that I would like to plot in an IPython notebook.
I read the ~0.5 GB .csv file into a Pandas DataFrame using read_csv, which takes about two minutes. Then I try to plot this data:
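import pandas as pd
from bokeh.io import output_notebook
from bokeh.plotting import figure, show

data = pd.read_csv('large.csv')  # ~0.5 GB file, takes about two minutes
output_notebook()
p1 = figure()
p1.circle(data.index, data['myDataset'])  # draws one circle per row
show(p1)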
My browser spins and does not show me any plots. I have tried the following:
Using output_file() instead of output_notebook()
Passing a ColumnDataSource object as the source argument to circle()
Bokeh claims on its website to offer "high-performance interactivity over very large or streaming datasets". How do I visualize these large datasets without my computer grinding to a halt?
The question is too broad to offer any specific code suggestions. I would be curious how large the downsampled data you tried was. The default HTML Canvas backend for Bokeh can definitely accommodate tens of thousands of circles. There are a few options:
For simple scatters and lines of hundreds of thousands of points, there is a WebGL backend that may be useful.
Using the Bokeh Server, create a Bokeh app that downsamples the data before rendering it. There are some app examples here:
The Datashader library can be used to perform downsampling of large data sets (hundreds of millions to billions of points), and integrates very well with Bokeh.
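As a rough illustration of the first two ideas (thinning the data before it reaches the browser, and switching on the WebGL backend), here is a minimal sketch. The helper names stride_downsample and plot_downsampled and the 50,000-point default are made up for this example and are not part of Bokeh. It also assumes a Bokeh version that accepts output_backend="webgl" on figure(); older releases used a different switch. Plain strided sampling can skip over narrow spikes, so treat it as a starting point rather than a faithful summary of the data.

```python
import numpy as np


def stride_downsample(x, y, max_points=50_000):
    """Keep at most max_points evenly spaced samples from x and y.

    Hypothetical helper for illustration; not part of Bokeh or pandas.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) <= max_points:
        return x, y
    # Ceiling division, so the strided result never exceeds max_points.
    step = -(-len(x) // max_points)
    return x[::step], y[::step]


def plot_downsampled(x, y, max_points=50_000):
    """Plot a thinned copy of the data using Bokeh's WebGL backend."""
    # Imported here so the helper above can be used without Bokeh installed.
    from bokeh.plotting import figure, show

    xs, ys = stride_downsample(x, y, max_points)
    # Assumption: this Bokeh version supports output_backend="webgl".
    p = figure(output_backend="webgl")
    p.scatter(xs, ys, size=3)
    show(p)
```

With the DataFrame from the question, this could be called as plot_downsampled(data.index, data['myDataset']) after output_notebook().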