Plotting Large Datasets in IPython Notebook (Bokeh)

I have a large dataset that I would like to plot in an IPython notebook.

I read the ~0.5 GB .csv file into a Pandas DataFrame using read_csv, which takes about two minutes. Then I try to plot the data.

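import pandas as pd
from bokeh.plotting import figure, output_notebook, show

data = pd.read_csv('large.csv')  # ~0.5 GB file, takes about two minutes
output_notebook()                # render inline in the notebook
p1 = figure()
p1.circle(data.index, data['myDataset'])  # one glyph per row
show(p1)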

My browser spins and does not show me any plots. I have tried the following:

  1. output_file() instead of output_notebook()
  2. Graphing using a ColumnDataSource object as the source argument to circle()
  3. Downsampling my data to something more manageable.

Bokeh claims on its website to offer "high-performance interactivity over very large or streaming datasets". How do I visualize these large datasets without my computer grinding to a halt?

The question is too broad to offer any specific code suggestions. I would be curious how many points were left after the downsampling you tried. The default HTML Canvas output for Bokeh can definitely accommodate tens of thousands of circles. There are a few options:
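As a rough illustration of staying within the "tens of thousands of circles" range mentioned above, here is a minimal sketch of stride-based downsampling in pandas. It is not necessarily one of the options the answer had in mind. The downsample helper, the 20,000-point budget, and the synthetic data are all assumptions made for the example; only the 'myDataset' column name comes from the question.

```python
import math

import numpy as np
import pandas as pd


def downsample(df: pd.DataFrame, max_points: int = 20_000) -> pd.DataFrame:
    """Keep every k-th row so that at most max_points rows remain.

    Hypothetical helper for illustration; the 20_000 default is an assumed budget.
    """
    step = max(1, math.ceil(len(df) / max_points))
    return df.iloc[::step]


# Synthetic stand-in for the large CSV (assumption: one numeric column).
rng = np.random.default_rng(0)
data = pd.DataFrame({"myDataset": rng.normal(size=1_000_000)})

small = downsample(data)
print(len(data), "->", len(small))

# The reduced frame can then be plotted exactly as in the question, e.g.
# p1.circle(small.index, small['myDataset'])
```

Taking every k-th row is the simplest choice and keeps the original index, but it can skip over short spikes. If those matter, aggregating each chunk (for example, its min and max) may be a better fit.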
