How can I plot an interactive plot for a Dask data frame using plotly?

I have a Dask data frame that has 30 partitions, and each partition has 100 million rows of data. The total number of rows in the whole Dask data frame is 400 million. I would like to plot the entire Dask data frame in one plot using plotly. How would I go about achieving this? The end result should be an overall plot of the data, and if I want to explore a specific region of the data I can zoom in and pan. My data size is about 3.5 GB, and it won't fit in memory.

3.5 GB is a fairly small size, so it should fit in memory. If it doesn't, there are a few strategies:

  1. Reduce the data by selecting only the columns of interest, load it into memory, and plot it with standard libraries:

from dask.dataframe import read_parquet

# path_to_file and specific_cols are placeholders for your own file path and column list.
# .compute() materializes the selected columns as an in-memory pandas DataFrame.
df = read_parquet(path_to_file, columns=specific_cols).compute()

  2. Use datashader: https://plotly.com/python/datashader/
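
The datashader strategy relies on aggregating the points into a fixed-size grid before plotting, so the figure holds one value per cell rather than one marker per row. The sketch below illustrates that idea with plain NumPy instead of datashader's own API, processing one chunk at a time so the full data set never has to be in memory. The function name `accumulate_density`, the column names `x` and `y`, the value ranges, and the synthetic chunks are all assumptions made for illustration.

```python
import numpy as np


def accumulate_density(chunks, x_col, y_col, x_range, y_range, bins=(400, 300)):
    """Sum per-chunk 2D histograms so only one chunk is held in memory at a time.

    `chunks` is any iterable of objects indexable by column name
    (for example pandas DataFrames or dicts of arrays).
    """
    grid = np.zeros(bins, dtype=np.int64)
    for chunk in chunks:
        counts, _, _ = np.histogram2d(
            chunk[x_col], chunk[y_col], bins=bins, range=[x_range, y_range]
        )
        grid += counts.astype(np.int64)
    # histogram2d indexes as [x_bin, y_bin]; transpose so rows follow y and columns follow x.
    return grid.T


# Synthetic stand-in for the partitions of a large data frame (hypothetical data).
rng = np.random.default_rng(0)
chunks = [
    {"x": rng.normal(size=10_000), "y": rng.normal(size=10_000)}
    for _ in range(3)
]

density = accumulate_density(chunks, "x", "y", (-5, 5), (-5, 5), bins=(200, 100))
```

With a Dask data frame `ddf`, the chunks could be supplied as `(part.compute() for part in ddf.to_delayed())`, and the resulting array could be displayed with `plotly.express.imshow(density, origin="lower")`. Zooming in the browser then magnifies the fixed grid; to see finer detail in a region, the aggregation would need to be rerun with a narrower `x_range` and `y_range`.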
