
Efficiently plotting hundreds of millions of points in R

Is plot() the most efficient way to plot 100 million or so data points in R? I'd like to plot a bunch of these Clifford attractors. Here's an example of one I've downscaled from a very large image:

[Image: Clifford attractor]

Here is a link to some code that I've used to plot very large 8K (7680x4320) images.

It doesn't take long to generate 50 or 100 million points (using Rcpp), nor to get the hex value for the colour + transparency, but the actual plotting and saving to disk is extremely slow.

  • Is there a faster way to plot (and save) all these points?
  • Is R just a bad tool for this job?
  • What tools would you use to plot billions of points, even if you couldn't fit them all into RAM?
  • How would one have made a very high resolution plot of this type (colour + transparency) with, say, 1990s software and hardware?

Edit: code used

# Load packages
library(Rcpp)
library(viridis)

# output parameters
output_width = 1920 * 4
output_height = 1080 * 4
N_points = 50e6
point_alpha = 0.05 # point transparency

# Attractor parameters
params <- c(1.886,-2.357,-0.328, 0.918)

# C++ function to rapidly generate points
cliff_rcpp <- cppFunction(
    "
    NumericMatrix cliff(int nIter, double A, double B, double C, double D) {
    NumericMatrix x(nIter, 2);
    for (int i=1; i < nIter; ++i) {
    x(i,0) = sin(A*x(i-1,1)) + C*cos(A*x(i-1,0));
    x(i,1) = sin(B*x(i-1,0)) + D*cos(B*x(i-1,1));
    }
    return x;
    }"
)

# Function for mapping a point to a colour
map2color <- function(x, pal, limits = NULL) {
    if (is.null(limits))
        limits = range(x)
    pal[findInterval(x,
                     seq(limits[1], limits[2], length.out = length(pal) + 1),
                     all.inside = TRUE)]
}

# Obtain matrix of points
cliff_points <- cliff_rcpp(N_points, params[1], params[2], params[3], params[4])

# Calculate angle between successive points
cliff_angle <- atan2(
    (cliff_points[, 1] - c(cliff_points[-1, 1], 0)),
    (cliff_points[, 2] - c(cliff_points[-1, 2], 0))
)

# Obtain colours for points
available_cols <-
    viridis(
        1024,
        alpha = point_alpha,
        begin = 0,
        end = 1,
        direction = 1
    )

cliff_cols <- map2color(
    cliff_angle,
    c(available_cols, rev(available_cols))
)


# Output image directly to disk
jpeg(
    "clifford_attractor.jpg",
    width = output_width,
    height = output_height,
    pointsize = 1,
    bg = "black",
    quality = 100
)
plot(
    cliff_points[-1, ],
    bg = "black",
    pch = ".",
    col = cliff_cols
)
dev.off()

I've recently discovered the scattermore package for R, which is about an order of magnitude faster than R's standard plot function. scattermoreplot() takes ~2 minutes to plot 100 million points with colour and transparency, while plot() takes around half an hour.
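For reference, here is a minimal sketch of how scattermoreplot() might be called for this kind of image. The random points, the palette from hcl.colors(), the image size and the output file name are placeholders chosen for illustration and are not taken from the question's code; the point count is kept small so the example runs quickly.

```r
library(scattermore)

set.seed(1)
n <- 1e6                      # placeholder point count, far below 100 million
x <- rnorm(n)
y <- rnorm(n)

# Semi-transparent palette and a per-point colour based on the angle of each point
pal  <- grDevices::hcl.colors(256, "viridis", alpha = 0.05)
idx  <- findInterval(atan2(y, x), seq(-pi, pi, length.out = 257), all.inside = TRUE)
cols <- pal[idx]

# Rasterise with scattermore and write the result straight to disk
png("scattermore_demo.png", width = 1920, height = 1080, bg = "black")
scattermoreplot(x, y, col = cols, size = c(1920, 1080), xlab = "", ylab = "")
dev.off()
```

The size argument sets the resolution of the internal raster; if it is left out, scattermoreplot() derives it from the current device.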

I am currently exploring datashader ( http://www.datashader.org ). If you are willing to work with Python, this could be an elegant solution to the problem.
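The core idea behind datashader is to aggregate the points into a per-pixel grid first and colour the grid afterwards, so the drawing cost no longer depends on the number of points. A rough base-R sketch of that idea follows. It is not datashader's API: bin_points() is a hypothetical helper, the random chunks stand in for batches of attractor points, and the image is coloured by point density rather than by per-point colour with alpha. Because counts are accumulated chunk by chunk, the full point set never has to be in RAM at once.

```r
# Hypothetical helper: add one chunk of points to a matrix of per-pixel hit counts
bin_points <- function(x, y, xlim, ylim, width, height, counts = NULL) {
  if (is.null(counts)) counts <- matrix(0, nrow = height, ncol = width)
  # Map coordinates to 1-based pixel indices; all.inside = TRUE clamps
  # out-of-range points to the edge pixels instead of dropping them
  col_idx <- findInterval(x, seq(xlim[1], xlim[2], length.out = width + 1),
                          all.inside = TRUE)
  row_idx <- findInterval(y, seq(ylim[1], ylim[2], length.out = height + 1),
                          all.inside = TRUE)
  row_idx <- height + 1L - row_idx            # larger y goes nearer the top row
  lin <- (col_idx - 1L) * height + row_idx    # column-major linear index
  counts + matrix(tabulate(lin, nbins = width * height),
                  nrow = height, ncol = width)
}

set.seed(42)
width <- 192
height <- 108
xlim <- c(-3, 3)
ylim <- c(-3, 3)

# Accumulate counts over several chunks (stand-ins for batches of generated points)
counts <- NULL
for (chunk in 1:5) {
  x <- rnorm(1e5)
  y <- rnorm(1e5)
  counts <- bin_points(x, y, xlim, ylim, width, height, counts)
}

# Log-scale the counts, map them to a palette, and leave empty pixels black
pal <- grDevices::hcl.colors(256, "viridis")
scaled <- log1p(counts) / max(log1p(counts))
img <- matrix(pal[1 + floor(scaled * 255)], nrow = height)
img[counts == 0] <- "black"

# Draw the finished grid as a single raster image, one grid cell per pixel
png("binned_demo.png", width = width, height = height)
par(mar = c(0, 0, 0, 0))
plot.new()
plot.window(xlim = c(0, 1), ylim = c(0, 1), xaxs = "i", yaxs = "i")
rasterImage(as.raster(img), 0, 0, 1, 1, interpolate = FALSE)
dev.off()
```

For an 8K image the same loop would use width = 7680 and height = 4320; the counts matrix then has about 33 million cells regardless of how many points are binned.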

Maybe geom_hex() from the ggplot2 package can be a solution? https://ggplot2.tidyverse.org/reference/geom_hex.html
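A small sketch of what that could look like, assuming the ggplot2 and hexbin packages are installed (geom_hex() relies on hexbin). The data frame, bin count and file name are made up for illustration. Note that hexagonal binning shows point density only, so the per-point colour and transparency from the question would be lost.

```r
library(ggplot2)  # geom_hex() also requires the hexbin package

set.seed(1)
df <- data.frame(x = rnorm(1e6), y = rnorm(1e6))  # placeholder data

# Count points per hexagon and colour each hexagon by its count
p <- ggplot(df, aes(x, y)) +
  geom_hex(bins = 300) +
  scale_fill_viridis_c() +
  theme_void() +
  theme(legend.position = "none")

# Save at 1920 x 1080 pixels (16 in x 9 in at 120 dpi)
ggsave("hex_demo.png", p, width = 16, height = 9, dpi = 120)
```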
