
R - difference scatter plot

I was wondering whether there is a way to subtract two binned scatter plots from one another in R. I have two distributions with the same axes and want to overlay one on the other and subtract them, producing a difference scatter plot.

Here are my two plots:

[Image: first hexbin plot] [Image: second hexbin plot]

And my script for the plots:

library(hexbin)
library(RColorBrewer)

setwd("/Users/home/")
df <- read.table("data1.txt")
x <- df$c2
y <- df$c3

bin <- hexbin(x, y, xbins = 2000)
my_colors <- colorRampPalette(rev(brewer.pal(11, "Spectral")))
d <- plot(bin, main = "", colramp = my_colors, legend = FALSE)

Any advice on how to go about this would be very helpful.

EDIT: Found an additional way to do this:

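# x1, y1 and x2, y2 are the coordinates of the two data sets
xbnds <- range(x1, x2)
ybnds <- range(y1, y2)
# Shared bounds and xbins put both data sets on the same hexagon lattice
bin1 <- hexbin(x1, y1, xbins = 200, xbnds = xbnds, ybnds = ybnds)
bin2 <- hexbin(x2, y2, xbins = 200, xbnds = xbnds, ybnds = ybnds)
# Smooth, then erode each binning before drawing the difference plot
erodebin1 <- erode.hexbin(smooth.hexbin(bin1))
erodebin2 <- erode.hexbin(smooth.hexbin(bin2))
hdiffplot(erodebin1, erodebin2)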

As a starting point, here is some sample data. Each data set is random (standard normal in x and y), with the second one shifted so that it is centered on (2, 2).

df1  <-
  data.frame(
    x = rnorm(1000)
    , y = rnorm(1000)
  )

df2  <-
  data.frame(
    x = rnorm(1000, 2)
    , y = rnorm(1000, 2)
  )

To ensure that the bins are identical, it is best to construct a single hexbin object from the combined data. To do this, I use dplyr's bind_rows with the .id argument to keep track of which data.frame each row came from (this would be even easier if you already had a single data.frame with a grouping variable). Setting IDs = TRUE makes hexbin record the cell that each point falls into.

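library(dplyr)
library(hexbin)

# Stack both data sets; the "df" column records the source ("A" or "B")
bothDF <-
  bind_rows(A = df1, B = df2, .id = "df")

# Bin the combined data once so both sets share the same cells
bothHex <-
  hexbin(x = bothDF$x
         , y = bothDF$y
         , IDs = TRUE
         )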
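To illustrate why shared bins matter, here is a minimal sketch (not part of the original answer) that compares binning the two data sets separately with binning them on common bounds, as in the question's EDIT. The names h1, h2, s1, s2, n1, and n2 are made up for this example, and the seed is an assumption added only for reproducibility.

```r
library(hexbin)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
df1 <- data.frame(x = rnorm(1000), y = rnorm(1000))
df2 <- data.frame(x = rnorm(1000, 2), y = rnorm(1000, 2))

# Binned separately, each object derives its bounds from its own data,
# so the two lattices differ and cell IDs cannot be compared
h1 <- hexbin(df1$x, df1$y)
h2 <- hexbin(df2$x, df2$y)
separate_match <- identical(h1@xbnds, h2@xbnds)

# Binned on common bounds (same default xbins), the lattices coincide
xb <- range(df1$x, df2$x)
yb <- range(df1$y, df2$y)
s1 <- hexbin(df1$x, df1$y, xbnds = xb, ybnds = yb)
s2 <- hexbin(df2$x, df2$y, xbnds = xb, ybnds = yb)
shared_match <- identical(s1@xbnds, s2@xbnds) &&
  identical(s1@dimen, s2@dimen)

# With a shared lattice, counts can be matched by cell ID;
# a cell that is empty in one data set gets a count of 0
cells <- sort(union(s1@cell, s2@cell))
n1 <- s1@count[match(cells, s1@cell)]
n2 <- s2@count[match(cells, s2@cell)]
n1[is.na(n1)] <- 0
n2[is.na(n2)] <- 0
cell_diff <- n1 - n2
```

Binning the combined data once, as done here in the answer, reaches the same shared lattice without having to pass the bounds by hand.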

Next, we use a mix of hexbin and dplyr to count how many points from each data set fall within each cell. First, hexTapply applies table across the bins (the grouping column is wrapped in factor so that every level appears in every table, even with a count of zero; this is not needed if your column is already a factor). The result is then simplified and converted to a data.frame, which is manipulated with mutate to calculate the difference in counts and then joined back to a table that gives the x and y values for each cell ID.

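counts <-
  # Tabulate the source labels (A/B) of the points falling in each cell
  hexTapply(bothHex, factor(bothDF$df), table) %>%
  simplify2array %>%   # list of tables -> matrix, one column per cell
  t %>%                # one row per cell, columns A and B
  data.frame() %>%
  mutate(id = as.numeric(row.names(.))  # cell ID is kept in the row names
         , diff = A - B) %>%
  # Attach the hexagon centre coordinates for each cell ID
  left_join(data.frame(id = bothHex@cell, hcell2xy(bothHex)), by = "id")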

head(counts) gives:

  A B  id diff          x         y
1 1 0   7    1 -1.3794467 -3.687014
2 1 0  71    1 -0.8149939 -3.178209
3 1 0  79    1  1.4428172 -3.178209
4 1 0  99    1 -1.5205599 -2.923806
5 2 0 105    2  0.1727985 -2.923806
6 1 0 107    1  0.7372513 -2.923806
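As a cross-check on the pipeline above, the same per-cell counts can be sketched without dplyr by tabulating the cID slot, which hexbin fills in when IDs = TRUE. This is an alternative illustration, not the original answer's method; the names both, hb, tab, idx, and counts2 are made up for this example, and the seed is an assumption.

```r
library(hexbin)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
both <- data.frame(
  df = rep(c("A", "B"), each = 1000),
  x  = c(rnorm(1000), rnorm(1000, 2)),
  y  = c(rnorm(1000), rnorm(1000, 2))
)

hb <- hexbin(both$x, both$y, IDs = TRUE)

# hb@cID holds one cell ID per input point, so a two-way table
# gives the count of A and B points in every non-empty cell
tab <- table(cell = hb@cID, df = factor(both$df, levels = c("A", "B")))

# Look up each tabulated cell in hb@cell to fetch its centre coordinates
idx     <- match(as.integer(rownames(tab)), hb@cell)
centres <- hcell2xy(hb)

counts2 <- data.frame(
  id = hb@cell[idx],
  A  = as.vector(tab[, "A"]),
  B  = as.vector(tab[, "B"]),
  x  = centres$x[idx],
  y  = centres$y[idx]
)
counts2$diff <- counts2$A - counts2$B
```

The match call avoids relying on the table rows and hb@cell being in the same order.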

Finally, we use ggplot2 to plot the resulting data, as it offers more control than hexbin itself (including the ability to more easily use a variable other than the count as the fill).

library(ggplot2)

counts %>%
  ggplot(aes(x = x, y = y
             , fill = diff)) +
  geom_hex(stat = "identity") +   # draw one hexagon per precomputed cell
  coord_equal() +
  scale_fill_gradient2()          # diverging fill centred on zero

[Image: resulting difference hexbin plot]

From there, it is easy to play around with axes, colors, etc.
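As one possible variation, the sketch below sets explicit colors for the diverging scale, names the legend, and adds axis labels and a title. The data preparation is repeated in base R only so that the block runs on its own; plot_df, p, the colors, and the labels are all choices made for this example rather than anything from the original answer.

```r
library(hexbin)
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only so the sketch is reproducible
both <- data.frame(
  df = rep(c("A", "B"), each = 1000),
  x  = c(rnorm(1000), rnorm(1000, 2)),
  y  = c(rnorm(1000), rnorm(1000, 2))
)

# Per-cell counts for each source and the matching hexagon centres
hb  <- hexbin(both$x, both$y, IDs = TRUE)
tab <- table(hb@cID, both$df)
idx <- match(as.integer(rownames(tab)), hb@cell)
ctr <- hcell2xy(hb)

plot_df <- data.frame(
  x    = ctr$x[idx],
  y    = ctr$y[idx],
  diff = as.vector(tab[, "A"] - tab[, "B"])
)

p <- ggplot(plot_df, aes(x = x, y = y, fill = diff)) +
  geom_hex(stat = "identity") +
  coord_equal() +
  # Blue where B dominates, red where A dominates, white where they balance
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, name = "A - B") +
  labs(x = "x", y = "y", title = "Difference in hexbin counts") +
  theme_minimal()
```

Printing p draws the plot. Keeping midpoint = 0 means cells with equal counts stay neutral regardless of how lopsided the differences are.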

