
Can someone explain what these lines of code mean?

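I've been trying to find a way to make a scatter plot whose colour intensity represents the density of the points plotted in that region (it's a large data set with a lot of overlap). I found these lines of code, which let me do that, but I want to make sure I really understand what each line is actually doing. Thanks in advance :)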

get_density <- function(x, y, ...){
  dens <- MASS::kde2d(x, y, ...)
  ix <- findInterval(x, dens$x)
  iy <- findInterval(y, dens$y)
  ii <- cbind(ix, iy)
  return(dens$z[ii])
}

set.seed(1)

dat <- data.frame(x = subset2$conservation.phyloP, y = subset2$gene.expression.RPKM)
dat$density <- get_density(dat$x, dat$y, n = 100)  

Here is the function with some explanatory comments; let me know if anything is still unclear:

# The function "get_density" takes two arguments, called x and y
# The "..." lets you pass further arguments through to kde2d (for example n = 100 below)
get_density <- function(x, y, ...){

  # "MASS::" calls kde2d from the MASS package without attaching the whole package via library(MASS)
  # Any arguments passed as "..." (above) are handed on to kde2d here
  # kde2d returns a list: grid coordinates dens$x and dens$y, plus a matrix dens$z of estimated densities on that grid
  dens <- MASS::kde2d(x, y, ...)
  # "findInterval" (base R) returns, for each point, the index of the grid interval that its x (or y) value falls into
  # These are positions in the grid, not density values
  ix <- findInterval(x, dens$x)
  iy <- findInterval(y, dens$y)
  # "cbind" binds the two index vectors into a two-column matrix, one row per point
  ii <- cbind(ix, iy)
  # Indexing the matrix dens$z with a two-column matrix returns dens$z[ix[i], iy[i]] for each point i,
  # i.e. the estimated density at the grid cell that point falls into
  return(dens$z[ii])
}

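# "set.seed()" fixes the state of the random number generator, so any code that uses randomness gives the same result when re-run with the same seed, which makes it more reproducible
# Nothing in the lines shown here draws random numbers, so it does not change the result of this snippet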
set.seed(1)

dat <- data.frame(x = subset2$conservation.phyloP, y = subset2$gene.expression.RPKM)
dat$density <- get_density(dat$x, dat$y, n = 100)  
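
To see what the findInterval and cbind steps inside get_density do, here is a minimal sketch on a made-up 3 x 3 grid. The names grid_x, grid_y, z, px and py and all their values are invented for illustration; they stand in for dens$x, dens$y, dens$z and the data points.

```r
# Toy grid standing in for dens$x, dens$y and dens$z (values are made up)
grid_x <- c(0, 1, 2)
grid_y <- c(0, 10, 20)
z <- matrix(1:9, nrow = 3)  # z[i, j] belongs to grid_x[i] and grid_y[j]

# Two hypothetical data points: (0.5, 12) and (2, 3)
px <- c(0.5, 2)
py <- c(12, 3)

# For each point, find which grid interval its coordinate falls into
ix <- findInterval(px, grid_x)  # 1 3
iy <- findInterval(py, grid_y)  # 2 1

# One row per point: (row index, column index) into z
ii <- cbind(ix, iy)

# Two-column matrix indexing picks z[1, 2] and z[3, 1]
picked <- z[ii]
print(picked)  # 4 3
```

So each point gets back a single value from the grid, which is why the result has the same length as x and y.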

If your question is really about the MASS::kde2d function itself, it would be best to rewrite this StackOverflow question to reflect that!
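
As a starting point for looking at kde2d on its own, the sketch below inspects what it returns. It uses simulated data (an assumption, since subset2 is not available here) and the same n = 100 that the question passes through "...".

```r
# Simulated stand-in data; set.seed is only needed because of rnorm here
set.seed(1)
x <- rnorm(200)
y <- rnorm(200)

# n = 100 asks for a 100 x 100 evaluation grid
dens <- MASS::kde2d(x, y, n = 100)

str(dens)        # a list with components x, y and z
length(dens$x)   # 100 grid coordinates along x
length(dens$y)   # 100 grid coordinates along y
dim(dens$z)      # 100 100: estimated density at each grid cell
```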

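It looks like the same function has been wrapped into the ggplot2 approach described here, so if you switch to making your plots with ggplot2 you could give that a try.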
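
For reference, one way to plot the density column with ggplot2 might look like the sketch below. This is not necessarily the approach referred to above: it simply maps the density computed by get_density to point colour. The data are simulated as a stand-in for subset2, and the viridis colour scale is an arbitrary choice.

```r
library(ggplot2)

# Same function as in the question
get_density <- function(x, y, ...) {
  dens <- MASS::kde2d(x, y, ...)
  ix <- findInterval(x, dens$x)
  iy <- findInterval(y, dens$y)
  ii <- cbind(ix, iy)
  dens$z[ii]
}

# Simulated stand-in for the real data
set.seed(1)
dat <- data.frame(x = rnorm(1000), y = rnorm(1000))
dat$density <- get_density(dat$x, dat$y, n = 100)

# Colour each point by the estimated density at its grid cell
p <- ggplot(dat, aes(x = x, y = y, colour = density)) +
  geom_point() +
  scale_colour_viridis_c()
print(p)
```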

