
Change chart.Correlation defaults to produce a best-fit line rather than a smoothed curve in the lower triangle [R]

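I am trying to create a correlation matrix from 16 different vectors. Almost everything is the way I want it; the only difference is that I would rather have a best-fit line in the scatter plots than a smoothed curve. I have seen other posts that mention using a function to reach the pairs() call inside chart.Correlation in order to change pch. Is there something similar that asks for a best-fit line instead of a smoothed curve?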

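I ask because I feel the smoothed curve may give a false impression of high correlation in some parts of a scatter plot. I know the correlation values are shown in the upper half, but I would still like the option of changing the line in the scatter plots from a smoothed curve to a best-fit line.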
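For reference, the curve drawn by panel.smooth is a lowess fit. The following is a minimal sketch of how the two kinds of line differ on the same scatter plot. It uses mtcars as stand-in data (an assumption, since the data from the question is not shown) and overlays a lowess curve and a least-squares line on a single panel:

```r
# Stand-in data: two columns from the built-in mtcars dataset (assumption).
x <- mtcars$disp
y <- mtcars$mpg

smooth_fit <- lowess(x, y)   # smoother used by panel.smooth; returns sorted x and smoothed y
linear_fit <- lm(y ~ x)      # ordinary least-squares fit

plot(x, y, xlab = "disp", ylab = "mpg")
lines(smooth_fit, col = "red")     # smoothed curve
abline(linear_fit, col = "blue")   # best-fit line
legend("topright", legend = c("lowess", "lm"),
       col = c("red", "blue"), lty = 1)
```

The lowess curve follows local bends in the data, while the lm line summarises the overall linear trend that the Pearson correlation in the upper triangle measures.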

My code is very simple:

chart.Correlation(all.cell.types.rna.seq.table[,2:16], histogram=FALSE)

all.cell.types.rna.seq.table is a data frame with 16 columns; the first column is an id number.

[Figure: correlation matrix with smoothed curves rather than best-fit lines]

What I want is a best-fit line, rather than a smoothed curve, in the scatter plots in the lower triangle of the correlation matrix.

I was looking for exactly the same thing, and it can be done with just the base function pairs(). Here is an example using the mtcars dataset:

reg <- function(x, y, ...) {
  points(x,y, ...)
  abline(lm(y~x), col = "red") 
}

panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor, ...) {
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(0, 1, 0, 1))
  r <- cor(x, y) # was abs(cor(x, y))
  txt <- format(c(r, 0.123456789), digits = digits)[1]
  txt <- paste0(prefix, txt)
  if(missing(cex.cor)) cex.cor <- 2 # or 0.8/strwidth(txt)
  text(0.5, 0.5, txt, cex = cex.cor) # was cex.cor * abs(r))
}

panel.hist <- function(x, ...) {
  usr <- par("usr"); on.exit(par(usr))
  par(usr = c(usr[1:2], 0, 1.5) )
  h <- hist(x, plot = FALSE)
  breaks <- h$breaks; nB <- length(breaks)
  y <- h$counts; y <- y/max(y)
  rect(breaks[-nB], 0, breaks[-1], y, col = "cyan", ...)
}

pairs(mtcars[, c(1,3,4,5,6,7)], lower.panel = reg, upper.panel = panel.cor, diag.panel = panel.hist)

[Figure: result of using pairs() on mtcars]
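The same idea should carry over to a data frame laid out like the one in the question (an id column followed by numeric columns). Below is a minimal sketch under stated assumptions: toy is simulated stand-in data, and panel.lm and its line.col argument are hypothetical names introduced here, not part of base R or PerformanceAnalytics. The panel fits the line only on complete pairs, so a few missing values do not matter.

```r
# Simulated stand-in for the question's data frame: an id column plus numeric columns.
set.seed(1)
a <- rnorm(50)
toy <- data.frame(id = 1:50,
                  a  = a,
                  b  = 0.8 * a + rnorm(50, sd = 0.5),  # correlated with a
                  c  = rnorm(50))
toy$c[c(3, 17)] <- NA                                  # a few missing values

# Lower-panel function: scatter plot plus least-squares line.
# The line is skipped when fewer than two complete (x, y) pairs remain.
panel.lm <- function(x, y, line.col = "red", ...) {
  points(x, y, ...)
  ok <- is.finite(x) & is.finite(y)
  if (sum(ok) >= 2) abline(lm(y[ok] ~ x[ok]), col = line.col)
}

# Drop the id column before plotting, as the question does with [, 2:16].
pairs(toy[, -1], lower.panel = panel.lm, gap = 0)
```

With the real data, the call would presumably be pairs(all.cell.types.rna.seq.table[, 2:16], lower.panel = panel.lm, gap = 0).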

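But the result is not as nice as chart.Correlation(). Because I could not figure out how to pass the reg function to chart.Correlation(), I looked at its source code and simply changed it directly inside the function: lower.panel = panel.smooth ==> lower.panel = reg. So here is the final example with mtcars: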

library(PerformanceAnalytics)  # provides checkData(), which is used below

chart.Correlation.linear <-
  function (R, histogram = TRUE, method=c("pearson", "kendall", "spearman"), ...)
  { # @author R Development Core Team
    # @author modified by Peter Carl & Marek Lahoda
    # Visualization of a Correlation Matrix. On top, the value of the correlation plus the result
    # of the cor.test as stars. On bottom, the bivariate scatterplots, with a linear regression fit.
    # On diagonal, the histograms with probability, density and normal density (gaussian) distribution.

    x = checkData(R, method="matrix")

    if(missing(method)) method=method[1] #only use one
    cormeth <- method

    # Published at http://addictedtor.free.fr/graphiques/sources/source_137.R
    panel.cor <- function(x, y, digits=2, prefix="", use="pairwise.complete.obs", method=cormeth, cex.cor, ...)
    {
      usr <- par("usr"); on.exit(par(usr))
      par(usr = c(0, 1, 0, 1))
      r <- cor(x, y, use=use, method=method) # MG: remove abs here
      txt <- format(c(r, 0.123456789), digits=digits)[1]
      txt <- paste(prefix, txt, sep="")
      # Use the supplied cex.cor if given; otherwise scale the text to the panel width
      cex <- if(missing(cex.cor)) 0.8/strwidth(txt) else cex.cor

      test <- cor.test(as.numeric(x),as.numeric(y), method=method)
      # borrowed from printCoefmat
      Signif <- symnum(test$p.value, corr = FALSE, na = FALSE,
                       cutpoints = c(0, 0.001, 0.01, 0.05, 0.1, 1),
                       symbols = c("***", "**", "*", ".", " "))
      # MG: add abs here and also include a 30% buffer for small numbers
      text(0.5, 0.5, txt, cex = cex * (abs(r) + .3) / 1.3)
      text(.8, .8, Signif, cex=cex, col=2)
    }

    #remove method from dotargs
    dotargs <- list(...)
    dotargs$method <- NULL
    rm(method)

    hist.panel = function (x, ...=NULL ) {
      par(new = TRUE)
      hist(x,
           col = "light gray",
           probability = TRUE,
           axes = FALSE,
           main = "",
           breaks = "FD")
      lines(density(x, na.rm=TRUE),
            col = "red",
            lwd = 1)
      # Overlay the normal density whose mean and standard deviation are estimated from the data
      ax.x = seq(min(x, na.rm=TRUE), max(x, na.rm=TRUE), length.out = 100)  # grid of points spanning the data range on the x axis
      density.est = dnorm(ax.x, mean = mean(x, na.rm=TRUE), sd = sd(x, na.rm=TRUE))  # normal density at the points in ax.x
      lines(ax.x, density.est, col = "blue", lwd = 1, lty = 1)  # draw the normal density over the histogram
      rug(x)
    }

    # Linear regression line fit over points
    reg <- function(x, y, ...) {
      points(x,y, ...)
      abline(lm(y~x), col = "red") 
    }

    # Draw the chart
    if(histogram)
      pairs(x, gap=0, lower.panel=reg, upper.panel=panel.cor, diag.panel=hist.panel)
    else
      pairs(x, gap=0, lower.panel=reg, upper.panel=panel.cor) 
  }

chart.Correlation.linear(mtcars[, c(1,3,4,5,6,7)], histogram = TRUE)

[Figure: result of using the modified chart.Correlation on mtcars]

