
How to order the X-axis in a box plot or QQ plot in R?

Here is a reproducible sample:

library(ggpubr)
library(rstatix)
library(tibble)

da.ma <-matrix(1:22000, 10, 22) ## a sample matrix

n <-seq(max(length(da.ma[1,]))) ## naming cols and rows
for (i in n) {
    c.names <- paste("k", n, sep = "")
}
colnames(da.ma) <- c.names 

n.pdf <-seq(length(da.ma[,1]))
for (i in n.pdf) {
    r.names <- paste("text",n.pdf, sep ="")
}
rownames(da.ma) <- r.names
col.names <-names(da.ma[1, ])

da.ma <-cbind(id = seq(length(da.ma[, 1])), da.ma) ##adding the id col
data <- as_tibble(da.ma)

in.anova <- data %>%
  gather(key = "Length", value = "TTR", colnames(data[, 2:23])) %>%
  convert_as_factor(id, Length)

Up to here the data is created, but when I draw the plot, the X-axis is not in the right order:

ggboxplot(in.anova, x = "Length", y = "TTR", add = "point")

I need it to start from k1 and go up to k22. However, it starts from k1 and continues with k10, k11, k12, etc. The right order on the X-axis would be: k1, k2, k3, k4, ..., k21, and k22.

in.anova$Length <- factor(in.anova$Length, levels = paste0("k", 1:22))  
ggboxplot(in.anova, x = "Length", y = "TTR", add = "point")

Returns:

[Image: box plot with the X-axis in the order k1, k2, ..., k22]
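
To see what the explicit levels argument changes, here is a small self-contained sketch. The vector x is made up for illustration and stands in for the Length column; it is not the question's data. One caveat worth knowing: any value that does not appear in levels silently becomes NA.

```r
# Hypothetical stand-in for the Length column
x <- c("k1", "k10", "k2", "k11", "k3")

# Default: the levels are the unique values sorted as strings
f_default <- factor(x)
levels(f_default)

# Explicit levels: the plot axis follows this order instead
f_ordered <- factor(x, levels = paste0("k", 1:11))
levels(f_ordered)

# Pitfall: values missing from `levels` (here k10 and k11) turn into NA
f_short <- factor(x, levels = paste0("k", 1:3))
sum(is.na(f_short))   # 2
```

So paste0("k", 1:22) is safe for the sample data only because it covers every column name that actually occurs.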

You can use the factor() function with a predefined order as the levels of the Length column.

library(rstatix)
library(ggpubr)
da.ma <-matrix(1:22000, 10, 22) ## a sample matrix

n <-seq(max(length(da.ma[1,]))) ## naming cols and rows
for (i in n) {
    c.names <- paste("k", n, sep = "")
}
colnames(da.ma) <- c.names 

n.pdf <-seq(length(da.ma[,1]))
for (i in n.pdf) {
    r.names <- paste("text",n.pdf, sep ="")
}
rownames(da.ma) <- r.names
col.names <-names(da.ma[1,])

da.ma <-cbind(id =seq(length(da.ma[,1])), da.ma) ##adding the id col
library(tibble)
data <- as_tibble(da.ma)

in.anova <- data %>%
  gather(key = "Length", value = "TTR", colnames(data[,2:23])) %>%
  convert_as_factor(id, Length)
 

# Get the unique Length values (as character, not factor)
levels <- unique(as.character(in.anova$Length))

# Order them by the number that follows the leading "k"
levels <- levels[order(as.numeric(substr(levels, 2, 4)))]

# Rebuild Length as a factor with that predefined level order
in.anova$Length <- factor(in.anova$Length, levels = levels)

ggboxplot(in.anova, x = "Length", y = "TTR", add = "point")
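
The substr() call above relies on the prefix being exactly one character long. A variant sketch, assuming only that each name is some non-digit prefix followed by a number, strips the non-digits with a regular expression instead. The vector x is made-up example data.

```r
# Hypothetical example values, deliberately in string-sorted order
x <- c("k1", "k10", "k11", "k2", "k22", "k3")

lev <- unique(x)

# Remove every non-digit character, then sort by the remaining number
num <- as.numeric(gsub("[^0-9]", "", lev))
lev <- lev[order(num)]

f <- factor(x, levels = lev)
levels(f)
```

This would also work for names such as "len10", where substr(levels, 2, 4) would no longer pick out the number.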

Your X-axis is in order; however, it is in alphabetical order. If you run 'k2' > 'k11' in your console, you will see what I mean.
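
A short sketch of that comparison, plus one way to recover the numeric order. The labels vector is made up for illustration, and the sub() pattern assumes every label starts with a single "k".

```r
# Strings are compared character by character, so "2" sorts after "1"
"k2" > "k11"

labels <- paste0("k", 1:12)
sort(labels)   # string order: k1, k10, k11, k12, k2, ...

# Comparing the numeric part after "k" gives the intended order
nums <- as.numeric(sub("^k", "", labels))
labels[order(nums)]
```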

Next, to your reproducible example.

Sample matrix

  • I would avoid dots in variable names, because such names look like base functions, which is confusing;
  • I would advise putting a space between the variable name and the assignment operator; it is more readable;
  • As the data argument you provide a vector of length 22000 (1:22000), while telling matrix() that you want 10 rows and 22 columns. Since 10 x 22 = 220, only the first 220 of the 22000 values are used and the rest are ignored;
  • You can use set.seed() together with, e.g., sample() to generate reproducible random data.

Finally, your sample matrix generation would look like this:

set.seed(67600941)

mtrx <- matrix(sample(220), 10)
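
As a quick check on the points above, the sketch below confirms the shape of the new matrix and shows that the original call really does keep only the first 220 values. The name old is introduced here for illustration. The suppressWarnings() wrapper is an assumption on my side: some R versions warn when the data length does not match the matrix size.

```r
set.seed(67600941)
mtrx <- matrix(sample(220), 10)   # ncol is inferred: 220 / 10 = 22

dim(mtrx)

# Every value from 1 to 220 appears exactly once
all(sort(mtrx) == 1:220)

# The original call keeps only the first 220 of the 22000 values
old <- suppressWarnings(matrix(1:22000, 10, 22))
range(old)
```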

Row names, column names

  • In most cases you don't need a loop in R, because most R functions are vectorized;
  • You don't need to save the names in a separate variable; you can assign them directly;
  • To keep the columns in order, I would use leading zeros, which are easy to produce with sprintf();
  • You do not use the row names later on, but I'll leave them in.

Final code:

rownames(mtrx) <- sprintf('text%02d', seq(nrow(mtrx)))
colnames(mtrx) <- sprintf('k%02d',    seq(ncol(mtrx)))

mtrx[1:5, 1:5]

#        k01 k02 k03 k04 k05
# text01 206 127   9   1 138
# text02 191  46 220  59  73
# text03 145  15 148 213 103
# text04  80 115 211  62  79
# text05  28  11 195 136  84
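
To illustrate why the leading zeros matter, here is a minimal comparison of padded and unpadded names, each built with a single vectorized call. The variable names n, plain and padded are mine, not from the question.

```r
n <- 22

# One vectorized call builds all the names; no loop is needed
plain  <- paste0("k", seq_len(n))        # k1, k2, ..., k22
padded <- sprintf("k%02d", seq_len(n))   # k01, k02, ..., k22

# Padded names are already in numeric order when sorted as strings
identical(sort(padded), padded)   # TRUE
identical(sort(plain), plain)     # FALSE
```

Note that %02d pads to two digits only, so with 100 or more columns the format would need to be widened, e.g. to %03d.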

Data preprocessing and plotting

  • You can use rowid_to_column() from tibble to create your id column during preprocessing;
  • I would use pivot_longer() instead of gather(), since gather() has been superseded;
  • You don't need to care about the levels, since the leading zeros put Length in the right alphabetical order;
  • I didn't save the transformed data in an intermediate variable, just to save space.

library(tidyverse)

mtrx %>%
    as_tibble() %>%
    rowid_to_column('id') %>%
    pivot_longer(-id, names_to = 'Length', values_to = 'TTR') %>%
    mutate(Length = factor(Length)) %>%
    ggplot(aes(x = Length, y = TTR)) +
      geom_jitter() +
      geom_boxplot(fill = NA) +
      ggthemes::theme_few()
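
If the axis labels have to stay unpadded (k1, k2, ..., k22), a possible alternative is to set the levels in order of first appearance. This is only a sketch: it assumes the tibble, tidyr and dplyr packages are installed, the matrix m is hypothetical, and it relies on pivot_longer() emitting the names in column order for each row.

```r
library(tibble)
library(tidyr)
library(dplyr)

# Hypothetical small matrix with unpadded column names k1 ... k12
m <- matrix(1:36, nrow = 3)
colnames(m) <- paste0("k", seq_len(ncol(m)))

long <- m %>%
    as_tibble() %>%
    rowid_to_column("id") %>%
    pivot_longer(-id, names_to = "Length", values_to = "TTR") %>%
    # unique() keeps first-appearance order, which matches the column order
    mutate(Length = factor(Length, levels = unique(Length)))

levels(long$Length)
```

The resulting long table can be passed to ggplot() or ggboxplot() exactly as before; the axis then follows the level order.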

