R: how to stop ggplot from reordering both the x-axis and the y-axis

I am trying to plot my values using ggplot, but ggplot keeps reordering both of my axes. Below is a snapshot of what my input file looks like. I have more than 50 samples.

INPUT.txt

 Sample        Effect                Gene
TCGA-D1-A17D   stop gained           ACE
TCGA-B5-A0K4   stop gain             CBLC    
TCGA-AP-A052   frameshift variant    BRIP1

Here is my R code to create the "heatmap":

library(reshape)
library(ggplot2)
all_data <- read.table("INPUT.txt", sep = "\t", header = TRUE)
all_data.m <- melt(all_data)

# Here's my attempt to sort the figure, but I can only sort along one axis

all_data.m$Gene <- factor(all_data.m$Gene, levels = all_data.m$Gene[order(all_data.m$Sample)])

cbPalette <- c("violetred", "yellowgreen", "dodgerblue3", "lightcyan4", "cyan2")
p <- ggplot(all_data.m, aes( x=Sample , y= Gene)) + geom_tile(aes(Sample, fill = Effect))+ scale_fill_manual(values=cbPalette)
p <- p + theme(axis.text.x  = element_text(angle=90, vjust=0.5, size=65, face = "bold"), axis.text.y  = element_text(size=65, face = "bold" ))
p <- p + theme(axis.ticks = element_line(size = 1))
p <- p + theme(axis.line = element_line(size = 5))
p <-  p+ theme(legend.text = element_text(size = 80, face = "bold"))
p <-  p+ theme(legend.key.size = unit(5, "cm"))
p <- p + theme(axis.title=element_text(size=80,face="bold"))
print(p)

How can I create a figure that follows my input file without reordering either axis?

So my x-axis needs to be TCGA-D1-A17D, TCGA-B5-A0K4, TCGA-AP-A052, in that order.

And my y-axis needs to be ACE, CBLC, BRIP1.

It looks like you want the levels of your factors to be in the order they appear in the dataset. You can set the level order by using the unique values of the variable in the dataset.

Example:

all_data.m$Gene = factor(all_data.m$Gene, levels = unique(all_data.m$Gene))

The new levels:

Levels: ACE CBLC BRIP1
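As a rough, self-contained sketch of the idea above: the toy data frame below is an assumption that simply mirrors the three rows shown in INPUT.txt. It compares the default (alphabetical) levels with levels taken from unique(), applied to both columns so that both axes keep the file order.

```r
# Toy stand-in for INPUT.txt (illustrative data, not the real file).
all_data <- data.frame(
  Sample = c("TCGA-D1-A17D", "TCGA-B5-A0K4", "TCGA-AP-A052"),
  Effect = c("stop gained", "stop gain", "frameshift variant"),
  Gene   = c("ACE", "CBLC", "BRIP1"),
  stringsAsFactors = FALSE
)

# Default factor(): levels are sorted, which is the order ggplot then uses.
default_levels <- levels(factor(all_data$Gene))

# unique() keeps first-appearance order, so the levels follow the file.
all_data$Gene   <- factor(all_data$Gene,   levels = unique(all_data$Gene))
all_data$Sample <- factor(all_data$Sample, levels = unique(all_data$Sample))

print(default_levels)
print(levels(all_data$Gene))
print(levels(all_data$Sample))
```

Setting the levels on both Sample and Gene is what addresses the "only one axis" problem from the question.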

The forcats package makes such work even easier. It is designed to make working with factors more straightforward, including the very common task of changing the order of the levels for plotting.

To put the levels in the order they appear in the dataset, use fct_inorder.

library(forcats)
all_data$Sample = fct_inorder(all_data$Sample)

Levels: TCGA-D1-A17D TCGA-B5-A0K4 TCGA-AP-A052

The axes of your plot will then follow the order of the factor levels.
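A minimal end-to-end sketch, assuming the ggplot2 and forcats packages are installed and using a toy data frame in place of the real INPUT.txt. The simplified theme is also an assumption; the point is only that both axes follow the levels set by fct_inorder.

```r
library(ggplot2)
library(forcats)

# Toy stand-in for INPUT.txt (illustrative data, not the real file).
all_data <- data.frame(
  Sample = c("TCGA-D1-A17D", "TCGA-B5-A0K4", "TCGA-AP-A052"),
  Effect = c("stop gained", "stop gain", "frameshift variant"),
  Gene   = c("ACE", "CBLC", "BRIP1"),
  stringsAsFactors = FALSE
)

# Fix the level order of BOTH axis variables to first-appearance order.
all_data$Sample <- fct_inorder(all_data$Sample)
all_data$Gene   <- fct_inorder(all_data$Gene)

# Discrete axes are drawn in level order, so no extra scale calls are needed.
p <- ggplot(all_data, aes(x = Sample, y = Gene, fill = Effect)) +
  geom_tile() +
  theme(axis.text.x = element_text(angle = 90, vjust = 0.5))
print(p)
```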

Note that the y-axis starts with the first level at the lower left corner and then plots the remaining levels in order going up. If you want the first level, ACE, at the top left corner instead, you could use fct_rev(fct_inorder(all_data.m$Gene)). The alternative fct_inorder(rev(all_data.m$Gene)) returns the values themselves in reversed order and is not guaranteed to give the same levels when genes repeat, so do not assign its result back to the column.
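The difference between the two reversing idioms can be sketched on a small vector. The vector below is an assumption, with ACE deliberately repeated to imitate a gene that occurs in several samples; forcats is assumed to be installed.

```r
library(forcats)

# Hypothetical gene column in which ACE occurs twice.
gene <- c("ACE", "CBLC", "BRIP1", "ACE")

# Reverse the LEVELS only: values stay aligned with their rows.
top_down <- fct_rev(fct_inorder(gene))
print(levels(top_down))
print(as.character(top_down))

# Reverse the INPUT first: the values come back reversed as well,
# and with repeats the level order can differ from top_down.
reversed_input <- fct_inorder(rev(gene))
print(levels(reversed_input))
print(as.character(reversed_input))
```

Because top_down keeps the values in their original positions, it is the form that can safely be assigned back to a data frame column.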

If you want to set the order of the x-axis manually, specify the levels in the order you want:

all_data.m$Sample <- factor(all_data.m$Sample, levels = c("TCGA-D1-A17D", "TCGA-B5-A0K4", "TCGA-AP-A052"))

If you can get the order you want by sorting, you could use:

all_data.m$Gene <- factor(all_data.m$Gene, levels = sort(unique(all_data$Gene)))

If you want the reverse order, wrap rev() around the sort() call.
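A small sketch of the sorted and reverse-sorted variants in base R. The gene vector is made up for illustration, and unique() is added here on the assumption that genes repeat across samples, since current R versions reject duplicated factor levels.

```r
# Hypothetical gene column with a repeated entry.
gene <- c("CBLC", "ACE", "BRIP1", "ACE")

# Ascending order: sort the distinct values and use them as levels.
asc <- factor(gene, levels = sort(unique(gene)))

# Descending order: wrap rev() around the sort() call.
desc <- factor(gene, levels = rev(sort(unique(gene))))

print(levels(asc))
print(levels(desc))
```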

Since you are working with strings, you might also want to start your script with options(stringsAsFactors = FALSE) to avoid non-intuitive R behavior.
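As one possible alternative to the global option, stringsAsFactors can also be passed to read.table directly. The sketch below reads an inline copy of the three sample rows (an assumption standing in for INPUT.txt) so that the columns arrive as plain character vectors and the factor conversion is done explicitly.

```r
# Inline, tab-separated stand-in for INPUT.txt (illustrative data).
input_text <- paste(
  "Sample\tEffect\tGene",
  "TCGA-D1-A17D\tstop gained\tACE",
  "TCGA-B5-A0K4\tstop gain\tCBLC",
  "TCGA-AP-A052\tframeshift variant\tBRIP1",
  sep = "\n"
)

# Keep strings as character so no alphabetical factor is created on import.
all_data <- read.table(text = input_text, sep = "\t", header = TRUE,
                       stringsAsFactors = FALSE)
str(all_data)

# Convert explicitly, choosing the level order yourself.
all_data$Gene <- factor(all_data$Gene, levels = unique(all_data$Gene))
print(levels(all_data$Gene))
```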
