Ordering the x-axis in an R graph
I have a data.frame that looks like:
gvs order labels
1 -2.3321916 1 Adygei
2 -1.4996229 5 Basque
3 1.7958170 15 French
4 2.5543214 19 Italian
5 -2.7758460 33 Orcadian
6 -1.9659984 39 Russian
7 2.1239768 41 Sardinian
8 -1.8515908 47 Tuscan
9 -1.5597359 6 Bedouin
10 -1.2534511 14 Druze
11 -0.1625003 31 Mozabite
12 -1.0265275 35 Palestinian
13 -0.8519079 2 Balochi
14 -2.4279528 8 Brahui
15 -3.1717421 9 Burusho
16 -0.9258497 17 Hazara
17 -1.2207974 21 Kalash
18 -1.0325107 24 Makrani
19 -3.2102686 37 Pathan
20 -0.9377928 43 Sindhi
21 -1.7657017 48 Uygurf
22 -0.5058627 10 Cambodian
23 -0.7819299 12 Dai
24 -1.4095947 13 Daur
25 2.2810477 16 Han
26 -0.9007551 18 Hezhen
27 2.6614486 20 Japanese
28 -0.9441980 23 Lahu
29 -0.7237586 29 Miao
30 -0.9452944 30 Mongola
31 -1.2035258 32 Naxi
32 -0.7703779 34 Oroqen
33 -3.0895998 42 She
34 -0.7037952 45 Tu
35 -1.9311354 46 Tujia
36 -0.5423822 49 Xibo
37 -1.6244801 50 Yakut
38 -0.9049735 51 Yi
39 -2.6491331 11 Colombian
40 2.3706977 22 Karitiana
41 -2.7590587 26 Maya
42 -0.9614190 38 Pima
43 -1.6961014 44 Surui
44 -0.8449225 28 Melanesian
45 -1.1163019 36 Papuan
46 -0.9298674 3 BantuKenya
47 -2.8859587 4 BantuSouthAfrica
48 -1.4494841 7 BiakaPygmy
49 -0.7381369 25 Mandenka
50 -0.5644325 27 MbutiPygmy
51 -0.9195156 40 San
52 2.0949378 52 Yoruba
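I would like to plot the column gvs, ordered along the x-axis by the column order, and then have the labels for the points along the x-axis come from the column labels. Does anyone know how to do this? I would like the plot to look like a coloured version of the plots in Figure 5 of this paper: http://journals.plos.org/plosgenetics/article?id=10.1371/journal.pgen.1004412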
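Based on your comments, it looks like (1) labels does not correspond to gvs and order, and (2) if I sort the first two columns by order, the data frame will be correctly sorted. Let me know if this is incorrect.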
Sort the first two columns by order, leaving the third column alone:
df[,c("gvs","order")] = df[order(df$order), c("gvs","order")]
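To make the effect of that assignment concrete, here is a minimal sketch on a made-up three-row data frame. The name `toy` and its values are hypothetical and only serve as an illustration. Only `gvs` and `order` are rearranged; `labels` keeps its original row positions.

```r
# Hypothetical toy data, for illustration only (not from the question)
toy <- data.frame(gvs    = c(0.3, 0.1, 0.2),
                  order  = c(3L, 1L, 2L),
                  labels = c("A", "B", "C"))

# Reorder only the first two columns by `order`; `labels` is left untouched.
# The assignment is positional, so row names of the right-hand side are ignored.
toy[, c("gvs", "order")] <- toy[order(toy$order), c("gvs", "order")]

print(toy)
```

After this step each row pairs a label with the gvs value whose order matches the label's position, which is the assumption stated above.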
Set the order of labels based on the current ordering of labels in the sample data frame:
df$labels = factor(df$labels, levels=df$labels)
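This step matters because ggplot2 lays out a discrete x-axis in factor-level order, and `factor()` sorts its levels alphabetically unless levels are given explicitly. A small sketch of the difference, using a hypothetical vector `pop_labels`:

```r
# Hypothetical labels, deliberately not in alphabetical order
pop_labels <- c("Tuscan", "Adygei", "Mozabite")

# Default: levels are sorted alphabetically
default_levels <- levels(factor(pop_labels))
print(default_levels)

# Explicit levels: the order of appearance is preserved
kept_levels <- levels(factor(pop_labels, levels = pop_labels))
print(kept_levels)
```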
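Add a grouping variable for regions. I did this by creating a new group each time the alphabetical order of labels goes "backwards". The regions are just numbers here, but you can give them descriptive names if you want to use them: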
df$group = c(0, cumsum(diff(match(substr(df$labels,1,1), LETTERS)) < 0))
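The one-liner above is compact, so here is a step-by-step sketch of the same idea on a short hypothetical vector `pop`. The region names at the end are also made up for illustration. Note that the heuristic only works when each region's first label sorts before the previous region's last label.

```r
# Hypothetical labels: alphabetical within each region, restarting at each new region
pop <- c("Adygei", "Basque", "French", "Bedouin", "Druze", "Balochi", "Sindhi")

# Position of each label's first letter in the alphabet (A = 1, B = 2, ...)
first_letter <- match(substr(pop, 1, 1), LETTERS)

# diff() < 0 flags the places where the alphabet goes "backwards";
# cumsum() turns those flags into a running group id, starting at 0
group <- c(0, cumsum(diff(first_letter) < 0))
print(group)

# Optional: replace the numeric ids with descriptive (here hypothetical) names
region <- factor(group, labels = c("Europe", "Middle East", "Central/South Asia"))
print(data.frame(pop, group, region))
```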
Add fake p-values (since point size is based on p-values in the figure you linked to):
set.seed(595)
df$p.value = runif(nrow(df), 0, 0.5)
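Plot the data, including a different colour for each region group, point size based on p-value, and a black border around points with p < 0.05. geom_line adds the regional means: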
library(ggplot2)
library(dplyr)

ggplot(df, aes(labels, gvs, size=p.value, fill=factor(group))) +
  # Horizontal line at each region's mean gvs
  # (in ggplot2 >= 3.4.0, linewidth=0.8 is preferred over size=0.8 for lines)
  geom_line(data=df %>% group_by(group) %>% mutate(gvs=mean(gvs)),
            aes(group=group, colour=factor(group)), size=0.8, alpha=0.5) +
  # Filled points; the border colour marks p < 0.05
  geom_point(pch=21, stroke=1, aes(color=p.value<0.05)) +
  theme_bw() +
  theme(axis.text.x=element_text(angle=-90, hjust=0, vjust=0.5),
        panel.grid.major=element_blank(),
        panel.grid.minor=element_blank()) +
  # Smaller p-values get larger points
  scale_size_continuous(name="p values", limits=c(0, 0.5), breaks=seq(0,1,0.1), range=c(4,1)) +
  # Seven region colours for the lines, then NA/black borders for p >= 0.05 / p < 0.05
  scale_color_manual(values=c(hcl(seq(15,375,length.out=8),100,65)[1:7],NA,"black")) +
  labs(x="Language", fill="Region") +
  guides(colour="none",
         size=guide_legend(reverse=TRUE, override.aes=list(color=NA, fill="grey50")),
         fill=guide_legend(reverse=TRUE, override.aes=list(color=NA, size=3)))
Read in the data frame:
df <- data.frame(gvs = c(-2.3321916, -1.4996229, 1.795817, 2.5543214, -2.775846, -1.9659984,
2.1239768, -1.8515908, -1.5597359, -1.2534511, -0.1625003, -1.0265275,
-0.8519079, -2.4279528, -3.1717421, -0.9258497, -1.2207974, -1.0325107,
-3.2102686, -0.9377928, -1.7657017, -0.5058627, -0.7819299, -1.4095947,
2.2810477, -0.9007551, 2.6614486, -0.944198, -0.7237586, -0.9452944,
-1.2035258, -0.7703779, -3.0895998, -0.7037952, -1.9311354, -0.5423822,
-1.6244801, -0.9049735, -2.6491331, 2.3706977, -2.7590587, -0.961419,
-1.6961014, -0.8449225, -1.1163019, -0.9298674, -2.8859587, -1.4494841,
-0.7381369, -0.5644325, -0.9195156, 2.0949378),
order = c(1L, 5L, 15L, 19L, 33L, 39L, 41L, 47L, 6L, 14L, 31L, 35L, 2L,
8L, 9L, 17L, 21L, 24L, 37L, 43L, 48L, 10L, 12L, 13L, 16L, 18L,
20L, 23L, 29L, 30L, 32L, 34L, 42L, 45L, 46L, 49L, 50L, 51L, 11L,
22L, 26L, 38L, 44L, 28L, 36L, 3L, 4L, 7L, 25L, 27L, 40L, 52L),
labels = c("Adygei", "Basque", "French", "Italian", "Orcadian", "Russian",
"Sardinian", "Tuscan", "Bedouin", "Druze", "Mozabite", "Palestinian",
"Balochi", "Brahui", "Burusho", "Hazara", "Kalash", "Makrani",
"Pathan", "Sindhi", "Uygurf", "Cambodian", "Dai", "Daur", "Han",
"Hezhen", "Japanese", "Lahu", "Miao", "Mongola", "Naxi", "Oroqen",
"She", "Tu", "Tujia", "Xibo", "Yakut", "Yi", "Colombian", "Karitiana",
"Maya", "Pima", "Surui", "Melanesian", "Papuan", "BantuKenya",
"BantuSouthAfrica", "BiakaPygmy", "Mandenka", "MbutiPygmy", "San",
"Yoruba"))
Order the data:
df.ordered <- df[ order(df$order) , ]
And some simple (ugly) sample plotting, which you can surely improve (perhaps using ggplot):
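# xaxt = "n" suppresses the default numeric x-axis so it does not overlap the labels
plot(df.ordered$gvs, pch = 19, xaxt = "n", xlab = "", ylab = "gvs")
# Draw the x-axis by hand, with one vertical label per point
axis(1, at = 1:52, labels = df.ordered$labels, las = 2)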
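As one possible ggplot2 variant of the base plot, the sketch below uses `reorder()` to set the x-axis order directly from the `order` column, so no pre-sorted copy of the data is needed. The data frame `df_small` is a hypothetical four-row subset with rounded values, used only to keep the example short; it assumes ggplot2 is installed.

```r
library(ggplot2)

# Hypothetical small subset, for illustration only
df_small <- data.frame(gvs    = c(-2.33, -1.50, 1.80, -1.56),
                       order  = c(1L, 5L, 15L, 6L),
                       labels = c("Adygei", "Basque", "French", "Bedouin"))

# reorder() returns a factor whose levels are sorted by the values of `order`
x_levels <- levels(reorder(df_small$labels, df_small$order))
print(x_levels)

p <- ggplot(df_small, aes(x = reorder(labels, order), y = gvs)) +
  geom_point() +
  labs(x = NULL) +
  theme(axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.5))
print(p)
```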
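Another option that does not depend on the ordering of the data frame is to use the limits argument of the discrete scale (as a side benefit, it lets you do more arbitrary ordering when you draw the plot).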
library(ggplot2)

df <- read.csv('/path/to/file/df.csv')
# Labels in the order given by the `order` column
xorder <- as.character(df[order(df$order), 'labels'])
ggplot(df, aes(x=labels, y=gvs, size=gvs)) +
  geom_point() +
  scale_x_discrete(limits=xorder) +
  theme(axis.text.x=element_text(angle=90))
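To illustrate the "more arbitrary ordering" point, the sketch below passes a different character vector to `limits`, here the labels sorted by decreasing gvs. The data frame `df_small` is a hypothetical four-row subset with rounded values, and ggplot2 is assumed to be installed. As far as I know, labels left out of `limits` are dropped from the plot with a warning, so the vector should normally contain every label.

```r
library(ggplot2)

# Hypothetical small subset, for illustration only
df_small <- data.frame(gvs    = c(-2.33, -1.50, 1.80, -1.56),
                       order  = c(1L, 5L, 15L, 6L),
                       labels = c("Adygei", "Basque", "French", "Bedouin"))

# Any character vector of labels can serve as the axis order;
# here the labels are sorted by decreasing gvs instead of by `order`
by_gvs <- as.character(df_small$labels[order(df_small$gvs, decreasing = TRUE)])
print(by_gvs)

p <- ggplot(df_small, aes(x = labels, y = gvs)) +
  geom_point() +
  scale_x_discrete(limits = by_gvs) +
  theme(axis.text.x = element_text(angle = 90))
print(p)
```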