
R graph: label by group

The data I am working with are clustered: each group contains multiple observations. I generated a caterpillar plot and want one label per group (zipid), not one per line. My current graph and code look like this:

  text = hosp_new[,c("zipid")]
  ggplot(hosp_new, aes(x = id, y = oe, colour = zipid, shape = group)) +
  # theme(panel.grid.major = element_blank()) +
  geom_point(size=1) +
  scale_shape_manual(values = c(1, 2, 4)) +
  geom_errorbar(aes(ymin = low_ci, ymax = high_ci)) +
  geom_smooth(method = lm, se = FALSE) +
  scale_linetype_manual(values = linetype) +
  geom_segment(aes(x = start_id, xend = end_id, y = region_oe, yend = region_oe, linetype = "4", size = 1.2)) +
  geom_ribbon(aes(ymin = region_low_ci, ymax = region_high_ci), alpha=0.2, linetype = "blank") +
  geom_hline(aes(yintercept = 1, alpha = 0.2, colour = "red", size = 1), show.legend = FALSE) +
  scale_size_identity() +
  scale_x_continuous(name = "hospital id", breaks = seq(0,210, by = 10)) +
  scale_y_continuous(name = "O:E ratio", breaks = seq(0,7, by = 1)) +
  geom_text(aes(label = text), position = position_stack(vjust = 10.0), size = 2)

Caterpillar plot:

[image: caterpillar plot]

Each color represents a region. I want just one label per region, but I don't know how to remove the duplicated labels from this graph. Any ideas?

The key is to have geom_text return only one value for each zipid, rather than one per row. If we want each zipid label centered on its group, we can use the mean of id as the x-coordinate of the label. In the code below, stat_summaryh (from the ggstance package) calculates that mean id for the label's x-coordinate and returns a single label for each zipid.

library(ggplot2)
theme_set(theme_bw())
library(ggstance)

# Fake data
set.seed(300)
dat = data.frame(id=1:100, y=cumsum(rnorm(100)), 
                 zipid=rep(LETTERS[1:10], c(10, 5, 20, 8, 7, 12, 7, 10, 13,8)))

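# One label per zipid: stat_summaryh() averages id within each colour group
ggplot(dat, aes(id, y, colour=zipid)) +
  geom_segment(aes(xend=id, yend=0)) +
  stat_summaryh(fun.x=mean, aes(label=zipid, y=1.02*max(y)), geom="text") +
  guides(colour="none")  # guides(colour=FALSE) is deprecated in recent ggplot2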

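If you would rather not depend on ggstance, the same idea (one row, and therefore one label, per zipid) can be sketched with a small summary data frame passed to geom_text. This is only an illustration on the fake data above; the names labs and p are made up for the example, and the label height of 1.02*max(y) is copied from the stat_summaryh call.

```r
library(ggplot2)

# Same fake data as in the answer above
set.seed(300)
dat <- data.frame(id = 1:100, y = cumsum(rnorm(100)),
                  zipid = rep(LETTERS[1:10], c(10, 5, 20, 8, 7, 12, 7, 10, 13, 8)))

# One row per zipid: the mean id becomes the label's x position
# (labs is a hypothetical helper name, not part of the original answer)
labs <- aggregate(id ~ zipid, data = dat, FUN = mean)
labs$y <- 1.02 * max(dat$y)  # same label height as in the stat_summaryh version

p <- ggplot(dat, aes(id, y, colour = zipid)) +
  geom_segment(aes(xend = id, yend = 0)) +
  geom_text(data = labs, aes(label = zipid)) +  # x, y and colour are inherited
  theme(legend.position = "none")
p
```

Because labs has exactly one row per zipid, geom_text draws exactly one label per group, whatever the number of observations in that group.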

You could also use faceting, as mentioned by @user20650. In the code below, panel.spacing.x=unit(0,'pt') removes the space between facet panels, while expand=c(0,0.5) adds 0.5 units of padding on each side of every panel. Together, these keep the spacing between tick marks constant, even across facets.

ggplot(dat, aes(id, y, colour=zipid)) +
  geom_segment(aes(xend=id, yend=0)) +
  facet_grid(. ~ zipid, scales="free_x", space="free_x") +
  guides(colour="none") +
  theme_classic() +
  scale_x_continuous(breaks=0:nrow(dat), 
                     labels=c(rbind(seq(0,100,5),'','','',''))[1:(nrow(dat)+1)], 
                     expand=c(0,0.5)) +
  theme(panel.spacing.x = unit(0,"pt")) 

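The labels argument in the faceting code is fairly dense, so here is a small sketch of what it builds, assuming the same 0 to 100 range as the fake data. The variable names major, lab_matrix and tick_labels are made up for this illustration.

```r
# Every 5th break gets a number; the four breaks in between get empty labels.
major <- seq(0, 100, 5)                     # 0, 5, ..., 100 (21 values)
lab_matrix <- rbind(major, "", "", "", "")  # 5 x 21 character matrix
tick_labels <- c(lab_matrix)[1:101]         # read column by column, keep 101 labels
head(tick_labels, 11)
```

rbind stacks the numbers on top of four rows of empty strings, and c() flattens the matrix column by column, so each number is followed by four blanks. Trimming to 101 entries matches the 101 breaks in 0:nrow(dat), which gives a tick at every id but a printed label only at every fifth one.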

