How can I get a stacked bar plot for a list of data.frames while keeping or removing duplicated rows?
I have a list of data.frames that needs to be categorized by a threshold, and I want a stacked bar plot of the different categories for each file. However, some rows in my data.frame list are duplicated. I need to show these duplicated rows in one plot and remove them for display in another plot, because keeping or removing the duplicated rows in different categories can give different insight into the result. Which rows are kept or removed depends on the category named in the stacked bar plot.
I am having a hard time getting the plot I want. Can anyone point me to an easy way to make this happen? How can I prepare the plot data to get the desired plot?
Reproducible data.frame:
Qualified <- list(
hotan = data.frame( begin=c(7,13,19,25,31,37,43,49,55,67,79,103,31,49,55,67),
end= c(10,16,22,28,34,40,46,52,58,70,82,106,34,52,58,70),
pos.score=c(11,19,8,2,6,14,25,10,23,28,15,17,6,10,23,28)),
aksu = data.frame( begin=c(12,21,30,39,48,57,66,84,111,30,48,66,84),
end= c(15,24,33,42,51,60,69,87,114,33,51,69,87),
pos.score=c(5,11,15,23,9,13,2,10,16,15,9,2,10)),
korla = data.frame( begin=c(6,14,22,30,38,46,54,62,70,78,6,30,46,70),
end=c(11,19,27,35,43,51,59,67,75,83,11,35,51,75),
pos.score=c(9,16,12,3,20,7,11,13,14,17,9,3,7,14))
)
unQualified <- list(
hotan = data.frame( begin=c(21,33,57,69,81,117,129,177,225,249,333,345,33,81,333),
end= c(26,38,62,74,86,122,134,182,230,254,338,350,38,86,338),
pos.score=c(7,34,29,14,23,20,11,30,19,17,6,4,34,23,6)),
aksu = data.frame( begin=c(13,23,33,43,53,63,73,93,113,123,143,153,183,33,63,143),
end= c(19,29,39,49,59,69,79,99,119,129,149,159,189,39,69,149),
pos.score=c(5,13,32,28,9,11,22,12,23,3,6,8,16,32,11,6)),
korla = data.frame( begin=c(23,34,45,56,67,78,89,122,133,144,166,188,56,89,144),
end=c(31,42,53,64,75,86,97,130,141,152,174,196,64,97,152),
pos.score=c(3,10,19,17,21,8,18,14,4,9,12,22,17,18,9))
)
Edit:
I categorized my data this way:
library(dplyr) # bind_rows, mutate, arrange, %>%
# Combine both lists into one data.frame; ids look like "Qualified.hotan"
singleDF <-
  bind_rows(c(Qualified = Qualified, Unqualified = unQualified), .id = "id") %>%
  tidyr::separate(id, c("group", "list")) %>%
  mutate(elm = ifelse(pos.score >= 10, "valid", "invalid")) %>%
  arrange(list, group, desc(elm))
res <- singleDF %>% split(list(.$list, .$elm, .$group))
This is my desired plot:
Note that in the valid, invalid categories I need duplicates removed from each data.frame, while in the Qualified, UnQualified categories I'll keep these repeated rows.
How can I achieve my desired plot using the ggplot2 package? Any ideas? Thanks in advance :)
Something like this perhaps?
library(tidyverse)
library(cowplot)
theme_set(theme_grey())
p1 <- ggplot(filter(singleDF, list == "aksu"),
aes(group, fill = elm)) +
geom_bar() +
ylim(0, 16) +
theme(legend.position = 'top', legend.title = element_blank(), axis.title.x = element_blank())
p2 <- ggplot(filter(singleDF, list == "aksu") %>% distinct(),
aes(elm, fill = group)) +
geom_bar() +
scale_fill_discrete(h.start = 90) +
ylim(0, 16) +
theme(legend.position = 'top', legend.title = element_blank(), axis.title.x = element_blank())
plot_grid(p1, p2, align = 'v', nrow = 1)
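To check what the two panels count, the same summaries can be tabulated directly. This is only a minimal sketch on a small made-up data frame (toy is hypothetical and just mirrors the columns of singleDF). It assumes a duplicate means a row identical in every column, which is what distinct() removes when called with no column arguments.

```r
library(dplyr)

# Hypothetical toy data with the same columns as singleDF (not the real data)
toy <- data.frame(
  group = c("Qualified", "Qualified", "Qualified", "Unqualified", "Unqualified"),
  list = "aksu",
  begin = c(12, 30, 30, 13, 33),
  end = c(15, 33, 33, 19, 39),
  pos.score = c(5, 15, 15, 5, 32)
) %>%
  mutate(elm = ifelse(pos.score >= 10, "valid", "invalid"))

# Left panel: duplicated rows kept, counted per group and elm
kept <- count(toy, group, elm)

# Right panel: exact duplicate rows dropped first, then counted per elm and group
deduped <- toy %>% distinct() %>% count(elm, group)

print(kept)
print(deduped)
```

Because group is one of the columns, a row that appears in both Qualified and Unqualified would not be treated as a duplicate by this approach.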
If you want to do this for each element of a list, you can use the tidyverse packages and wrap @Axeman's answer into a function. I modified @Axeman's code to get the appearance that you wish, although I don't use cowplot, so I substituted gridExtra.
EDIT: An easy fix to get your desired plot is to grid.arrange the results of the map in a single row. I also tweaked the plot to align more closely with your desired output. I used geom_label to get the counts, with stat = "count" and the ..count.. special variable. You can switch it for geom_text if you wish.
library(tidyverse)
library(grid) #for grid.draw
library(gridExtra) #for grid.arrange
split_plot <- function(x) {
p1 <- ggplot(x, aes(x = group)) +
geom_bar(aes(fill = elm), color = "black") +
geom_label(aes(label = ..count.., color = elm), stat = "count", position = position_stack()) +
ylim(0, 16) +
labs(y = NULL, x = NULL) +
theme_minimal() +
theme(legend.position = 'none',
panel.grid = element_blank(),
legend.title = element_blank(),
axis.ticks.y = element_blank(),
axis.text.y = element_blank())
p2 <- ggplot(distinct(x), aes(x = elm)) +
geom_bar(aes(fill = group), color = "black") +
geom_label(aes(label = ..count.., color = group), stat = "count", position = position_stack()) +
scale_fill_discrete(h.start = 90) +
scale_color_discrete(h.start = 90) +
labs(y = NULL, x = NULL) +
ylim(0, 16) +
theme_minimal() +
theme(legend.position = 'none',
panel.grid = element_blank(),
legend.title = element_blank(),
axis.ticks.y = element_blank(),
axis.text.y = element_blank())
arrangeGrob(p1, p2, nrow = 1, top = unique(x$list))
}
# Call the function over `singleDF`, split by list and plot each
res <- singleDF %>%
split(.$list) %>%
map(~split_plot(.x))
# Use grid.arrange to draw the grobs
grid.arrange(grobs = res, nrow = 1)
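On ggplot2 3.4.0 and later, the ..count.. notation is deprecated in favour of after_stat(count). Below is a minimal sketch of the same label layer written that way, on a made-up data frame (toy is hypothetical, not the data from the question). The call to layer_data() is only there to inspect the computed counts without drawing anything.

```r
library(ggplot2)

# Hypothetical toy data with just the two columns the plot needs
toy <- data.frame(
  group = c("Qualified", "Qualified", "Qualified", "Unqualified", "Unqualified"),
  elm = c("valid", "valid", "invalid", "valid", "invalid")
)

p <- ggplot(toy, aes(x = group)) +
  geom_bar(aes(fill = elm), color = "black") +
  # after_stat(count) replaces ..count..; the rest of the layer is unchanged
  geom_label(aes(label = after_stat(count), color = elm),
             stat = "count", position = position_stack())

# Computed data for the second layer (the labels), one row per group/elm pair
label_data <- layer_data(p, 2)
print(label_data[, c("x", "count")])
```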