How to get a stacked bar plot for a list of data.frames, keeping or removing duplicated rows?

Question

I have a list of data.frames that needs to be categorized by a threshold, and I want a stacked bar plot per file that shows the different categories. However, some rows in my data.frames are duplicated. I need to show these duplicated rows in one plot and remove them for another plot, because keeping or removing duplicates in different categories gives different insight into the result. Which rows are kept or removed depends on the category the stacked bar plot is built from. I am having a hard time getting the plot I want. Can anyone point me to an easy way to do this? How should I prepare the plot data?

Reproducible data:

Qualified <- list(
    hotan = data.frame( begin=c(7,13,19,25,31,37,43,49,55,67,79,103,31,49,55,67), 
                        end=  c(10,16,22,28,34,40,46,52,58,70,82,106,34,52,58,70), 
                        pos.score=c(11,19,8,2,6,14,25,10,23,28,15,17,6,10,23,28)),
    aksu = data.frame( begin=c(12,21,30,39,48,57,66,84,111,30,48,66,84), 
                       end=  c(15,24,33,42,51,60,69,87,114,33,51,69,87), 
                       pos.score=c(5,11,15,23,9,13,2,10,16,15,9,2,10)),
    korla = data.frame( begin=c(6,14,22,30,38,46,54,62,70,78,6,30,46,70), 
                        end=c(11,19,27,35,43,51,59,67,75,83,11,35,51,75), 
                        pos.score=c(9,16,12,3,20,7,11,13,14,17,9,3,7,14))
)

unQualified <- list(
    hotan = data.frame( begin=c(21,33,57,69,81,117,129,177,225,249,333,345,33,81,333), 
                        end=  c(26,38,62,74,86,122,134,182,230,254,338,350,38,86,338), 
                        pos.score=c(7,34,29,14,23,20,11,30,19,17,6,4,34,23,6)),
    aksu = data.frame( begin=c(13,23,33,43,53,63,73,93,113,123,143,153,183,33,63,143), 
                       end=  c(19,29,39,49,59,69,79,99,119,129,149,159,189,39,69,149), 
                       pos.score=c(5,13,32,28,9,11,22,12,23,3,6,8,16,32,11,6)),
    korla = data.frame( begin=c(23,34,45,56,67,78,89,122,133,144,166,188,56,89,144), 
                        end=c(31,42,53,64,75,86,97,130,141,152,174,196,64,97,152), 
                        pos.score=c(3,10,19,17,21,8,18,14,4,9,12,22,17,18,9))
)

Edit:

I categorized my data this way:

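library(dplyr)  # bind_rows(), mutate(), arrange(), %>%

# Stack all data.frames, record group and list name, and flag rows by threshold
singleDF <- 
    bind_rows(c(Qualified = Qualified, Unqualified = unQualified), .id = "id") %>% 
    tidyr::separate(id, c("group", "list")) %>%
    mutate(elm = ifelse(pos.score >= 10, "valid", "invalid")) %>% 
    arrange(list, group, desc(elm))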

res <- singleDF %>% split(list(.$list, .$elm, .$group))

This is my desired plot:

Note that for the valid and invalid categories I need duplicates removed from the data.frame, while for the Qualified and Unqualified categories I keep the repeated rows.

How can I get my desired plot with the ggplot2 package? Any ideas? Thanks in advance :)

Answer 1

Something like this, perhaps?

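library(tidyverse)
library(cowplot)
theme_set(theme_grey())  # older cowplot versions switch the default theme on load; switch it back

# Left panel: duplicates kept, one bar per group, stacked by valid/invalid
p1 <- ggplot(filter(singleDF, list == "aksu"), 
             aes(group, fill = elm)) +
  geom_bar() +
  ylim(0, 16) +
  theme(legend.position = 'top', legend.title = element_blank(), axis.title.x = element_blank())

# Right panel: duplicates removed with distinct(), one bar per valid/invalid, stacked by group
p2 <- ggplot(filter(singleDF, list == "aksu") %>% distinct(), 
             aes(elm, fill = group)) +
  geom_bar() +
  scale_fill_discrete(h.start = 90) +
  ylim(0, 16) +
  theme(legend.position = 'top', legend.title = element_blank(), axis.title.x = element_blank())

# Put both panels side by side
plot_grid(p1, p2, align = 'v', nrow = 1)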
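To see which numbers each panel will show, it can help to tabulate the counts before plotting. The following is a minimal sketch on a small made-up data frame `toy` (hypothetical values, not the question's data) with the same columns as `singleDF`; `kept` and `deduped` are names chosen here for illustration.

```r
library(dplyr)

# Hypothetical toy data with the same columns as singleDF.
# Rows 5 and 6 repeat rows 1 and 4 exactly.
toy <- data.frame(
  group     = c("Qualified", "Qualified", "Unqualified",
                "Unqualified", "Qualified", "Unqualified"),
  list      = "aksu",
  begin     = c(12, 21, 13, 23, 12, 23),
  end       = c(15, 24, 19, 29, 15, 29),
  pos.score = c(5, 11, 5, 13, 5, 13)
) %>%
  mutate(elm = ifelse(pos.score >= 10, "valid", "invalid"))

# Left panel: duplicates kept, bars per group, stacked by elm.
kept <- count(toy, group, elm)

# Right panel: duplicates removed, bars per elm, stacked by group.
deduped <- toy %>% distinct() %>% count(elm, group)

print(kept)
print(deduped)
```

Called without arguments, `distinct()` compares all columns, including `group`, so an interval that appears once in Qualified and once in Unqualified is not treated as a duplicate.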

Answer 2

If you want to do this for each element of a list, you can use the tidyverse packages and wrap @Axeman's answer in a function. I modified @Axeman's code to get the appearance you want; I don't use cowplot, so I substituted gridExtra.

EDIT: An easy fix to get your desired plot is to grid.arrange the results of the map in a single row. I also tweaked the plot to align more closely with your desired output. I used geom_label to get the counts, with stat = "count" and the ..count.. special variable. You can swap it for geom_text if you wish.

library(tidyverse)
library(grid) #for grid.draw
library(gridExtra) #for grid.arrange

split_plot <- function(x) {

  p1 <- ggplot(x, aes(x = group)) +
    geom_bar(aes(fill = elm), color = "black") +
    geom_label(aes(label = ..count.., color = elm), stat = "count", position = position_stack()) +
    ylim(0, 16) +
    labs(y = NULL, x = NULL) +
    theme_minimal() +
    theme(legend.position = 'none',
          panel.grid = element_blank(),
          legend.title = element_blank(),
          axis.ticks.y = element_blank(),
          axis.text.y = element_blank())

  p2 <- ggplot(distinct(x), aes(x = elm)) +
    geom_bar(aes(fill = group), color = "black") +
    geom_label(aes(label = ..count.., color = group), stat = "count", position = position_stack()) +
    scale_fill_discrete(h.start = 90) +
    scale_color_discrete(h.start = 90) +
    labs(y = NULL, x = NULL) +
    ylim(0, 16) +
    theme_minimal() +
    theme(legend.position = 'none',
          panel.grid = element_blank(),
          legend.title = element_blank(),
          axis.ticks.y = element_blank(),
          axis.text.y = element_blank())

  # Return both panels as one grob, titled with the list name
  arrangeGrob(p1, p2, nrow = 1, top = unique(x$list))
}

# Call the function over `singleDF`, split by list and plot each

res <- singleDF %>% 
  split(.$list) %>% 
  map(~split_plot(.x))

# Use grid.arrange to draw the grobs
grid.arrange(grobs = res, nrow = 1)
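In ggplot2 3.4.0 and later the `..count..` notation is deprecated in favour of `after_stat(count)`. The sketch below shows the same label layer written that way on a made-up data frame `toy` (hypothetical, not the question's data), and uses `layer_data()` to inspect the computed labels without drawing the plot.

```r
library(ggplot2)

# Hypothetical toy data, only to demonstrate the label layer.
toy <- data.frame(
  group = c("Qualified", "Qualified", "Qualified", "Unqualified", "Unqualified"),
  elm   = c("valid", "valid", "invalid", "valid", "invalid")
)

p <- ggplot(toy, aes(x = group)) +
  geom_bar(aes(fill = elm), color = "black") +
  # after_stat(count) replaces ..count.. in newer ggplot2 versions
  geom_label(aes(label = after_stat(count), color = elm),
             stat = "count", position = position_stack())

# Layer 2 is the geom_label layer; its data holds one row per stacked segment.
labels <- layer_data(p, 2)
print(labels[, c("x", "count", "label")])
```

Checking the layer data this way is a quick sanity test that the labels match the bar heights before arranging several panels.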


Answer 1
3 2017-01-02 18:15:43

Answer 2
2 accepted 2017-01-02 19:08:37
