R barplot of count: Skewed data with columns with large counts and columns with zero or few counts

The following code draws a bar plot of the count of each x value, with the y-axis on a log scale.

library(ggplot2)  
library(scales)

myData <- data.frame(
  x = c(rep(1, 22500), 
        rep(2, 6000), 
        rep(3, 8000), 
        rep(4, 5000), 
        rep(11, 86), 
        rep(16, 15), 
        rep(31, 1), 
        rep(32, 1), 
        rep(47, 1))
)

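ggplot(myData, aes(x = x)) +
  geom_bar(width = 0.5) +
  # after_stat(count) replaces the deprecated ..count.. notation
  geom_text(stat = "count", aes(label = after_stat(count)), vjust = -1, size = 3) +
  scale_y_log10(breaks = trans_breaks("log10", function(x) 10^x),
                labels = trans_format("log10", math_format(10^.x))) +
  # One tick per integer from 1 to 47, including values with no data
  scale_x_continuous(breaks = 1:47)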

Below is the plot:

[image]

My questions are:

  1. How do I remove the x-axis tick marks/labels for those columns with zero count?

  2. How do I show the values of columns 31, 32, and 47 (those with a count of 1) better?

  3. How do I label the count of only the tallest column (22500 for column 1 in this case)?

One option would be to add a border color to the bars, which helps highlight that there is at least something in those parts of the graph:

library(tidyverse)

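# Count the occurrences of each x value (column n holds the count)
df <- 
  myData %>% 
  group_by(x) %>% 
  count()

df %>% 
  ggplot(aes(x = x, y = n)) +
  # The border color helps bars with very small counts stay visible
  geom_col(color = "cyan4", fill = "cyan3") +
  # Label only the column at x == 1 (the tallest one)
  geom_text(data = . %>% filter(x == 1), aes(label = n, y = n + 10000)) +
  scale_y_log10()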

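The answer above does not address the first question (hiding tick labels for x values with zero count). A minimal sketch of one possible approach, assuming the counts are precomputed with `dplyr::count()`, is to pass only the observed x values as axis breaks. The names `counts` and `p` are illustrative and not from the original post.

```r
library(ggplot2)
library(dplyr)

# Same data as in the question, built more compactly
myData <- data.frame(
  x = rep(c(1, 2, 3, 4, 11, 16, 31, 32, 47),
          times = c(22500, 6000, 8000, 5000, 86, 15, 1, 1, 1))
)

# One row per observed x value, with its count in column n
counts <- count(myData, x)

p <- ggplot(counts, aes(x = x, y = n)) +
  geom_col(color = "cyan4", fill = "cyan3") +
  # Draw ticks only at x values that occur in the data (assumed approach)
  scale_x_continuous(breaks = counts$x, minor_breaks = NULL) +
  scale_y_log10()

p
```

If even spacing between the columns matters more than the numeric distance between them, mapping `factor(x)` to the x aesthetic (with a discrete axis instead of `scale_x_continuous()`) is another option, since a discrete axis only shows levels that are present.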
