Summarise counts based on multiple conditions

Question

I am trying to get a summary of my data based on combinations of two variables. The following code used to work on the data:

df <- data_frame(fc = runif(1000, -5, 5),
           padj = runif(1000, 0, 1))

df %>% 
  summarise(
    dn_red = count(fc < -1.5, padj <= 0.1),
    dn_pink = count(fc < -1.5, padj >= 0.1),
    dn_blue = count(fc>-1.5 & fc< 0, padj <= 0.1),
    dn_grey = count(fc>-1.5 & fc< 0, padj >= 0.1),
    up_red = count(fc > 1.5, padj <= 0.1),
    up_pink = count(fc > 1.5, padj >= 0.1),
    up_blue = count(fc < 1.5 & fc > 0, padj <= 0.1),
    up_grey = count(fc < 1.5 & fc > 0, padj >= 0.1)
  )

Running it a couple of months after writing it throws the following error:

Error: Problem with `summarise()` input `dn_red`.
x no applicable method for 'count' applied to an object of class "logical"
ℹ Input `dn_red` is `count(fc < -1.5, padj <= 0.1)`.

I can see that count outputs a tibble with logical vectors corresponding to the conditions. What I want is a summary of the counts where both conditions are TRUE. The code above used to do just that...

Answer 1

You perhaps want sum instead of count!

library(dplyr)

set.seed(1)
df <- data.frame(fc = runif(1000, -5, 5),
                 padj = runif(1000, 0, 1))

df %>% 
  summarise(
    dn_red = sum(fc < -1.5, padj <= 0.1),
    dn_pink = sum(fc < -1.5, padj >= 0.1),
    dn_blue = sum(fc > -1.5 & fc < 0, padj <= 0.1),
    dn_grey = sum(fc > -1.5 & fc < 0, padj >= 0.1),
    up_red = sum(fc > 1.5, padj <= 0.1),
    up_pink = sum(fc > 1.5, padj >= 0.1),
    up_blue = sum(fc < 1.5 & fc > 0, padj <= 0.1),
    up_grey = sum(fc < 1.5 & fc > 0, padj >= 0.1)
  )

  dn_red dn_pink dn_blue dn_grey up_red up_pink up_blue up_grey
1    494    1250     269    1025    458    1214     267    1023

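But this creates overlaps: sum() with several arguments adds up the TRUE values of each vector separately, so rows are counted more than once (dn_pink alone is 1250, more than the 1000 rows in the data). So you need to replace the , inside each logical condition with either & or |, as the case may be: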

df %>% 
  summarise(
    dn_red = sum(fc < -1.5 & padj <= 0.1),
    dn_pink = sum(fc < -1.5 & padj >= 0.1),
    dn_blue = sum(fc > -1.5 & fc < 0 & padj <= 0.1),
    dn_grey = sum(fc > -1.5 & fc < 0 & padj >= 0.1),
    up_red = sum(fc > 1.5 & padj <= 0.1),
    up_pink = sum(fc > 1.5 & padj >= 0.1),
    up_blue = sum(fc < 1.5 & fc > 0 & padj <= 0.1),
    up_grey = sum(fc < 1.5 & fc > 0 & padj >= 0.1)
  )

  dn_red dn_pink dn_blue dn_grey up_red up_pink up_blue up_grey
1     44     328      20     127     40     296      18     127
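
To make the difference between the comma and & concrete, here is a minimal base R sketch. The vectors a and b are made-up stand-ins for the two conditions, not part of the original data:

```r
# Toy logical vectors standing in for the two conditions (made-up values)
a <- c(TRUE, TRUE, FALSE, FALSE)
b <- c(TRUE, FALSE, TRUE, FALSE)

# sum() with several arguments pools them: TRUEs in a plus TRUEs in b
pooled <- sum(a, b)

# Element-wise AND first, then sum: rows where both conditions hold
both <- sum(a & b)

# Element-wise OR first, then sum: rows where at least one condition holds
either <- sum(a | b)

print(c(pooled = pooled, both = both, either = either))
```

Here pooled is 4, both is 1 and either is 3, which is why the comma version above returns totals larger than the number of rows.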

If this is what you expected, then it is better to assign each of the 1000 data points to one of the eight colours and count the groups. Use this code instead:

df %>% mutate(new = case_when(
  fc < -1.5 & padj <= 0.1 ~ 'dn_red',
  fc < -1.5 & padj >= 0.1 ~ 'dn_pink',
  fc > -1.5 & fc < 0 & padj <= 0.1 ~ 'dn_blue',
  fc > -1.5 & fc < 0 & padj >= 0.1 ~ 'dn_grey',
  fc > 1.5 & padj <= 0.1 ~ 'up_red',
  fc > 1.5 & padj >= 0.1 ~ 'up_pink',
  fc < 1.5 & fc > 0 & padj <= 0.1 ~ 'up_blue',
  fc < 1.5 & fc > 0 & padj >= 0.1 ~ 'up_grey',
  TRUE ~ 'others'
)) %>% count(new)

      new   n
1 dn_blue  20
2 dn_grey 127
3 dn_pink 328
4  dn_red  44
5 up_blue  18
6 up_grey 127
7 up_pink 296
8  up_red  40

Or, better, use janitor to get a frequency table with proportions and a total row:

df %>% mutate(new = case_when(
  fc < -1.5 & padj <= 0.1 ~ 'dn_red',
  fc < -1.5 & padj >= 0.1 ~ 'dn_pink',
  fc > -1.5 & fc < 0 & padj <= 0.1 ~ 'dn_blue',
  fc > -1.5 & fc < 0 & padj >= 0.1 ~ 'dn_grey',
  fc > 1.5 & padj <= 0.1 ~ 'up_red',
  fc > 1.5 & padj >= 0.1 ~ 'up_pink',
  fc < 1.5 & fc > 0 & padj <= 0.1 ~ 'up_blue',
  fc < 1.5 & fc > 0 & padj >= 0.1 ~ 'up_grey',
  TRUE ~ 'others'
)) %>% janitor::tabyl(new) %>%
  janitor::adorn_totals()

     new    n percent
 dn_blue   20   0.020
 dn_grey  127   0.127
 dn_pink  328   0.328
  dn_red   44   0.044
 up_blue   18   0.018
 up_grey  127   0.127
 up_pink  296   0.296
  up_red   40   0.040
   Total 1000   1.000
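
One thing that may be worth checking on real data: the rules use strict < and > on fc, so a value exactly equal to -1.5, 0 or 1.5 matches none of them and falls through to 'others', while a padj of exactly 0.1 satisfies both <= 0.1 and >= 0.1 and takes whichever rule comes first. With runif() such exact values essentially never occur, which is presumably why no 'others' row shows up in the output. A minimal sketch with hand-picked values (the edge data frame is hypothetical, and only a subset of the rules is repeated for brevity):

```r
library(dplyr)

# Hand-picked boundary cases (hypothetical values, not from the original data)
edge <- tibble(
  fc   = c(-1.5, -2.0, 0.5),
  padj = c(0.05, 0.10, 0.10)
)

# Subset of the rules from the answer; case_when() uses the first match
labelled <- edge %>% mutate(new = case_when(
  fc < -1.5 & padj <= 0.1 ~ 'dn_red',
  fc < -1.5 & padj >= 0.1 ~ 'dn_pink',
  fc > -1.5 & fc < 0 & padj <= 0.1 ~ 'dn_blue',
  fc < 1.5 & fc > 0 & padj <= 0.1 ~ 'up_blue',
  TRUE ~ 'others'
))

print(labelled)
```

The first row (fc exactly -1.5) ends up in 'others', and the second row (padj exactly 0.1) is labelled 'dn_red' rather than 'dn_pink' because that rule is listed first. If boundary values matter, making one side of each comparison inclusive (for example fc <= -1.5 and padj > 0.1) would be one way to remove the ambiguity.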

Answer 1 (accepted, 2021-04-27 10:42:04)