Create intervals based on condition for annotate ggplot2

I have a vector called ranks with values from -6 to 6, and I want to count how many values fall into value intervals such as (2, Inf), [2, 1.25), etc. Each count should be cumulative: for a given interval, I want the number of values in that interval plus the number in the previous intervals.

To do this I used a very primitive approach:

   xmin <- c(0, sum(ranks>2),
            sum(ranks>2) + sum(ranks>1.25),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.75),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.5) + sum(ranks>0.25)) 
   xmax <- c(c(sum(ranks>2),
            sum(ranks>2) + sum(ranks>1.25),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.75),
            sum(ranks>2) + sum(ranks>1.25) + sum(ranks>0.5) + sum(ranks>0.25))-1,
            length(ranks))

Here xmin is the start of each interval and xmax is the end of each interval. But I believe there is a much more straightforward way to do it.

Overall, I'd like to split the values into the intervals delimited by: +Inf, 2, 1, 0.5, 0, -0.5, -1, -2, -Inf

PS: I'll be using these to annotate the x axis in ggplot2 (see the color scale from red to blue in the plot; those are rectangles with specific x and y delimiters).

[Image: plot with a red-to-blue color scale along the x axis]

Try this:

library(magrittr)  # provides the %>% pipe

c(1, 1, 2, 3, 2, 1, 4, 2, 5, 6, 2, 5, 3) %>% 
  cut(c(0, 2, 4, 6)) %>% 
  table() %>%
  cumsum()
#> (0,2] (2,4] (4,6] 
#>     7    10    13 

UPD: just noticed that you're arranging your intervals in reverse. I think the easiest way would be to convert the table to numeric and reverse it with rev() before applying cumsum().
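A minimal sketch of that reversal, using the same toy vector as above rather than the real ranks. The xmin/xmax layout at the end is an assumption about how the rectangles should line up (each interval starting where the previous one ends), not something confirmed by the question.

```r
# Toy vector from the snippet above (not the asker's real data).
x <- c(1, 1, 2, 3, 2, 1, 4, 2, 5, 6, 2, 5, 3)

# Count values per interval; table() lists the bins in ascending order.
counts <- table(cut(x, breaks = c(0, 2, 4, 6)))

# Convert to a plain named numeric vector, reverse it so the highest
# interval comes first, then accumulate.
n <- setNames(as.numeric(counts), names(counts))
cum_counts <- cumsum(rev(n))
print(cum_counts)

# Assumed rectangle layout: each interval starts where the previous one ends.
xmin <- c(0, head(unname(cum_counts), -1))
xmax <- unname(cum_counts)
```

With the real data, the breaks would be replaced by the cut points from the question, e.g. c(-Inf, -2, -1, -0.5, 0, 0.5, 1, 2, Inf).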

I might still have misunderstood, but in the end it's just a matter of counting how many genes are in each interval, correct?

Basically you're re-inventing a stacked bar.

library(ggplot2)

# random data
set.seed(1)
rank <- sample(-6:6, 11000, replace = TRUE)
# vector of your cuts
my_cuts <- c(-Inf, -2, -1, -0.5, 0, 0.5, 1, 2, Inf)
## make a data frame and cut the ranks
genes <-  data.frame(rank)
genes$cuts <- cut(genes$rank, my_cuts)

## just use geom_bar
ggplot(genes) +
  geom_bar(aes(y = 1, fill = cuts)) +
  ## now you can simply use one of the scale functions
  scale_fill_brewer(palette =  "Reds")

Created on 2022-05-31 by the reprex package (v2.0.1)
