R find intervals in data.table
I want to add a new column with intervals or breakpoints by group. As an example:
This is my data.table:
library(data.table)
x <- data.table(a = c(1:8,1:8), b = c(rep("A",8),rep("B",8)))
I already have the breakpoints as row indices:
pos <- data.table(b = c("A","A","B","B"), bp = c(3,5,2,4))
Here I can find the interval for group "A" with:
findInterval(1:nrow(x[b=="A"]), pos[b=="A"]$bp)
How can I do this for each group, in this case "A" and "B"?
One option is to split the datasets by the 'b' column, use Map to loop over the corresponding lists, and apply findInterval:
Map(function(u, v) findInterval(seq_len(nrow(u)), v$bp),
split(x, x$b), split(pos, pos$b))
#$A
#[1] 0 0 1 1 2 2 2 2
#$B
#[1] 0 1 1 2 2 2 2 2
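The Map call above returns a list per group rather than a new column. A minimal sketch of one way to write the result back onto 'x', assuming both tables contain the same groups (so the two splits line up by position) and using base R's unsplit to restore the original row order. The column name intvl is only illustrative:

```r
library(data.table)

x   <- data.table(a = c(1:8, 1:8), b = rep(c("A", "B"), each = 8))
pos <- data.table(b = c("A", "A", "B", "B"), bp = c(3, 5, 2, 4))

# Row positions of x and breakpoints of pos, each split by group.
# Assumption: both splits yield the same groups in the same order.
rows_by_group <- split(seq_len(nrow(x)), x$b)
bp_by_group   <- split(pos$bp, pos$b)

# findInterval needs sorted breakpoints, so sort defensively.
res <- Map(function(rows, bp) findInterval(seq_along(rows), sort(bp)),
           rows_by_group, bp_by_group)

# unsplit() puts the per-group results back in the row order of x.
x[, intvl := unsplit(res, x$b)]
```

Because unsplit works from the grouping vector, this should also hold when the rows of 'x' are not sorted by 'b'.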
Another option is to group 'x' by 'b', then apply findInterval to the 'bp' values from 'pos', subset with a logical condition based on .BY:
x[, findInterval(seq_len(.N), pos$bp[pos$b==.BY]), b]
# b V1
# 1: A 0
# 2: A 0
# 3: A 1
# 4: A 1
# 5: A 2
# 6: A 2
# 7: A 2
# 8: A 2
# 9: B 0
#10: B 1
#11: B 1
#12: B 2
#13: B 2
#14: B 2
#15: B 2
#16: B 2
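The grouped call above returns a new table with an unnamed V1 column. Since the question asks for a new column, here is a small sketch of the same idea with `:=`, which adds the column to 'x' by reference. The name intvl is only an illustration, and .BY$b is used to pull the current group value out of the .BY list explicitly:

```r
library(data.table)

x   <- data.table(a = c(1:8, 1:8), b = rep(c("A", "B"), each = 8))
pos <- data.table(b = c("A", "A", "B", "B"), bp = c(3, 5, 2, 4))

# For each group, look up that group's breakpoints in pos and
# classify the within-group row index 1..N against them.
# sort() guards against unsorted breakpoints, which findInterval rejects.
x[, intvl := findInterval(seq_len(.N), sort(pos$bp[pos$b == .BY$b])), by = b]
```

A group with no breakpoints in 'pos' would get 0 for every row under this sketch, since findInterval is then called with an empty breakpoint vector.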
Another option uses a rolling join in data.table:
pos[, ri := rowid(b)]
x[, intvl := fcoalesce(pos[x, on=.(b, bp=a), roll=Inf, ri], 0L)]
Output:
a b intvl
1: 1 A 0
2: 2 A 0
3: 3 A 1
4: 4 A 1
5: 5 A 2
6: 6 A 2
7: 7 A 2
8: 8 A 2
9: 1 B 0
10: 2 B 1
11: 3 B 1
12: 4 B 2
13: 5 B 2
14: 6 B 2
15: 7 B 2
16: 8 B 2
We can nest the pos data into a list column by b, join it with x, and use findInterval to get the corresponding groups.
library(dplyr)
pos %>%
tidyr::nest(data = bp) %>%
right_join(x, by = 'b') %>%
group_by(b) %>%
mutate(interval = findInterval(a, data[[1]][[1]])) %>%
select(-data)
# b a interval
# <chr> <int> <int>
# 1 A 1 0
# 2 A 2 0
# 3 A 3 1
# 4 A 4 1
# 5 A 5 2
# 6 A 6 2
# 7 A 7 2
# 8 A 8 2
# 9 B 1 0
#10 B 2 1
#11 B 3 1
#12 B 4 2
#13 B 5 2
#14 B 6 2
#15 B 7 2
#16 B 8 2