Extract values from a dataframe based on a range of values from another dataframe

I am trying to extract the index values from a dataframe (df1) whose rows each represent a time range (start to end), matching each time given in another dataframe (df2) to the range that contains it. My required output is df3.

df1<-data.frame(index=c(1,2,3,4),start=c(5,10,15,20),end=c(10,15,20,25))
df2<-data.frame(time=c(11,17,18,5,5,22))
df3<-data.frame(time=c(11,17,18,5,5,22),index=c(2,3,3,1,1,4))

Is there a tidyverse solution to this?

You can do it with base R functions. A combination of which inside sapply and a logical comparison will do the work for you.

 inds <- apply(df1[,-1], 1, function(x) seq(from=x[1], to=x[2]))
 index <- sapply(df2$time, function(x){
   tmp <- which(x == inds, arr.ind = TRUE);
   tmp[, "col"]
 } )
 df3 <- data.frame(df2, index)
 df3
  time index
1   11     2
2   17     3
3   18     3
4    5     1
5    5     1
6   22     4
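
The approach above expands every range into an integer sequence, so it only matches whole-number times. A minimal sketch of the same idea using a direct logical comparison inside sapply is shown below. The helper name lookup_index is hypothetical (it is not part of the original answer), and returning the first matching range when a time sits on a shared boundary is an assumption.

```r
df1 <- data.frame(index = c(1, 2, 3, 4), start = c(5, 10, 15, 20), end = c(10, 15, 20, 25))
df2 <- data.frame(time = c(11, 17, 18, 5, 5, 22))

# Hypothetical helper: for a single time value, return the index of the
# first range whose [start, end] contains it, or NA if no range does.
lookup_index <- function(t, intervals) {
  hit <- which(t >= intervals$start & t <= intervals$end)
  if (length(hit) == 0) NA_real_ else intervals$index[hit[1]]
}

# Apply the helper to every time in df2 and bind the result as a new column.
df3 <- data.frame(df2, index = sapply(df2$time, lookup_index, intervals = df1))
df3
```

Because the comparison uses >= and <=, this also handles non-integer times such as 11.5.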

Data:

df1<-data.frame(index=c(1,2,3,4),start=c(5,10,15,20),end=c(10,15,20,25))
df2<-data.frame(time=c(11,17,18,2,5,5,8,22))

Code:

# get index values and assign it to df2 column
df2$index <- apply( df2, 1, function(x) { with(df1, index[ x[ 'time' ]  >= start & x[ 'time' ] <= end ] ) }) 

Output:

df2
#   time index
# 1   11     2
# 2   17     3
# 3   18     3
# 4    2      
# 5    5     1
# 6    5     1
# 7    8     1
# 8   22     4

Here is one option with findInterval:

ftx <- function(x, y) findInterval(x, y)
df3 <- transform(df2, index = pmax(ftx(time, df1$start), ftx(time, df1$end)))

df3
#   time index
#1   11     2
#2   17     3
#3   18     3
#4    5     1
#5    5     1
#6   22     4
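
findInterval on its own does not check the upper bound, so a time outside every range would still receive a position. The sketch below adds that check. It assumes df1 is sorted by start and that the ranges do not leave a time in more than one range except at shared boundaries; the variable names pos and inside are illustrative only.

```r
df1 <- data.frame(index = c(1, 2, 3, 4), start = c(5, 10, 15, 20), end = c(10, 15, 20, 25))
times <- c(11, 17, 18, 2, 5, 5, 8, 22, 30)

# Position of the last start value that is <= each time (0 if below all starts).
pos <- findInterval(times, df1$start)

# A time is inside a range only if a start was found and it does not exceed
# the matching end; pmax() keeps the subscript valid where pos is 0.
inside <- pos > 0 & times <= df1$end[pmax(pos, 1)]

# Times outside every range get NA instead of a misleading index.
index <- ifelse(inside, df1$index[pmax(pos, 1)], NA)
data.frame(time = times, index = index)
```

Here 2 and 30 fall outside all ranges and come back as NA.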

Or another option is foverlaps from data.table:

library(data.table)
dfN <- data.table(index = seq_len(nrow(df2)), start = df2$time, end = df2$time)
setDT(df1)
setkey(dfN, start, end)
setkey(df1, start, end)
foverlaps(dfN, df1, which = TRUE)[, yid[match(xid, dfN$index)]]
#[1] 2 3 3 1 1 4
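
As a possible alternative to foverlaps, data.table also supports non-equi joins through the on argument, which avoids setting keys and reordering the result. This is a sketch rather than part of the original answer; the names dt1, dt2 and res are illustrative.

```r
library(data.table)

dt1 <- data.table(index = c(1, 2, 3, 4), start = c(5, 10, 15, 20), end = c(10, 15, 20, 25))
dt2 <- data.table(time = c(11, 17, 18, 5, 5, 22))

# For each row of dt2, find the rows of dt1 with start <= time and end >= time.
# The x. and i. prefixes pick columns from dt1 and dt2 respectively.
res <- dt1[dt2, on = .(start <= time, end >= time),
           .(time = i.time, index = x.index)]
res
```

The result follows the row order of dt2, so no match() step is needed afterwards.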

As the OP commented about wanting a solution with pipes, @Jilber Urbina's solution can be implemented with tidyverse functions:

library(tidyverse)
df1 %>% 
    select(from = start, to = end) %>% 
    pmap(seq) %>% 
    do.call(cbind, .) %>% 
    list(.) %>%
    mutate(df2, new = ., 
                ind = map2(time, new, ~ which(.x == .y, arr.ind = TRUE)[,2])) %>%
    select(-new)
#   time ind
#1   11   2
#2   17   3
#3   18   3
#4    5   1
#5    5   1
#6   22   4
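
Another pipe-based option, assuming dplyr 1.1.0 or later is available, is a range join with join_by(). This is a sketch and not part of the original answers; it relies on the between() helper that join_by() provides for inclusive range conditions.

```r
library(dplyr)

df1 <- data.frame(index = c(1, 2, 3, 4), start = c(5, 10, 15, 20), end = c(10, 15, 20, 25))
df2 <- data.frame(time = c(11, 17, 18, 5, 5, 22))

# Keep every row of df2 and attach the df1 row whose [start, end] range
# contains time; unmatched times would get NA in index.
df3 <- df2 %>%
  left_join(df1, by = join_by(between(time, start, end))) %>%
  select(time, index)
df3
```

A time that lies exactly on a boundary shared by two ranges (for example 10) would match both and produce two rows, so the ranges may need to be made half-open for such data.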
