
R - How to Return All Rows Below Selected Specific Rows in a Dataframe?

I have a data frame with these values:

page_name         activity
Home              View Page
New Project       View Page
New Project       Submit Form
New Project       View Page
Expenses          View Page
Quotes            View Page
New Project       View Page
New Project       Submit Form
New Project       View Page
Payment Claims    View Page

I'm trying to get all the pages that are two rows below the rows whose page name is 'New Project' and activity is 'Submit Form' in a new dataframe like this.

page_name         activity
Expenses          View Page
Payment Claims    View Page

I used this R code to get all the rows that meet the conditions I need.

after_newproj <- with(dat, dat[((page_name == 'New Project' & activity == 'Submit Form')),] )

I then tried the following to move the selection two rows down, but it returns the same number of rows with every value null.

after_newproj <- with(dat, dat[((page_name == 'New Project' & activity == 'Submit Form')),] + c(2) )
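The `+ c(2)` in the attempt above is applied to the values inside the selected rows, not to their row positions, so it cannot move the selection down. A minimal base R sketch of one way to shift the positions instead, assuming `dat` is a plain data frame holding the sample data (it is rebuilt here so the example runs on its own; `hits` and `target` are illustrative names, not from the original post):

```r
# Rebuild the sample data as a plain data.frame (assumed to match `dat`).
dat <- data.frame(
  page_name = c("Home", "New Project", "New Project", "New Project",
                "Expenses", "Quotes", "New Project", "New Project",
                "New Project", "Payment Claims"),
  activity = c("View Page", "View Page", "Submit Form", "View Page",
               "View Page", "View Page", "View Page", "Submit Form",
               "View Page", "View Page"),
  stringsAsFactors = FALSE
)

# Row positions of the rows that meet both conditions.
hits <- which(dat$page_name == "New Project" & dat$activity == "Submit Form")

# Shift the positions (not the values) down by two rows,
# then drop any position that falls past the last row.
target <- hits + 2
target <- target[target <= nrow(dat)]

after_newproj <- dat[target, ]
print(after_newproj)
```

Dropping positions greater than `nrow(dat)` avoids an all-NA row when a match sits within the last two rows of the data frame.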

My solution is to create additional fields that can then be filtered on.
Code updated; it now works.

function to help with filtering

global.counter <- 2
fill.filler <- function(x) {
  # Restart the count at a "Break" row, otherwise count one more row
  if (x == "Break") {
    global.counter <<- 0
  } else {
    global.counter <<- global.counter + 1
  }
  return(global.counter)
}

code to filter out the rows needed

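library(dplyr)

df %>%
  # Mark each "New Project" / "Submit Form" row as a break point
  mutate(fill = if_else(page_name == "New Project" & activity == "Submit Form", "Break", "Count")) %>%
  # Count the rows since the most recent break
  mutate(counter = sapply(.$fill, fill.filler)) %>%
  # Keep only the rows exactly two below a break
  filter(counter == 2) %>%
  select(-c(fill, counter))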

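It's important that global.counter starts at 2 (reset it to 2 before each run). With a smaller starting value, rows at the top of the data frame would pass the counter filter even though no 'Submit Form' row precedes them, and you want to avoid that.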

Hope the code is easy enough to understand.
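The counter idea above can also be expressed without a global variable. The following is only a sketch of that alternative, not the answer's own method: it records the position of the most recent break with `cummax()` and subtracts it from the current row number. The data frame is rebuilt so the block runs on its own, and `is_break`, `last_break` and `counter` are illustrative names.

```r
# Sample data, rebuilt as a plain data.frame so the block is self-contained.
df <- data.frame(
  page_name = c("Home", "New Project", "New Project", "New Project",
                "Expenses", "Quotes", "New Project", "New Project",
                "New Project", "Payment Claims"),
  activity = c("View Page", "View Page", "Submit Form", "View Page",
               "View Page", "View Page", "View Page", "Submit Form",
               "View Page", "View Page"),
  stringsAsFactors = FALSE
)

# TRUE on the rows that act as break points.
is_break <- df$page_name == "New Project" & df$activity == "Submit Form"

# Position of the most recent break at or before each row (0 if none yet).
idx <- seq_len(nrow(df))
last_break <- cummax(ifelse(is_break, idx, 0L))

# Rows since the last break; NA before the first break has been seen.
counter <- ifelse(last_break == 0L, NA_integer_, idx - last_break)

# Rows exactly two below a break; which() drops the NA entries.
result <- df[which(counter == 2), ]
print(result)
```

Because nothing is stored outside the expression, the result does not depend on a starting value and the code can be rerun without resetting anything.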

Data

library(data.table)
df <- fread("page_name,activity
Home,View Page
New Project,View Page
New Project,Submit Form
New Project,View Page
Expenses,View Page
Quotes,View Page
New Project,View Page
New Project,Submit Form
New Project,View Page
Payment Claims,View Page", sep = ",", header = TRUE)

dplyr solution

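dplyr's lead() and lag() functions are helpful in these cases.

library(dplyr)
# lag(x, 2) shifts x down by two rows, so each row is compared with the values two rows above it
df[lag(df$page_name, 2) == "New Project" & lag(df$activity, 2) == "Submit Form", ]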

Output

         page_name  activity
1:        Expenses View Page
2:  Payment Claims View Page
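One caveat worth sketching: `lag()` returns NA for the first two rows, so the comparison is NA there. The output above comes from a data.table (created by `fread`), which treats NA in a logical index as FALSE. A plain data frame would return all-NA rows for those positions instead. The example below assumes a plain data frame and shows two ways around that, `filter()` and `which()`; `res_filter`, `keep` and `res_index` are illustrative names. `dplyr::lag` is written out in full because `stats::lag` has the same name and does not shift a plain vector.

```r
library(dplyr)

# Sample data as a plain data.frame rather than a data.table.
df <- data.frame(
  page_name = c("Home", "New Project", "New Project", "New Project",
                "Expenses", "Quotes", "New Project", "New Project",
                "New Project", "Payment Claims"),
  activity = c("View Page", "View Page", "Submit Form", "View Page",
               "View Page", "View Page", "View Page", "Submit Form",
               "View Page", "View Page"),
  stringsAsFactors = FALSE
)

# filter() keeps only rows where the condition is TRUE, so rows where
# the lagged comparison is NA are dropped.
res_filter <- df %>%
  filter(dplyr::lag(page_name, 2) == "New Project",
         dplyr::lag(activity, 2) == "Submit Form")

# With bracket indexing, which() converts the logical vector to row
# positions and discards the NA entries.
keep <- dplyr::lag(df$page_name, 2) == "New Project" &
  dplyr::lag(df$activity, 2) == "Submit Form"
res_index <- df[which(keep), ]

print(res_filter)
print(res_index)
```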


 