
In R, comparing 2 fields across 2 rows in a dataframe

I am trying to compare two different fields across consecutive rows of a data frame in R and flag the rows where they differ. Below is the input data:

 Start    End
1 Atl      Bos    
2 Bos      Har  
3 Har      NYC  
4 Stf      SFO
5 SFO      Chi

I am trying to establish a chain of movement, and where the End of one row doesn't match the Start of the next row, I want to flag that next row. For the data above, I would flag row 4, as shown below:

 Start    End    Ind
1 Atl      Bos   Y 
2 Bos      Har   Y
3 Har      NYC   Y
4 Stf      SFO   N
5 SFO      Chi   Y

I am pretty new to R. I have tried looking up this problem but can't seem to find a solution. Any help is appreciated.

An alternative would be:

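> # Compare each row's Start with the End of the row before it
> Ind <- as.character(dat$Start[-1]) == as.character(dat$End[-nrow(dat)])
> # The first row has no predecessor, so it gets NA
> dat$Ind <- c(NA, ifelse(Ind, "Y", "N"))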
> dat
  Start End  Ind
1   Atl Bos <NA>
2   Bos Har    Y
3   Har NYC    Y
4   Stf SFO    N
5   SFO Chi    Y

Note that the first item should be <NA>, because row 1 has no previous row to compare its Start against.
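
If the first row should read Y, as in the question's expected output, the same shifted comparison can be padded with "Y" instead. This is a minimal base R sketch, assuming the data frame is called dat as in the answer above and that a row with no predecessor counts as a match (an assumption read off the expected output, not something the answer states):

```r
# Rebuild the question's data; the name `dat` follows the answer above.
dat <- data.frame(
  Start = c("Atl", "Bos", "Har", "Stf", "SFO"),
  End   = c("Bos", "Har", "NYC", "SFO", "Chi"),
  stringsAsFactors = FALSE
)

n <- nrow(dat)

# Compare Start of rows 2..n with End of rows 1..(n - 1).
matches <- dat$Start[-1] == dat$End[-n]

# Assumption: row 1 has no predecessor, so it is treated as "Y".
dat$Ind <- c("Y", ifelse(matches, "Y", "N"))
dat
```

With the sample data this marks only row 4 as N.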

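You can do that with dplyr using mutate and lead. Note that the last item should be NA because there is no row 6 to compare the SFO-Chi row against. This version marks the row whose End does not match the next row's Start (row 3 here), rather than the row after it.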

library(dplyr)
df1  <- read.table(text=" Start    End
Atl      Bos
Bos      Har
Har      NYC
Stf      SFO
SFO      Chi", header=TRUE, stringsAsFactors=FALSE)

df1 %>%
  mutate(Ind = ifelse(End == lead(Start), "Y", "N"))

  Start End  Ind
1   Atl Bos    Y
2   Bos Har    Y
3   Har NYC    N
4   Stf SFO    Y
5   SFO Chi <NA>
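
To flag the row after the break instead (row 4, as the question asks), the comparison can be turned around with lag. This is a minimal sketch, assuming dplyr is installed and the same sample data; res is just an illustrative name:

```r
library(dplyr)

df1 <- data.frame(
  Start = c("Atl", "Bos", "Har", "Stf", "SFO"),
  End   = c("Bos", "Har", "NYC", "SFO", "Chi"),
  stringsAsFactors = FALSE
)

# lag(End) shifts End down by one row, so each Start is compared
# with the End of the previous row; row 1 has no predecessor and gets NA.
res <- df1 %>%
  mutate(Ind = ifelse(Start == lag(End), "Y", "N"))

res
```

Here the NA moves to the first row and the N lands on row 4. If the first row should read Y rather than NA, one option might be lag(End, default = Start[1]), which makes the first comparison trivially true.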


 