
Remove the (i+1)th term if it recurs

Say we have the following data

A <- c(1,2,2,2,3,4,8,6,6,1,2,3,4)
B <- c(1,2,3,4,5,1,2,3,4,5,1,2,3)

data <- data.frame(A,B)

How would one write a function so that, for A, if the same value appears again in the (i+1)th position, the recurring row is removed?

The output should therefore look like

data.frame(c(1,2,3,4,8,6,1,2,3,4), c(1,2,5,1,2,3,5,1,2,3))

My best guess would be to use a for statement, but I have no experience with these.

You can try

  data[c(TRUE, data[-1,1]!= data[-nrow(data), 1]),]
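Since the question asks for a function, the one-liner above could be wrapped roughly as follows. This is only a sketch: the name drop_consecutive_dups and its col argument are made up for illustration, and it assumes the column contains no NA values (a comparison with NA would yield NA in the row index).

```r
# Hypothetical helper, not part of the original answer.
# Keeps a row only if its value in `col` differs from the previous row's value.
drop_consecutive_dups <- function(df, col = 1) {
  x <- df[[col]]
  n <- length(x)
  if (n < 2) return(df)               # nothing to compare with
  keep <- c(TRUE, x[-1] != x[-n])     # first row is always kept
  df[keep, , drop = FALSE]
}

data <- data.frame(A = c(1,2,2,2,3,4,8,6,6,1,2,3,4),
                   B = c(1,2,3,4,5,1,2,3,4,5,1,2,3))
result <- drop_consecutive_dups(data, "A")
print(result)
```

The original row names (1, 2, 5, 6, ...) are preserved, which makes it easy to see which rows were dropped.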

Another option, dplyr-esque:

library(dplyr)
dat1 <- data.frame(A=c(1,2,2,2,3,4,8,6,6,1,2,3,4),
                   B=c(1,2,3,4,5,1,2,3,4,5,1,2,3))
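# Keep the first row, then every row whose A differs from the previous row's A.
# (lag(A, default=FALSE) would compare the first value with 0 and drop a leading 0.)
dat1 %>% filter(row_number() == 1 | A != lag(A))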
##    A B
## 1  1 1
## 2  2 2
## 3  3 5
## 4  4 1
## 5  8 2
## 6  6 3
## 7  1 5
## 8  2 1
## 9  3 2
## 10 4 3

Using diff, which calculates the differences between consecutive elements (a lag of 1):

data[c( TRUE, diff(data[,1]) != 0), ]

Output:

   A B
1  1 1
2  2 2
5  3 5
6  4 1
7  8 2
8  6 3
10 1 5
11 2 1
12 3 2
13 4 3
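To see why the diff approach works, it may help to look at the intermediate vectors. The sketch below only inspects column A from the question; the variable names d and keep are chosen here for illustration. Note that diff needs a numeric column, so this would not carry over directly to character or factor data.

```r
A <- c(1,2,2,2,3,4,8,6,6,1,2,3,4)

d <- diff(A)             # A[i+1] - A[i]; zero means the value repeats
print(d)                 # 1 0 0 1 1 4 -2 0 -5 1 1 1

keep <- c(TRUE, d != 0)  # prepend TRUE so the first row is always kept
print(which(keep))       # 1 2 5 6 7 8 10 11 12 13
```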

Using rle

A <- c(1,2,2,2,3,4,8,6,6,1,2,3,4)
B <- c(1,2,3,4,5,1,2,3,4,5,1,2,3)

data <- data.frame(A,B)

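X <- rle(data$A)                                  # runs of consecutive equal values in A
Y <- cumsum(c(1, X$lengths[-length(X$lengths)]))  # row index where each run starts
View(data[Y, ])                                   # show the first row of every run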


    row.names  A  B
1   1          1  1
2   2          2  2
3   5          3  5
4   6          4  1
5   7          8  2
6   8          6  3
7   10         1  5
8   11         2  1
9   12         3  2
10  13         4  3
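For anyone unfamiliar with rle, the intermediate values for this data are sketched below. The code repeats the same steps on column A alone and replaces View with print so it also runs outside an interactive session.

```r
A <- c(1,2,2,2,3,4,8,6,6,1,2,3,4)

X <- rle(A)
print(X$values)    # 1 2 3 4 8 6 1 2 3 4   (one entry per run)
print(X$lengths)   # 1 3 1 1 1 2 1 1 1 1   (how long each run is)

# Each run starts one position after the previous run ends, so the start
# positions are the cumulative sum of 1 followed by all lengths but the last.
Y <- cumsum(c(1, X$lengths[-length(X$lengths)]))
print(Y)           # 1 2 5 6 7 8 10 11 12 13
```

Because X$values already holds the de-duplicated A column, the index Y is only needed to pull the matching rows of the other columns.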

