
How to remove specific (side-by-side) duplicates in R?

Suppose I have the following vector:

l1 = c(0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)

and I only want to keep the first 1 of each run of consecutive 1s; that is, my desired outcome for the vector above is:

l1 = c(0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 1)

I tried shifting the vector and subtracting it from the original, then setting everything that is not 1 to 0, but that did not work.

You may try this (base R way):

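x <- c(0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
y <- rle(x)                                  # run-length encoding of x
starts <- cumsum(y$lengths) - y$lengths + 1  # index at which each run begins
z <- starts[y$values == 1]                   # starts of the runs of 1s only
w <- rep(0, length(x))
w[z] <- 1                                    # keep just the first 1 of each run
w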

 [1] 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1
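
The shift-and-subtract idea mentioned in the question can also be made to work. The following is only a minimal sketch, assuming the input contains nothing but 0s and 1s; the function name keep_first_one is made up for illustration. It prepends a 0, takes successive differences with diff(), and keeps a 1 only where the value rises from 0 to 1.

```r
# Hypothetical helper (name chosen for illustration only).
# Assumes x is a numeric vector made up of 0s and 1s.
keep_first_one <- function(x) {
  # diff(c(0, x)) is x[i] - x[i - 1], treating the element before x[1] as 0.
  # A difference of exactly 1 marks a 0 -> 1 transition, i.e. the first 1 of a run.
  as.numeric(diff(c(0, x)) == 1)
}

l1 <- c(0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
keep_first_one(l1)
```

Because the padding 0 stands in for a value before the first element, a vector that begins with 1 keeps that leading 1, and a vector that ends in 0 needs no special handling.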

dplyr way

library(dplyr)
library(data.table)  # provides rleid()

x <- data.frame(
  l1 = c(0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
)
x %>%
  mutate(y = rleid(l1)) %>%  # run id: increases every time l1 changes
  group_by(y) %>%
  # within each run, zero out every 1 except the first
  mutate(l1 = ifelse(l1 == 1 & row_number() > 1, 0, l1)) %>%
  ungroup() %>%
  select(-y) %>%
  pull(l1)


 [1] 0 0 1 0 0 0 1 0 0 0 0 1 0 0 0 0 1

Clumsy way

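x <- c(0, 0, 1, 1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 1)
in_run <- FALSE              # TRUE while we are inside a run of 1s
for (i in seq_along(x)) {
  if (x[i] == 1) {
    if (in_run) x[i] <- 0    # not the first 1 of this run: set it to 0
    in_run <- TRUE           # the first 1 of a run is kept as is
  } else {
    in_run <- FALSE          # a 0 ends the current run
  }
}
x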
