
Comparisons in row or column operations in R

I would like to perform an operation across a column of a data frame wherein the output is dependent on a comparison between two values.

My data frame dat is arranged like this:

region  value1
a       0
a       0
a       6
a       7
a       3
a       0
a       4
b       5
b       1
b       0

I want to create a vector of factor values based on integers. The factor value should increment every time the region value changes or every time value1 is 0. So in this case the vector I want would be equivalent to c(1, 2, 2, 2, 2, 3, 3, 4, 4, 5).
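To make this reproducible, the sample data can be rebuilt roughly as follows. This is only a sketch: the `stringsAsFactors = FALSE` setting and the name `expected` are assumptions for illustration, not part of the original data.

```r
# Rebuild the sample data frame shown above
dat <- data.frame(
  region = c(rep("a", 7), rep("b", 3)),
  value1 = c(0, 0, 6, 7, 3, 0, 4, 5, 1, 0),
  stringsAsFactors = FALSE  # assumption: region kept as character
)

# The grouping vector described in the question (`expected` is an illustrative name)
expected <- c(1, 2, 2, 2, 2, 3, 3, 4, 4, 5)
```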

I have code to make a factor vector that increments ONLY when value1 is 0:

fac <- as.factor(cumsum(dat[,2]==0))
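As a quick check on the sample data (rebuilt here so the snippet runs on its own; `zero_only` is just an illustrative name), the zero-only version stays at 3 across the change from region a to region b at row 8, which is the part that is missing:

```r
# Sample data from the question, rebuilt so this snippet is self-contained
dat <- data.frame(
  region = rep(c("a", "b"), times = c(7, 3)),
  value1 = c(0, 0, 6, 7, 3, 0, 4, 5, 1, 0)
)

# Counter that increments only where value1 is 0
zero_only <- cumsum(dat[, 2] == 0)
zero_only
# [1] 1 2 2 2 2 3 3 3 3 4

fac <- as.factor(zero_only)
```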

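and I have C-style code that gets the vector I want, but runs extremely slowly on my overall data and is just plain ugly:

p <- 1
facint <- 1
for (i in 2:nrow(dat)) {
  # bump the counter before recording it, so row i starts the new group
  if (dat[i, 2] == 0 || dat[i, 1] != dat[i-1, 1])
    p <- p + 1
  # growing a vector with c() on every iteration is what makes this slow
  facint <- c(facint, p)
}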

fac <- as.factor(facint)

So how can I perform a row-by-row operation like this in idiomatic R style?

Try comparing each row's region with the previous row's:

cumsum(dat[,2] == 0 | c(FALSE, dat$region[-1] != dat$region[-nrow(dat)]))
# [1] 1 2 2 2 2 3 3 4 4 5
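Broken into steps, the one-liner above might be read like this. It is a sketch on the sample data; the intermediate names `is_zero`, `region_changed` and `grp` are made up for illustration.

```r
# Sample data from the question, rebuilt so this snippet is self-contained
dat <- data.frame(
  region = c(rep("a", 7), rep("b", 3)),
  value1 = c(0, 0, 6, 7, 3, 0, 4, 5, 1, 0),
  stringsAsFactors = FALSE
)
n <- nrow(dat)

# TRUE wherever value1 is 0
is_zero <- dat$value1 == 0

# TRUE wherever region differs from the previous row;
# the first row has no predecessor, so it is padded with FALSE
region_changed <- c(FALSE, dat$region[-1] != dat$region[-n])

# Each TRUE starts a new group; cumsum turns the flags into a running group id
grp <- cumsum(is_zero | region_changed)
grp
# [1] 1 2 2 2 2 3 3 4 4 5

# Convert to a factor, as asked in the question
fac <- factor(grp)
```

Note that with the FALSE padding the ids start at 0 whenever value1 in the first row is not 0. Padding with TRUE instead would make them always start at 1.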

Or, if each region occupies a single contiguous block of rows:

cumsum(!duplicated(dat[,1]) | dat[,2]==0)
#[1] 1 2 2 2 2 3 3 4 4 5
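One thing to keep in mind: `duplicated()` only flags the first appearance of each region, so the two approaches can disagree when a region comes back later in the data. A small sketch with made-up data (`dat2`, `by_duplicated` and `by_neighbour` are hypothetical names, not from the question):

```r
# Hypothetical data in which region "a" reappears after "b"
dat2 <- data.frame(
  region = c("a", "a", "b", "b", "a", "a"),
  value1 = c(1, 2, 3, 4, 5, 6),
  stringsAsFactors = FALSE
)

# duplicated() version: the second run of "a" is not treated as a change
by_duplicated <- cumsum(!duplicated(dat2[, 1]) | dat2[, 2] == 0)
by_duplicated
# [1] 1 1 2 2 2 2

# Neighbour comparison (padded with TRUE so ids start at 1): every change counts
by_neighbour <- cumsum(
  c(TRUE, dat2$region[-1] != dat2$region[-nrow(dat2)]) | dat2$value1 == 0
)
by_neighbour
# [1] 1 1 2 2 3 3
```

On the sample data in the question each region appears in one block, so both versions give the same result there.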

