How to count number of values less than 0 and greater than 0 in a row

I have a very large dataset and I would like to generate new columns holding, for each row, the count of values greater than 0 and the count of values less than 0. I would then like to add another column that divides one count by the other (e.g. above0_column / below0_column).

My data looks something like this:

ID SNP1 SNP2 SNP3 SNP4
1  -0.5 0.32 1.2  -0.8
2  1.5  -1.2 0.3  -0.6
3  2.6  -3.4 0.2  5.0
4  -0.3 5.0  -1.2 -0.3

The new columns should indicate for ID 1: 2 for <0 and 2 for >0

These are the functions I tried:

data$above0<-apply(data,1,function(i) sum(i>0))

and

data$above0<- Reduce('+', lapply(data,'>',0))

Both generated a new column, but the column was not populated with counts: when I looked at "above0", it was filled with NAs. Is there another straightforward function I could use to generate new columns with the counts >0 and <0, and ultimately column1/column2?

You could use rowSums (which should be faster than your original apply):

dat$gt0 <- rowSums(dat[,c("SNP1", "SNP2", "SNP3", "SNP4")]>0)
dat$lt0 <- rowSums(dat[,c("SNP1", "SNP2", "SNP3", "SNP4")]<0)

dat
#  ID SNP1  SNP2 SNP3 SNP4 gt0 lt0
#1  1 -0.5  0.32  1.2 -0.8   2   2
#2  2  1.5 -1.20  0.3 -0.6   2   2
#3  3  2.6 -3.40  0.2  5.0   3   1
#4  4 -0.3  5.00 -1.2 -0.3   1   3

There are multiple ways to select the variables you want, but I personally prefer explicitly selecting columns of interest with a character vector.
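As an illustration of one such alternative, here is a minimal sketch that picks the columns by name pattern instead of listing them. It assumes that every column of interest starts with "SNP"; the name snp_cols is made up for this example. It also passes na.rm = TRUE, which may help if the NAs in the question come from missing values in the data (that cause is a guess and is not confirmed by the question).

```r
# Example data copied from the question
dat <- data.frame(
  ID   = 1:4,
  SNP1 = c(-0.5, 1.5, 2.6, -0.3),
  SNP2 = c(0.32, -1.2, -3.4, 5.0),
  SNP3 = c(1.2, 0.3, 0.2, -1.2),
  SNP4 = c(-0.8, -0.6, 5.0, -0.3)
)

# Assumption: all columns to count over have names starting with "SNP"
snp_cols <- grep("^SNP", names(dat), value = TRUE)

# na.rm = TRUE ignores missing values instead of returning NA for the row
dat$gt0 <- rowSums(dat[, snp_cols] > 0, na.rm = TRUE)
dat$lt0 <- rowSums(dat[, snp_cols] < 0, na.rm = TRUE)
```

With na.rm = TRUE, a missing value is simply not counted on either side, so check whether that is what you want before relying on the counts.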

After this, the division is as simple as:

dat$div_gt0_lt0 <- dat$gt0 / dat$lt0

You could also do it in one step, without creating the intermediate columns, if you want:

dat$div_gt0_lt0 <- rowSums(dat[,c("SNP1", "SNP2", "SNP3", "SNP4")]>0) / rowSums(dat[,c("SNP1", "SNP2", "SNP3", "SNP4")]<0)
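One thing to watch with the division: a row with no negative values has a count of 0 in the denominator, and R then returns Inf (or NaN when both counts are 0) rather than an error. The sketch below uses small made-up count vectors, not the data from the question, to show one possible way of turning those results into NA; whether NA is the right replacement depends on your analysis.

```r
# Hypothetical counts, chosen to include a zero denominator
gt0 <- c(2, 4, 0)
lt0 <- c(2, 0, 0)

ratio <- gt0 / lt0               # second element is Inf, third is NaN

# Replace every non-finite result (Inf, -Inf, NaN) with NA
ratio[!is.finite(ratio)] <- NA
```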

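We can also use apply with table. Fixing the factor levels to -1 and 1 keeps both counts in every row, even when a row contains values of only one sign:

data[c('below0', 'above0')] <- t(apply(data[-1], 1,
            # levels = c(-1, 1) guarantees two counts per row, in the order below0, above0
            function(x) table(factor(sign(x[x != 0]), levels = c(-1, 1)))))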
data
#  ID SNP1  SNP2 SNP3 SNP4 below0 above0
#1  1 -0.5  0.32  1.2 -0.8      2      2
#2  2  1.5 -1.20  0.3 -0.6      2      2
#3  3  2.6 -3.40  0.2  5.0      1      3
#4  4 -0.3  5.00 -1.2 -0.3      3      1
