
R: Ignore NA values with the pmax function

I'm trying to create a new column with the max values of 3 columns.

Example data:

date          skyc1 skyc2 skyc3 
1995-01-01    0     1     3
1995-01-02    1     null  null
1995-01-03    1     3     null

I would like to get:

date          skyc1 skyc2 skyc3 max
1995-01-01    0     1     3     3
1995-01-02    1     null  null  1
1995-01-03    1     3     null  3

I tried using:

df$max <- pmax(df$skyc1,df$skyc2,df$skyc3)

But I get this:

date          skyc1 skyc2 skyc3 max
1995-01-01    0     1     3     3
1995-01-02    1     null  null  null
1995-01-03    1     3     null  null

Is it possible to treat null as 0? I have thought about replacing null with 0, but I have values that are actually 0 in my dataset...

Thanks

pmax has an na.rm argument, but the missing values here are stored as the string "null", so they need to be replaced with NA first. Because "null" is a character string, the affected columns will be character or factor, so the column types also need to be converted with type.convert before the pmax step:

df[-1] <- replace(df[-1], df[-1] == "null", NA)
df <- type.convert(df, as.is = TRUE)
df$max <- pmax(df$skyc1, df$skyc2, df$skyc3, na.rm = TRUE)
df$max
#[1] 3 1 3
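
To see what na.rm changes on its own, here is a minimal sketch using two made-up vectors (a and b are illustrative and not from the question). It also shows one caveat: when every input is NA at a position, the result there is still NA.

```r
# Hypothetical vectors, used only to illustrate the na.rm argument
a <- c(0L, 1L, NA)
b <- c(3L, NA, NA)

# Default behaviour: an NA in any input makes that position NA
default_max <- pmax(a, b)

# With na.rm = TRUE, NAs are skipped; a position stays NA only if
# all inputs are NA there
narm_max <- pmax(a, b, na.rm = TRUE)

print(default_max)
print(narm_max)
```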

If there are many 'skyc' columns, this can be automated as well:

nm1 <- grep('^skyc\\d+$', names(df), value = TRUE)
df$max <- do.call(pmax, c(df[nm1], na.rm = TRUE))
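
The same idea can be wrapped in a small helper. This is only a sketch: row_max_matching and the toy data frame are hypothetical names introduced for illustration, and the data is assumed to already contain NA rather than "null".

```r
# Hypothetical helper (not part of the original answer): row-wise maximum
# over all columns whose names match `pattern`, ignoring NA values.
row_max_matching <- function(data, pattern) {
  cols <- grep(pattern, names(data), value = TRUE)
  stopifnot(length(cols) > 0)  # fail early if nothing matches
  # Pass each matching column to pmax as a separate argument
  do.call(pmax, c(data[cols], na.rm = TRUE))
}

# Toy data mirroring the question, with "null" already converted to NA
toy <- data.frame(
  date  = c("1995-01-01", "1995-01-02", "1995-01-03"),
  skyc1 = c(0L, 1L, 1L),
  skyc2 = c(1L, NA, 3L),
  skyc3 = c(3L, NA, NA)
)

toy$max <- row_max_matching(toy, "^skyc\\d+$")
print(toy$max)
```

Because the pattern is anchored with ^ and $, the new max column is not picked up if the helper is run again on the same data frame.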

data

df <- structure(list(
  date = c("1995-01-01", "1995-01-02", "1995-01-03"),
  skyc1 = c(0L, 1L, 1L),
  skyc2 = c("1", "null", "3"),
  skyc3 = c("3", "null", "null")
), class = "data.frame", row.names = c(NA, -3L))
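
If the data comes from a text file, the "null" strings can instead be handled at import time, which may make the replace and type.convert steps unnecessary. The sketch below assumes whitespace-separated input like the table in the question; raw and df2 are illustrative names.

```r
# Assumption: the raw data is whitespace-separated text like the question's table
raw <- "date skyc1 skyc2 skyc3
1995-01-01 0 1 3
1995-01-02 1 null null
1995-01-03 1 3 null"

# na.strings = "null" turns the "null" tokens into NA while parsing,
# so the skyc columns are read as integers rather than character
df2 <- read.table(text = raw, header = TRUE, na.strings = "null",
                  stringsAsFactors = FALSE)

df2$max <- pmax(df2$skyc1, df2$skyc2, df2$skyc3, na.rm = TRUE)
print(df2$max)
```

read.csv accepts the same na.strings argument for comma-separated files.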
