I have a data.table with many columns. There are 4 columns where I want to replace NA with 0.
I have a working solution:
claimsMonthly[is.na(claim9month),claim9month := 0
][is.na(claim10month),claim10month := 0
][is.na(claim11month),claim11month := 0
][is.na(claim12month),claim12month := 0]
However, this is quite repetitive, and I wanted to reduce it by using a loop (not sure if that is the smartest idea, though):
for (i in 9:12){
claimsMonthly[is.na(paste0("claim", i, "month")), paste0("claim", i, "month") := 0]
}
When I run this loop nothing happens. I guess it is due to the fact that paste0() returns "claim12month", so I get is.na("claim12month"). The result of that is FALSE despite the fact that there are NA values in my data. I guess this has something to do with the quotes?
This is not the first time I have had issues with using paste0() or running loops with data.table, so I must be missing something important here.
Any ideas how to fix this?
We can specify .SDcols with the names of the columns ('nm1'), loop over .SD (Subset of Data.table), and replace the NA values with 0 (replace_na from tidyr):
library(data.table)
library(tidyr)
nm1 <- paste0("claim", 9:12, "month")
setDT(claimsMonthly)[, (nm1) := lapply(.SD, replace_na, 0), .SDcols = nm1]
Or, as @jangorecki mentioned in the comments, nafill from data.table would be better:
setDT(claimsMonthly)[, (nm1) := lapply(.SD, nafill, fill = 0), .SDcols = nm1]
Or use a loop with set, assigning 0 to the columns of interest wherever a column is NA, by specifying i (the row index) and j (the column index/name):
for (j in nm1) {
  set(claimsMonthly, i = which(is.na(claimsMonthly[[j]])), j = j, value = 0)
}
Or with setnafill:
setnafill(claimsMonthly, cols = nm1, fill = 0)
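As a minimal sketch of why the loop in the question did nothing, and how it could be repaired while keeping its shape: the toy table below is hypothetical and only stands in for claimsMonthly, and the variable col is introduced purely for illustration. The idea is that is.na() on a quoted name tests the string itself, whereas get() looks up the column that the string names.

```r
library(data.table)

# Hypothetical toy data standing in for claimsMonthly
claimsMonthly <- data.table(
  id           = 1:3,
  claim9month  = c(1, NA, 3),
  claim10month = c(NA, 2, NA),
  claim11month = c(5, 6, 7),
  claim12month = c(NA, NA, 9)
)

# is.na() on a character string tests the string, not the column,
# so the row filter in the original loop never selects anything
is.na("claim12month")  # FALSE

# Same loop as in the question, with two changes:
# get(col) fetches the column named by the string, and
# (col) := tells data.table to assign to the column with that name
for (i in 9:12) {
  col <- paste0("claim", i, "month")
  claimsMonthly[is.na(get(col)), (col) := 0]
}

# Count of NA values left in the table
sum(is.na(claimsMonthly))
```

This is mainly useful for understanding the failure; for the actual task, nafill or setnafill as shown above is shorter.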
You can use column positions (this assumes the four columns are at positions 9 to 12):
claimsMonthly[, 9:12][is.na(claimsMonthly[, 9:12])] <- 0
You can also use the variable names (this form of indexing works on a plain data.frame; on a data.table, a character vector in i is treated as a join on the key):
cols <- c("claim9month", "claim10month", "claim11month", "claim12month")
claimsMonthly[cols][is.na(claimsMonthly[cols])] <- 0
Or, even better, you can use a vector with all variables matching the "claimXXmonth" pattern.
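One way to build such a vector, sketched under assumptions: the toy table and the otherValue column are hypothetical, and the regular expression assumes the names are exactly "claim", some digits, then "month". The matched names are then passed to setnafill from the earlier answer.

```r
library(data.table)

# Hypothetical toy data; only the claim*month columns should be filled
claimsMonthly <- data.table(
  id           = 1:2,
  claim9month  = c(NA, 1),
  claim12month = c(2, NA),
  otherValue   = c(NA, 3)
)

# Collect every column name of the form "claim<digits>month"
nm1 <- grep("^claim[0-9]+month$", names(claimsMonthly), value = TRUE)

# Fill NA with 0 by reference, only in the matched columns
setnafill(claimsMonthly, cols = nm1, fill = 0)
```

Anchoring the pattern with ^ and $ keeps columns that merely contain "claim" somewhere in their name from being filled by accident.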