I have a question very similar to a previous one but I am unable to generalize it to my case.
I have data that looks sort of like this
Within each ID, I have several Vis rows. Only a and b are of interest to me. For each column in the data (V1...V7), if an a row is present, a b row is present too, and wherever a has a value, b is missing, and vice versa. I would like to combine the a and b rows within each ID group into a single row (labelled a or b, or even given a new label; it doesn't really matter) with no missing data in any of the columns.
Based on the image shown, maybe this helps. Here I am using actual NAs with only a couple of V columns.
We create a numeric index ('nm1') of the column names that start with 'V' followed by digits. We convert the 'data.frame' to a 'data.table' (setDT(df1)) and, grouped by 'ID', use Map to loop over the 'V' columns (.SD[, -1, with=FALSE]) together with the 'Vis' column (.SD[[1]]). In each 'V' column, replace fills the elements where 'Vis' is either 'a' or 'b' with the non-NA element (na.omit(x[..])), and the output is assigned back to the columns given by the numeric index.
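library(data.table)
# indices of the value columns V1, V2, ... ('Vis' does not match the pattern)
nm1 <- grep('^V\\d+$', colnames(df1))
# within each ID, fill the 'a' and 'b' rows of every V column with the non-NA value
setDT(df1)[, (nm1) := Map(function(x, y)
    replace(x, which(y %in% c('a', 'b')), na.omit(x[y %in% c('a', 'b')])),
    .SD[, -1, with=FALSE], list(.SD[[1]])), ID]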
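To see what the per-column step does, here is a minimal base R sketch on a single vector. The names x, y and idx are only illustrative; the values are the V1 and Vis entries for ID 2 from the sample data.

```r
# V1 and Vis for ID 2 in the sample data
x <- c(1, NA, 4)
y <- c('a', 'b', 'c')

# positions of the 'a' and 'b' rows
idx <- y %in% c('a', 'b')

# overwrite both positions with the single non-NA value among them
filled <- replace(x, which(idx), na.omit(x[idx]))
print(filled)
# [1] 1 1 4
```

Note that this relies on exactly one of the 'a'/'b' values being non-NA; if both are NA, na.omit returns an empty vector and replace stops with an error.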
We change the 'b' values to 'a'
df1[Vis=='b', Vis := 'a']
and get the unique rows
unique(df1)
#   ID Vis V1 V2
#1:  2   a  1  2
#2:  2   c  4  5
#3:  3   a  3  4
#4:  4   a  2  3
#5:  4   c  3  4
#6:  4   d  1  1
# sample data used above
df1 <- data.frame(ID = rep(c(2, 3, 4), c(3, 2, 4)),
                  Vis = c('a', 'b', 'c', 'a', 'b', 'a', 'b', 'c', 'd'),
                  V1 = c(1, NA, 4, 3, NA, NA, 2, 3, 1),
                  V2 = c(NA, 2, 5, 4, NA, 3, NA, 4, 1),
                  stringsAsFactors = FALSE)
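As a possible alternative to the replace-then-unique approach, the same result can probably be reached in one grouped step: relabel 'b' as 'a' first, then take the first non-NA value of each column per (ID, Vis) group. This is only a sketch under the assumption that each column has at most one non-NA value across the 'a' and 'b' rows; first_non_na is a hypothetical helper, not part of data.table.

```r
library(data.table)

# Same sample data as in the answer above
dt <- data.table(
  ID  = rep(c(2, 3, 4), c(3, 2, 4)),
  Vis = c('a', 'b', 'c', 'a', 'b', 'a', 'b', 'c', 'd'),
  V1  = c(1, NA, 4, 3, NA, NA, 2, 3, 1),
  V2  = c(NA, 2, 5, 4, NA, 3, NA, 4, 1)
)

# Hypothetical helper: first non-NA value of a vector (NA if there is none)
first_non_na <- function(x) x[!is.na(x)][1L]

# Treat 'b' rows as 'a' so both fall into the same (ID, Vis) group
dt[Vis == 'b', Vis := 'a']

# Collapse each group to one row, column by column
res <- dt[, lapply(.SD, first_non_na), by = .(ID, Vis)]
print(res)
```

Unlike the replace call, this does not fail when both 'a' and 'b' are NA in a column; the combined row simply keeps NA there.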
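Just sum the values you need while removing NAs. This works because, in each column, only one of a and b is non-missing (note that a column where both are NA would sum to 0). There are more vectorized ways to do this, but the for loop is a bit clearer.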
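for (I in unique(df1$ID)) {
  # rows of this ID whose Vis is 'a' or 'b'
  is_ab <- df1$ID == I & df1$Vis %in% c("a", "b")
  if (!any(is_ab)) next
  # sum the V columns over those rows, ignoring NAs
  new_row <- colSums(df1[is_ab, -(1:2), drop = FALSE], na.rm = TRUE)
  # drop only the 'a'/'b' rows (other Vis rows are kept) and append the combined row as 'a'
  df1 <- rbind(df1[!is_ab, ],
               data.frame(ID = I, Vis = "a", as.list(new_row), stringsAsFactors = FALSE))
}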
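One of the more vectorized ways could look like the following base R sketch, which uses aggregate to sum the 'a' and 'b' rows per ID in a single call. The names ab, combined and result are illustrative, and the combined row is labelled 'a' by assumption.

```r
# Same sample data as in the first answer
df1 <- data.frame(ID = rep(c(2, 3, 4), c(3, 2, 4)),
                  Vis = c('a', 'b', 'c', 'a', 'b', 'a', 'b', 'c', 'd'),
                  V1 = c(1, NA, 4, 3, NA, NA, 2, 3, 1),
                  V2 = c(NA, 2, 5, 4, NA, 3, NA, 4, 1),
                  stringsAsFactors = FALSE)

# Rows to be combined
ab <- df1$Vis %in% c("a", "b")

# Sum V1 and V2 per ID over the 'a'/'b' rows, ignoring NAs
combined <- aggregate(df1[ab, c("V1", "V2")],
                      by = list(ID = df1$ID[ab]),
                      FUN = sum, na.rm = TRUE)
combined$Vis <- "a"

# Put the combined rows back with the untouched rows and sort
result <- rbind(df1[!ab, ], combined[, names(df1)])
result <- result[order(result$ID, result$Vis), ]
rownames(result) <- NULL
print(result)
```

As with the loop, a column in which both 'a' and 'b' are NA would come out as 0 rather than NA.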