I have multiple variables grouped by prefix (par___, fri___, gp___, etc.); there are 29 of these groups.
Each variable has a value of 0 or 1. I need to sum each group row-wise (i.e., par___1 + par___2, etc.) and, where the row sum is 0, set every variable in that group to NA.
For example, my data looks like this:
par___ | par___2 | fri___1 | fri___2 |
---|---|---|---|
0 | 0 | 1 | 1 |
0 | 1 | 0 | 0 |
0 | 0 | 1 | 0 |
0 | 0 | 0 | 0 |
and I want it to look like this:
par___ | par___2 | fri___1 | fri___2 |
---|---|---|---|
NA | NA | 1 | 1 |
0 | 1 | NA | NA |
NA | NA | 1 | 0 |
NA | NA | NA | NA |
I can do it individually like this:
df <- df %>%
  mutate(rowsum = rowSums(.[grep("par___", names(.))])) %>%
  mutate_at(grep("par___", names(.)), funs(ifelse(rowsum == 0, NA, .))) %>%
  select(-rowsum)
And I figured I could do something like this:
vars <- c('par___', 'fri___', 'gp___')
for (i in vars) {
  df <- df %>%
    # creates a "rowsum" column storing the sum of the columns whose names match i
    mutate(rowsum = rowSums(.[grep(i, names(.))])) %>%
    # applies, to those columns, a function that puts NA when the sum of the row is 0
    mutate_at(grep(i, names(.)), funs(ifelse(rowsum == 0, NA, .))) %>%
    select(-rowsum)
}
There are no error messages but it doesn't work.
Also, I've tried mutate(across()) instead of mutate_at() and get this error:
Error: Problem with `mutate()` input `..1`.
x Can't convert a list to function
i Input `..1` is `across(grep(i, names(.)), list(ifelse(rowsum == 0, NA, .)))`.
I've also tried list() instead of funs() and get this error:
Error in rowsum == 0: comparison (1) is possible only for atomic and list types
Any help would be greatly appreciated!
Thanks heaps.
A tidyverse option would be:
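library(tidyverse)

df %>%
  stack() %>%  # long format with columns `values` and `ind`
  group_by(ind) %>%
  # grp = row index within each original column, grp2 = prefix before the first "_"
  group_by(grp = row_number(), grp2 = str_remove(ind, "_.*")) %>%
  # na_if() turns TRUE (all zeros) into NA, so adding it makes the whole group NA
  mutate(values = values + na_if(all(values == 0), 1)) %>%
  pivot_wider(id_cols = grp, names_from = ind, values_from = values)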
# A tibble: 4 x 5
# Groups: grp [4]
grp par___1 par___2 fri___1 fri___2
<int> <int> <int> <int> <int>
1 1 NA NA 1 1
2 2 0 1 NA NA
3 3 NA NA 1 0
4 4 NA NA NA NA
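For readers who would rather stay close to the loop in the question, here is a minimal sketch using across(). Note that across() expects a function or a lambda (such as `~ ifelse(...)`) rather than a bare list(), which is what the "Can't convert a list to function" error points at. The example data frame and the helper column name `all_zero` are assumptions made for illustration; the column names are assumed to be par___1, par___2, and so on.

```r
library(dplyr)

# Example data modelled on the question (column names are an assumption)
df <- data.frame(
  par___1 = c(0, 0, 0, 0),
  par___2 = c(0, 1, 0, 0),
  fri___1 = c(1, 0, 1, 0),
  fri___2 = c(1, 0, 0, 0)
)

prefixes <- c("par___", "fri___")

for (p in prefixes) {
  df <- df %>%
    # flag rows where every column with this prefix is 0
    mutate(all_zero = rowSums(across(starts_with(p))) == 0) %>%
    # replace the flagged rows with NA in those columns only
    mutate(across(starts_with(p), ~ ifelse(all_zero, NA, .x))) %>%
    # drop the temporary helper column
    select(-all_zero)
}

df
```

With 29 groups, the same loop should work by listing all 29 prefixes in `prefixes`, provided no prefix is the start of another group's prefix.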
If, on the other hand, you prefer base R, you could do:
d <- ave(unlist(df), row(df), sub("_.*", "", names(df))[col(df)], FUN = function(x) x * NA ^ all(x==0))
array(d, dim(df), dimnames(df))
par___1 par___2 fri___1 fri___2
1 NA NA 1 1
2 0 1 NA NA
3 NA NA 1 0
4 NA NA NA NA
Note that this last result is a matrix, which you can convert to a data frame.
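As a small sketch of that conversion, the matrix can be wrapped in as.data.frame(). The example data frame and the names `prefix` and `res` are assumptions for illustration.

```r
# Example data modelled on the question (column names are an assumption)
df <- data.frame(
  par___1 = c(0, 0, 0, 0),
  par___2 = c(0, 1, 0, 0),
  fri___1 = c(1, 0, 1, 0),
  fri___2 = c(1, 0, 0, 0)
)

# Group label for each column: everything before the first underscore
prefix <- sub("_.*", "", names(df))

# Within each (row, prefix) cell group, multiply by NA when all values are 0
# (NA^1 is NA, NA^0 is 1)
d <- ave(unlist(df), row(df), prefix[col(df)],
         FUN = function(x) x * NA ^ all(x == 0))

# Rebuild the original shape, then convert the matrix to a data frame
res <- as.data.frame(array(d, dim(df), dimnames(df)))
res
```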
A base R option using split.default:
do.call(cbind, unname(lapply(split.default(df,
sub('(\\w+)_.*', '\\1', names(df))), function(x) {
x[rowSums(x) == 0, ] <- NA
x
})))
# fri___1 fri___2 par___ par___2
#1 1 1 NA NA
#2 NA NA 0 1
#3 1 0 NA NA
#4 NA NA NA NA
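Because split.default() orders the groups alphabetically by prefix, the columns come back in a different order from the input, as the output above shows. If the original order matters, one possible sketch is to reindex the result by the original names. The example data frame and the names `groups` and `res` are assumptions for illustration, and the simpler pattern "_.*" is used here to extract the prefix.

```r
# Example data modelled on the question (column names are an assumption)
df <- data.frame(
  par___1 = c(0, 0, 0, 0),
  par___2 = c(0, 1, 0, 0),
  fri___1 = c(1, 0, 1, 0),
  fri___2 = c(1, 0, 0, 0)
)

# Split the columns into one data frame per prefix
groups <- split.default(df, sub("_.*", "", names(df)))

# Within each group, set rows that sum to 0 to NA, then bind the groups together
res <- do.call(cbind, unname(lapply(groups, function(x) {
  x[rowSums(x) == 0, ] <- NA
  x
})))

# Restore the original column order
res <- res[names(df)]
res
```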