How do I assign a group-level value, based on row-level values, to a data frame using dplyr?
I have the following decision rules:
RELIABILITY LEVEL    DESCRIPTION
LEVEL I              Multiple regression
LEVEL II             Multiple regression + mechanisms specified (all interest variables)
LEVEL III            Multiple regression + mechanisms specified (all interest + control variables)
The first three columns are the data from which the fourth column should be reproduced using dplyr.
The reliability level should be the same for every row of a table (model), and I want to code this using dplyr.
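As a minimal sketch of the idea (not the asker's data or final code), the rules can be evaluated once per model inside a grouped mutate(), which recycles the group-level result to every row. The toy data and the names all_interest, all_control and reliability_level are assumptions made for illustration; the integer codes 1 to 3 stand for LEVEL I to LEVEL III.

```r
library(dplyr)

# Hypothetical toy data: one study with three models (not the asker's file).
toy <- tibble::tribble(
  ~study, ~model, ~function_var, ~mechanism_described,
  "A",    1,      "interest",    "yes",
  "A",    1,      "interest",    "yes",
  "A",    1,      "control",     "no",
  "A",    2,      "interest",    "yes",
  "A",    2,      "control",     "yes",
  "A",    3,      "interest",    "no",
  "A",    3,      "control",     "yes"
)

levels_by_model <- toy %>%
  group_by(study, model) %>%
  mutate(
    # all() collapses the rows of a group to a single TRUE/FALSE,
    # which mutate() then recycles to every row of that group.
    # Caveat: all() on an empty selection is TRUE, so a model without
    # control variables would count as "all controls described".
    all_interest = all(mechanism_described[function_var == "interest"] == "yes"),
    all_control  = all(mechanism_described[function_var == "control"] == "yes"),
    reliability_level = case_when(
      all_interest & all_control ~ 3L,  # LEVEL III
      all_interest               ~ 2L,  # LEVEL II
      TRUE                       ~ 1L   # LEVEL I
    )
  ) %>%
  ungroup()
```

Because the grouping is only by study and model, every row of a model receives the same reliability_level, and no separate summarise-and-join step is needed.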
Here is my attempt so far. As you can see, I can't get the value to be the same for the whole model:
library(tidyverse)
library(readxl)
library(effectsize)
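# read_excel() only reads local files, so download the workbook first.
# Appending "?raw=true" asks GitHub for the file itself rather than the HTML page.
url <- "https://github.com/timverlaan/relia/blob/59d2cbc5d7830c41542c5f65449d5f324d6013ad/relia.xlsx"
tmp <- tempfile(fileext = ".xlsx")
download.file(paste0(url, "?raw=true"), tmp, mode = "wb")
df <- read_excel(tmp)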
df1 <- df %>%
  # Number of variables per study, table and variable role
  group_by(study, table, function_var) %>%
  mutate(count_vars = n()) %>%
  ungroup() %>%
  # Number of variables per role, split by whether a mechanism is described
  group_by(study, table, function_var, mechanism_described) %>%
  mutate(count_int = case_when(
    function_var == 'interest' & mechanism_described == 'yes' ~ n()
  )) %>%
  mutate(count_con = case_when(
    function_var == 'control' & mechanism_described == 'yes' ~ n()
  )) %>%
  # Flag rows whose role has a mechanism described for every variable
  mutate(reliable_int = case_when(
    function_var == 'interest' & count_vars/count_int == 1 ~ 1)) %>%
  mutate(reliable_con = case_when(
    function_var == 'control' & count_vars/count_con == 1 ~ 1)) %>%
  # group_by(study, source) %>%
  mutate(reliable = case_when(
    reliable_int != 1 ~ 1,
    reliable_int == 1 ~ 2,
    reliable_int + reliable_con == 2 ~ 3))
  # ungroup() %>%
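One likely reason the attempt cannot give a single value per model is how case_when() treats NA. The small sketch below uses two made-up values (not the asker's data) to mimic an interest row and a control row: reliable_int and reliable_con are each either 1 or NA, and never both non-missing on the same row.

```r
library(dplyr)

# Made-up values mimicking one 'interest' row and one 'control' row.
reliable_int <- c(1, NA)
reliable_con <- c(NA, 1)

attempt <- case_when(
  reliable_int != 1                ~ 1,  # never TRUE: the value is 1 or NA
  reliable_int == 1                ~ 2,
  reliable_int + reliable_con == 2 ~ 3   # always NA: one of the two is missing
)
attempt
```

The first row yields 2 and the second yields NA, so levels 1 and 3 can never be produced, and rows within the same model end up with different results.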
The code I settled on is:
library(tidyverse)
library(readxl)
df <- read_excel("C:/Users/relia.xlsx")
df <- df %>% select(-reliability_score)
test <- df %>%
  group_by(study, model, function_var) %>%
  summarise(count_yes = sum(mechanism_described == "yes"),
            n = n(),
            frac = count_yes / n) %>%
  mutate(frac_control = frac[function_var == "control"],
         frac_interest = frac[function_var == "interest"]) %>%
  mutate(reliability = case_when(
    frac_control == 1 & frac_interest != 1 ~ -99,
    frac_control != 1 & frac_interest != 1 ~ 2,
    frac_interest == 1 & frac_control != 1 ~ 3,
    frac_interest == 1 & frac_control == 1 ~ 4)) %>%
  group_by(study, model) %>%
  summarise(reliability = mean(reliability))
df_reliability <- left_join(df, test)
View(df_reliability)
However, I would prefer to do this all within one dplyr pipe. If anyone has a solution I would love to hear it...
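One possible way to fold the summarise-and-join steps into a single pipe is sketched below, under the assumption that the data has the columns study, model, function_var and mechanism_described used in the code above. The toy data frame is a hypothetical stand-in for relia.xlsx, and the numeric codes (-99, 2, 3, 4) are copied from the settled code rather than from the decision-rule table.

```r
library(dplyr)

# Hypothetical stand-in for relia.xlsx with the same column names.
df <- tibble::tribble(
  ~study, ~model, ~function_var, ~mechanism_described,
  "A",    1,      "interest",    "yes",
  "A",    1,      "control",     "no",
  "A",    2,      "interest",    "yes",
  "A",    2,      "control",     "yes",
  "A",    3,      "interest",    "no",
  "A",    3,      "control",     "yes",
  "A",    4,      "interest",    "yes",
  "A",    4,      "interest",    "no",
  "A",    4,      "control",     "no"
)

df_reliability <- df %>%
  group_by(study, model) %>%
  mutate(
    # mean() of a logical vector is the fraction of TRUE values,
    # i.e. the same quantity as count_yes / n in the two-step version.
    # If a model has no rows for a role, the fraction is NaN and
    # reliability becomes NA for that model.
    frac_interest = mean(mechanism_described[function_var == "interest"] == "yes"),
    frac_control  = mean(mechanism_described[function_var == "control"] == "yes"),
    reliability = case_when(
      frac_control == 1 & frac_interest != 1 ~ -99,
      frac_control != 1 & frac_interest != 1 ~ 2,
      frac_interest == 1 & frac_control != 1 ~ 3,
      frac_interest == 1 & frac_control == 1 ~ 4
    )
  ) %>%
  ungroup() %>%
  select(-frac_interest, -frac_control)
```

Since mutate() keeps every row and recycles the per-model fractions, the result already has one reliability value per original row, so the intermediate test table and left_join() are not needed in this sketch.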