I am doing a meta-analysis in R. For each study (variable StudyID) I have multiple effect sizes. For some studies I have the same effect size multiple times depending on the level of acquaintance (variable Familiarity) between the subjects.
head(dat)
studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex N published
1 1 3.0 5.0 1 0.0462 4 0 44 1
2 1 5.0 2.5 1 0.1335 4 0 44 1
3 1 2.5 3.0 1 -0.1239 4 0 44 1
4 1 2.5 3.5 1 0.2062 4 0 44 1
5 1 2.5 3.0 1 -0.0370 4 0 44 1
6 1 3.0 5.0 1 -0.3850 4 0 44 1
Those are the first rows of the data set. In total there are over 50 studies. Most studies look like study 1 with the same value in "Familiarity" for all effect sizes. In some studies, there are effect sizes with multiple levels of familiarity. For example study 36 as seen below.
head(dat)
studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex N published
142 36 1.0 4.5 0 0.1233 5.00 0 311 1
143 36 3.5 3.0 0 0.0428 5.00 0 311 1
144 36 1.0 4.5 0 0.0986 5.00 0 311 1
145 36 1.0 4.5 1 -0.0520 5.00 0 311 1
146 36 1.5 2.5 1 -0.0258 5.00 0 311 1
147 36 3.5 3.0 1 0.1104 5.00 0 311 1
148 36 1.0 4.5 1 0.0282 5.00 0 311 1
149 36 1.0 4.5 2 -0.1724 5.00 0 311 1
150 36 3.5 3.0 2 0.2646 5.00 0 311 1
151 36 1.0 4.5 2 -0.1426 5.00 0 311 1
152 37 3.0 4.0 1 0.0118 5.35 0 123 0
153 37 1.0 4.5 1 -0.3205 5.35 0 123 0
154 37 2.5 3.0 1 -0.2356 5.35 0 123 0
155 37 3.0 2.0 1 0.1372 5.35 0 123 0
156 37 2.5 2.5 1 -0.1401 5.35 0 123 0
157 37 3.0 3.5 1 -0.3334 5.35 0 123 0
158 37 2.5 2.5 1 0.0317 5.35 0 123 0
159 37 1.0 3.0 1 -0.3025 5.35 0 123 0
160 37 1.0 3.5 1 -0.3248 5.35 0 123 0
Now I want for those studies that include multiple levels of familiarity, to take the rows with only one level of familiarity (two seperate versions: one with the lower, one with the higher familiarity). I think that it can be possible with the package dplyr, but I have no real code so far.
In a second step I would like to give those rows unique studyIDs for each level of familiarity (so create out of study 36 three "different" studies).
Thank you in advance!
If you want to use dplyr, you could create an alternate ID or casenum by using group_indices
:
df <- df %>%
mutate(case_num = group_indices(.dots=c("studyID", "Familiarity")))
You could do:
library(dplyr)
df %>%
group_by(studyID) %>%
mutate(nDist = n_distinct(Familiarity) > 1) %>%
ungroup() %>%
mutate(
studyID = case_when(nDist ~ paste(studyID, Familiarity, sep = "_"), TRUE ~ studyID %>% as.character),
nDist = NULL
)
Output:
# A tibble: 19 x 9
studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex N published
<chr> <dbl> <dbl> <int> <dbl> <dbl> <int> <int> <int>
1 36_0 1 4.5 0 0.123 5 0 311 1
2 36_0 3.5 3 0 0.0428 5 0 311 1
3 36_0 1 4.5 0 0.0986 5 0 311 1
4 36_1 1 4.5 1 -0.052 5 0 311 1
5 36_1 1.5 2.5 1 -0.0258 5 0 311 1
6 36_1 3.5 3 1 0.110 5 0 311 1
7 36_1 1 4.5 1 0.0282 5 0 311 1
8 36_2 1 4.5 2 -0.172 5 0 311 1
9 36_2 3.5 3 2 0.265 5 0 311 1
10 36_2 1 4.5 2 -0.143 5 0 311 1
11 37 3 4 1 0.0118 5.35 0 123 0
12 37 1 4.5 1 -0.320 5.35 0 123 0
13 37 2.5 3 1 -0.236 5.35 0 123 0
14 37 3 2 1 0.137 5.35 0 123 0
15 37 2.5 2.5 1 -0.140 5.35 0 123 0
16 37 3 3.5 1 -0.333 5.35 0 123 0
17 37 2.5 2.5 1 0.0317 5.35 0 123 0
18 37 1 3 1 -0.302 5.35 0 123 0
19 37 1 3.5 1 -0.325 5.35 0 123 0
The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.