Selecting cases with the higher value of one variable for each case of another variable in R

Question

I am doing a meta-analysis in R. For each study (variable studyID) I have multiple effect sizes. For some studies I have the same effect size multiple times, depending on the level of acquaintance (variable Familiarity) between the subjects.

head(dat)
   studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex  N published
1       1        3.0        5.0           1  0.0462          4        0  44         1
2       1        5.0        2.5           1  0.1335          4        0  44         1
3       1        2.5        3.0           1 -0.1239          4        0  44         1
4       1        2.5        3.5           1  0.2062          4        0  44         1
5       1        2.5        3.0           1 -0.0370          4        0  44         1
6       1        3.0        5.0           1 -0.3850          4        0  44         1

Those are the first rows of the data set. In total there are over 50 studies. Most studies look like study 1, with the same value of Familiarity for all effect sizes. Some studies, however, have effect sizes at multiple levels of familiarity, for example study 36, shown below.

dat[142:160, ]
      studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex  N published
142      36        1.0        4.5           0  0.1233       5.00        0  311         1
143      36        3.5        3.0           0  0.0428       5.00        0  311         1
144      36        1.0        4.5           0  0.0986       5.00        0  311         1
145      36        1.0        4.5           1 -0.0520       5.00        0  311         1
146      36        1.5        2.5           1 -0.0258       5.00        0  311         1
147      36        3.5        3.0           1  0.1104       5.00        0  311         1
148      36        1.0        4.5           1  0.0282       5.00        0  311         1
149      36        1.0        4.5           2 -0.1724       5.00        0  311         1
150      36        3.5        3.0           2  0.2646       5.00        0  311         1
151      36        1.0        4.5           2 -0.1426       5.00        0  311         1
152      37        3.0        4.0           1  0.0118       5.35        0  123         0
153      37        1.0        4.5           1 -0.3205       5.35        0  123         0
154      37        2.5        3.0           1 -0.2356       5.35        0  123         0
155      37        3.0        2.0           1  0.1372       5.35        0  123         0
156      37        2.5        2.5           1 -0.1401       5.35        0  123         0
157      37        3.0        3.5           1 -0.3334       5.35        0  123         0
158      37        2.5        2.5           1  0.0317       5.35        0  123         0
159      37        1.0        3.0           1 -0.3025       5.35        0  123         0
160      37        1.0        3.5           1 -0.3248       5.35        0  123         0

Now, for the studies that include multiple levels of familiarity, I want to keep only the rows with a single level of familiarity (in two separate versions: one with the lower and one with the higher familiarity). I think this should be possible with the dplyr package, but I have no working code so far.

In a second step, I would like to give those rows a unique studyID for each level of familiarity (so that study 36 becomes three "different" studies).

Thank you in advance!

Answer 1

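If you want to use dplyr, you could create an alternate ID (case number) for every combination of studyID and Familiarity. Older dplyr versions provided group_indices() for this; from dplyr 1.0.0 on, cur_group_id() on a grouped data frame does the same job:

df <- df %>%
  group_by(studyID, Familiarity) %>%
  mutate(case_num = cur_group_id()) %>%  # requires dplyr >= 1.0.0
  ungroup()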
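To see what such a combined ID looks like, here is a small self-contained sketch. The data frame `toy` and its values are made up for illustration, and `cur_group_id()` assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Hypothetical toy data: study 36 has three familiarity levels, study 37 has one
toy <- data.frame(
  studyID     = c(36, 36, 36, 36, 37, 37),
  Familiarity = c(0, 0, 1, 2, 1, 1),
  p_t_cov     = c(0.12, 0.04, -0.05, -0.17, 0.01, -0.32)
)

toy_ids <- toy %>%
  group_by(studyID, Familiarity) %>%
  mutate(case_num = cur_group_id()) %>%  # one integer per studyID/Familiarity pair
  ungroup()

toy_ids$case_num
# 1 1 2 3 4 4
```

Note that this numbers every studyID/Familiarity combination from 1 upwards, so studies with a single familiarity level also receive a new number rather than keeping their original studyID.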

Answer 2

You could do:

library(dplyr)

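df %>%
  group_by(studyID) %>%
  # flag studies that contain more than one familiarity level
  mutate(nDist = n_distinct(Familiarity) > 1) %>%
  ungroup() %>%
  mutate(
    # flagged studies get "<studyID>_<Familiarity>"; all others keep their ID (as character)
    studyID = case_when(nDist ~ paste(studyID, Familiarity, sep = "_"), TRUE ~ as.character(studyID)),
    nDist = NULL  # drop the helper column
  )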

Output:

# A tibble: 19 x 9
   studyID A.C.Extent Visibility Familiarity p_t_cov group.size same.sex     N published
   <chr>        <dbl>      <dbl>       <int>   <dbl>      <dbl>    <int> <int>     <int>
 1 36_0           1          4.5           0  0.123        5           0   311         1
 2 36_0           3.5        3             0  0.0428       5           0   311         1
 3 36_0           1          4.5           0  0.0986       5           0   311         1
 4 36_1           1          4.5           1 -0.052        5           0   311         1
 5 36_1           1.5        2.5           1 -0.0258       5           0   311         1
 6 36_1           3.5        3             1  0.110        5           0   311         1
 7 36_1           1          4.5           1  0.0282       5           0   311         1
 8 36_2           1          4.5           2 -0.172        5           0   311         1
 9 36_2           3.5        3             2  0.265        5           0   311         1
10 36_2           1          4.5           2 -0.143        5           0   311         1
11 37             3          4             1  0.0118       5.35        0   123         0
12 37             1          4.5           1 -0.320        5.35        0   123         0
13 37             2.5        3             1 -0.236        5.35        0   123         0
14 37             3          2             1  0.137        5.35        0   123         0
15 37             2.5        2.5           1 -0.140        5.35        0   123         0
16 37             3          3.5           1 -0.333        5.35        0   123         0
17 37             2.5        2.5           1  0.0317       5.35        0   123         0
18 37             1          3             1 -0.302        5.35        0   123         0
19 37             1          3.5           1 -0.325        5.35        0   123         0
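The first step asked for in the question (keeping, per study, only the rows at the lower or the higher familiarity level) could be approached with a grouped filter. This is a minimal sketch, not a tested solution for the real data: `toy` is a made-up stand-in for `dat`, and `dat_low` / `dat_high` are hypothetical names for the two versions.

```r
library(dplyr)

# Hypothetical toy data standing in for `dat`
toy <- data.frame(
  studyID     = c(1, 1, 36, 36, 36, 36),
  Familiarity = c(1, 1, 0, 1, 1, 2),
  p_t_cov     = c(0.05, 0.13, 0.12, -0.05, 0.11, -0.17)
)

# Version 1: per study, keep only rows at the lowest familiarity level
dat_low <- toy %>%
  group_by(studyID) %>%
  filter(Familiarity == min(Familiarity)) %>%
  ungroup()

# Version 2: per study, keep only rows at the highest familiarity level
dat_high <- toy %>%
  group_by(studyID) %>%
  filter(Familiarity == max(Familiarity)) %>%
  ungroup()
```

Studies with a single familiarity level keep all of their rows in both versions, because for them the minimum and the maximum coincide.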

Answer 1
1 2020-03-04 20:49:34

Answer 2
0 Accepted 2020-03-04 20:36:44
