Can I list the unique values of one column while grouping by another column in R?

Question

I have the following columns:

 session  condition           codes    

      15 anxiety                 1       
      15 depression              1        
      15 bipolar                 1
      15 high blood pressure     3
      15 panic attacks           1
      66 hypertension            5
      66 high blood pressure     3
      66 anxiety                 1
      66 panic attacks           1
      75 schizophrenia           1
      32 muscular dystrophy      4
      32 anxiety                 1      
      32 depression              1
      32 panic attacks           1

I want to make a new column with just the unique codes per session and leave the rest of the rows for that session blank. I know this doesn't make sense logically, because the new column doesn't really match up with the others row by row. If the result needs to go in a new object, a list, or something else, that is fine.

 session  condition           codes     unique_codes

      15 anxiety                 1       1
      15 depression              1       3
      15 bipolar                 1
      15 high blood pressure     3
      15 panic attacks           1       
      66 hypertension            5       5
      66 high blood pressure     3       3
      66 anxiety                 1       1
      66 panic attacks           1
      75 schizophrenia           1       1
      32 muscular dystrophy      4       4
      32 anxiety                 1       1
      32 depression              1
      32 panic attacks           1

I have tried:

conditions=conditions %>%
  group_by(session)%>%
  mutate(unique_codes=unique(conditions$codes))

However, I get an error that says "must be length 5 (the group size) or one, not 4", which I assume is because I want the rest of the rows blank. Does anyone know a way around this? Thank you!!

Answer 1

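The lengths are the issue: inside mutate(), the result for each group must have length 1 or the group size. Note also that conditions$codes refers to the whole column rather than the current group, so use the bare column name codes instead. We can either paste the unique values together into one string per session or create a list column: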

library(dplyr)
conditions %>%
    group_by(session)%>% 
    mutate(unique_codes = toString(unique(codes)))
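The list-column alternative mentioned above is not shown in the answer. A minimal sketch, assuming one row per session is acceptable, could look like this; the name per_session and the trimmed-down data are illustrative assumptions, not part of the original post.

```r
library(dplyr)

# Trimmed-down stand-in for the `conditions` data (two sessions only)
conditions <- data.frame(
  session = c(15L, 15L, 15L, 15L, 15L, 66L, 66L, 66L, 66L),
  codes   = c(1L, 1L, 1L, 3L, 1L, 5L, 3L, 1L, 1L)
)

# One row per session; unique_codes is a list column,
# so each cell can hold a vector of any length
per_session <- conditions %>%
  group_by(session) %>%
  summarise(unique_codes = list(unique(codes)))

# Pull out the unique codes for the first session (session 15)
per_session$unique_codes[[1]]
```

Because every cell of a list column holds a whole vector, the length mismatch that triggers the error never arises.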

Another option is to make the lengths match by padding the unique values with NA at the end:

conditions %>%
   group_by(session) %>%
   mutate(unique_codes = `length<-`(unique(codes), n()))
# A tibble: 14 x 4
# Groups:   session [4]
#   session condition           codes unique_codes
#     <int> <chr>               <int>        <int>
# 1      15 anxiety                 1            1
# 2      15 depression              1            3
# 3      15 bipolar                 1           NA
# 4      15 high blood pressure     3           NA
# 5      15 panic attacks           1           NA
# 6      66 hypertension            5            5
# 7      66 high blood pressure     3            3
# 8      66 anxiety                 1            1
# 9      66 panic attacks           1           NA
#10      75 schizophrenia           1            1
#11      32 muscular dystrophy      4            4
#12      32 anxiety                 1            1
#13      32 depression              1           NA
#14      32 panic attacks           1           NA
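To see what the backquoted `length<-` call does on its own, here is a small base R sketch outside of dplyr; the vector names x, padded and y are illustrative only.

```r
# Two unique codes that need to fill a group of five rows
x <- c(1L, 3L)

# Calling the replacement function directly returns a copy of x
# extended to length 5, with NA in the new positions
padded <- `length<-`(x, 5)
padded

# The same thing written as an ordinary two-step assignment
y <- x
length(y) <- 5
identical(padded, y)
```

The number of unique values in a group can never exceed the group size, so inside mutate() this call only ever pads and does not truncate.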

The OP mentioned that n() was not working (this could be a dplyr version issue). In that case, length() should work:

conditions %>%
   group_by(session) %>%
   mutate(unique_codes = `length<-`(unique(codes), length(codes)))
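Since the question allows the result to live in a separate object or list, a base R alternative that avoids dplyr entirely is also possible. This is a different technique from the ones in the answer, sketched under the assumption that a named list keyed by session is acceptable; codes_by_session is a hypothetical name and the data is a trimmed-down stand-in.

```r
# Trimmed-down stand-in for the `conditions` data
conditions <- data.frame(
  session = c(15L, 15L, 15L, 15L, 15L, 66L, 66L, 66L, 66L, 75L),
  codes   = c(1L, 1L, 1L, 3L, 1L, 5L, 3L, 1L, 1L, 1L)
)

# split() groups the codes by session; lapply() keeps the unique ones per group.
# The list names are the session ids as character strings.
codes_by_session <- lapply(split(conditions$codes, conditions$session), unique)

codes_by_session[["66"]]
```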

Data

conditions <- structure(list(session = c(15L, 15L, 15L, 15L, 15L, 66L, 66L, 
66L, 66L, 75L, 32L, 32L, 32L, 32L), condition = c("anxiety", 
"depression", "bipolar", "high blood pressure", "panic attacks", 
"hypertension", "high blood pressure", "anxiety", "panic attacks", 
"schizophrenia", "muscular dystrophy", "anxiety", "depression", 
"panic attacks"), codes = c(1L, 1L, 1L, 3L, 1L, 5L, 3L, 1L, 1L, 
1L, 4L, 1L, 1L, 1L)), class = "data.frame", row.names = c(NA, 
-14L))

Answer 2

Another dplyr option could be:

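# `df` is the data frame from the question (named `conditions` in Answer 1)
df %>%
 group_by(session) %>%
 distinct(codes) %>%
 # number the unique codes within each session
 transmute(unique_codes = codes,
           rowid = 1:n()) %>%
 # number the original rows the same way, then match on session and position
 right_join(df %>%
            group_by(session) %>%
            mutate(rowid = 1:n()),
            by = c("session", "rowid")) %>%
 ungroup() %>%
 select(-rowid)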

   session unique_codes condition           codes
     <int>        <int> <chr>               <int>
 1      15            1 anxiety                 1
 2      15            3 depression              1
 3      15           NA bipolar                 1
 4      15           NA high blood pressure     3
 5      15           NA panic attacks           1
 6      66            5 hypertension            5
 7      66            3 high blood pressure     3
 8      66            1 anxiety                 1
 9      66           NA panic attacks           1
10      75            1 schizophrenia           1
11      32            4 muscular dystrophy      4
12      32            1 anxiety                 1
13      32           NA depression              1
14      32           NA panic attacks           1
