How can I add individual summary values per participant or group to a long dataframe in R, when the replacement is shorter than the original variable?

Question

I have a long dataset with about 6000 observations per participant. I would like to compute a count for one of my variables (the maximum count is 12) and add it to the dataframe as a new variable. However, only one value should be entered per participant; the remaining cells can be filled with NA.

I first created an empty variable and then tried the following mutation:

dfl$Hits <- NA

dfl$Hits <- dfl %>% 
  group_by(participant) %>% 
  filter(SpaceREsponseType == "Hit") %>% 
  count() %>%  
  mutate(id = cur_group_id()) %>% 
  mutate(id, na.rm = F)

I have also tried

dfl$Hits <- dfl %>% 
  group_by(participant) %>% 
  mutate(n = replace(rep(NA, n()), 1, sum(!is.na(SpaceREsponseType == "Hit")))) %>%
  ungroup

However, this results in the following error message:

Error:
! Assigned data ... %>% count() must be compatible with existing data.
✖ Existing data has 66619 rows.
✖ Assigned data has 142 rows.
ℹ Only vectors of size 1 are recycled.

What do I need to add to make this work?

Thanks in advance and best wishes, Jasmine

Answer 1

I have created a sample data frame.

The data are grouped by participant and Hit, and a row number is added. With mutate(n = n()), the Hits and NoHits are counted per participant.

After making the data wider, the condition is added with case_when().

Then the result is brought back into the original format.

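library(tidyverse)

# Sample data: 100 rows, each with a random participant and response type
df <- data.frame(
  participant = sample(c("A", "B", "C"), replace = TRUE, 100),
  Hit = sample(c("Hit", "NoHit"), replace = TRUE, 100)
)

df |>
  group_by(participant, Hit) |>
  # Row number and row count within each participant/Hit group
  mutate(rn = row_number()) |>
  mutate(n = n()) |>
  # One column per response type, holding the group count
  pivot_wider(names_from = Hit, values_from = n) |>
  ungroup() |>
  # Keep the count only where rn == 1; set it to NA elsewhere
  mutate(across(
    ends_with("it"),
    ~ case_when(
      rn == 1 ~ .x,
      rn > 1 ~ NA_integer_
    )
  )) |>
  # Back to long format, then drop the helper column
  pivot_longer(NoHit:Hit) |>
  select(-rn)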
#> # A tibble: 114 × 3
#>    participant name  value
#>    <chr>       <chr> <int>
#>  1 A           NoHit    21
#>  2 A           Hit      12
#>  3 B           NoHit    17
#>  4 B           Hit      17
#>  5 A           NoHit    NA
#>  6 A           Hit      NA
#>  7 C           NoHit    19
#>  8 C           Hit      14
#>  9 B           NoHit    NA
#> 10 B           Hit      NA
#> # … with 104 more rows
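
As a possible alternative to the pivoting approach above (a minimal sketch, not part of the original answer), the count can be computed inside a grouped mutate() and written only to each participant's first row. This keeps the original number of rows. The error in the question arises because the whole summarised tibble (one row per participant) is assigned to a single column of the full-length data frame. The column names participant and SpaceREsponseType are taken from the question; the toy data below are hypothetical.

```r
library(dplyr)

# Hypothetical toy data using the column names from the question
dfl <- data.frame(
  participant = c("p1", "p1", "p1", "p2", "p2"),
  SpaceREsponseType = c("Hit", "Miss", "Hit", "Miss", "Hit")
)

dfl <- dfl %>%
  group_by(participant) %>%
  mutate(
    # Number of "Hit" rows for the participant, kept on the first row only;
    # every other row of that participant gets NA
    Hits = if_else(
      row_number() == 1,
      sum(SpaceREsponseType == "Hit", na.rm = TRUE),
      NA_integer_
    )
  ) %>%
  ungroup()

print(dfl)
```

Because mutate() returns a data frame with the same number of rows as its input, the result is assigned to dfl as a whole rather than to dfl$Hits, so no length mismatch can occur.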
