
R: count the number of distinct values within a group using dplyr

I want to count the number of distinct colors for each ID value, and I want the result to be the original data frame plus an additional column called count. I took the following code from another post asking the same question, but it doesn't seem to work for me:

    ID= c('A', 'A', 'A', 'B', 'B', 'B')
    color=c('white', 'green', 'orange', 'white', 'green', 'green')

    d = data.frame (ID, color)
    d %>%
      group_by(ID) %>%
      mutate(count = n_distinct(color))

By running this code I got the following result:

      ID    color  count
      <fct> <fct>  <int>
    1 A     white      3
    2 A     green      3
    3 A     orange     3
    4 B     white      3
    5 B     green      3
    6 B     green      3

What I want instead is:

      ID    color  count
      <fct> <fct>  <int>
    1 A     white      3
    2 A     green      3
    3 A     orange     3
    4 B     white      2
    5 B     green      2
    6 B     green      2

Can someone tell me what I'm doing wrong or what is another way to do it using dplyr?

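Based on the comments above from @akrun and @DominicComtois, the code works once I call dplyr::mutate explicitly instead of the bare mutate. This suggests that another attached package (commonly plyr, whose mutate ignores grouping) was masking dplyr's mutate in my session.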
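The fix described above can be sketched as follows. This is a minimal example, assuming only that dplyr is installed; the name `result` is introduced here for illustration, and the guess that plyr was the masking package is an assumption, not something confirmed in the question.

```r
library(dplyr)

# Same data as in the question
d <- data.frame(
  ID    = c("A", "A", "A", "B", "B", "B"),
  color = c("white", "green", "orange", "white", "green", "green")
)

# Check which package the bare name `mutate` currently resolves to.
# If this is not "dplyr" (for example "plyr", an assumption about the
# asker's session), grouping is silently ignored by the bare call.
environmentName(environment(mutate))

# Qualifying the call with dplyr:: avoids the masking problem entirely
result <- d %>%
  group_by(ID) %>%
  dplyr::mutate(count = n_distinct(color)) %>%
  ungroup()

result
```

Calling `conflicts(detail = TRUE)` in the affected session is another way to see which attached packages define the same function names.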

Some notes:

    library(dplyr)

    # 1. Data set
    df <- data.frame(
      id    = c('A', 'A', 'A', 'B', 'B', 'B'),
      color = c('white', 'green', 'orange', 'white', 'green', 'green'))

    # 2. Desired result: number of distinct colors per id, added as a column
    df %>%
      group_by(id) %>%
      dplyr::mutate(count = n_distinct(color))

    # 3. A different quantity: number of rows for each (id, color) pair,
    #    with duplicate rows dropped by unique()
    df %>%
      group_by(id, color) %>%
      dplyr::mutate(count = n()) %>%
      unique()
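To make the difference between the two quantities in the notes concrete, here is a small sketch that computes each as a summary table. The names `per_id`, `per_pair`, and `n_colors` are hypothetical and chosen only for this illustration; the calls are namespace-qualified in case another package masks them.

```r
library(dplyr)

df <- data.frame(
  id    = c("A", "A", "A", "B", "B", "B"),
  color = c("white", "green", "orange", "white", "green", "green")
)

# One row per id: how many distinct colors that id has
per_id <- df %>%
  group_by(id) %>%
  dplyr::summarise(n_colors = n_distinct(color))

# One row per (id, color) pair: how many times that pair occurs
per_pair <- df %>%
  dplyr::count(id, color)

per_id
per_pair
```

Here `per_id` answers the original question (3 colors for A, 2 for B), while `per_pair` shows that green occurs twice for B and every other pair occurs once.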
