How to get the most repeated values or names to show in a data frame

I have an easy question related to the dplyr library in R.

My actual data frame looks like this:

Players <- data.frame(Group = c("A", "A", "A", "A", "B", "B", "B", "C","C","C"), Players= c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike","Carl", "Max", "Max","Max"))

   Group Players
      A    Jhon
      A    Jhon
      A    Jhon
      A  Charles
      B    Mike
      B    Mike
      B    Carl
      C     Max
      C     Max
      C     Max

I would like to get another data frame with the most repeated player in each group and how many times that player is listed. So I would like to get this data frame:

   Group Players TimesListed
       A    Jhon           3
       B    Mike           2
       C     Max           3

I have tried this:

    Station <- Players %>% group_by(Group,Players) %>% 
        summarise(TimesListed=length(Players)) %>% 
        summarise(TimesListed=max(TimesListed))

But I get a data frame without the names of the players, like this:

   Group TimesListed
1      A           3
2      B           2
3      C           3

Any idea? Thank you!

This should get you what you want:

library(dplyr)

Players %>% 
  group_by(Group) %>% 
  count(Players) %>% 
  top_n(1, n)

# A tibble: 3 x 3
# Groups:   Group [3]
   Group Players     n
  <fctr>  <fctr> <int>
1      A    Jhon     3
2      B    Mike     2
3      C     Max     3

You could do the following to convert the factors to characters:

Players[] <- lapply(Players, as.character)

And if you need to rename the variable n to TimesListed, add the following to the end of the chain:

rename(TimesListed = n)
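As a side note, top_n() has been superseded by slice_max() since dplyr 1.0.0. The following is a minimal sketch of the same idea, assuming dplyr 1.0.0 or later; it names the count column directly through the name argument of count(), so no separate rename() step is needed. The object name result is only illustrative.

```r
library(dplyr)

# Data frame from the question
Players <- data.frame(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Players = c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike", "Carl",
              "Max", "Max", "Max")
)

result <- Players %>%
  # one row per Group/Players pair, with the count stored in TimesListed
  count(Group, Players, name = "TimesListed") %>%
  group_by(Group) %>%
  # keep the row(s) with the highest count in each group (ties are kept)
  slice_max(TimesListed, n = 1, with_ties = TRUE) %>%
  ungroup()

result
```

Setting with_ties = FALSE would keep a single row per group even when two players share the highest count.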

You can use the aggregate function in base R. Note that this returns only the highest count per group, not the player's name:

aggregate(. ~ Group, Players, function(x) max(table(x)))
  Group Players
1     A       3
2     B       2
3     C       3
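The output above contains the counts but not the player names that the question asks for. One possible base R sketch that keeps the names is shown here; it counts each Group/Players pair with table() and then filters on the per-group maximum with ave(). The intermediate names counts and top are illustrative, not part of the original answer.

```r
# Data frame from the question
Players <- data.frame(
  Group   = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Players = c("Jhon", "Jhon", "Jhon", "Charles", "Mike", "Mike", "Carl",
              "Max", "Max", "Max")
)

# Count every Group/Players combination in long format
counts <- as.data.frame(
  table(Group = Players$Group, Players = Players$Players),
  responseName = "TimesListed",
  stringsAsFactors = FALSE
)

# Drop combinations that never occur in the data
counts <- counts[counts$TimesListed > 0, ]

# Keep rows whose count equals the maximum within their group (ties are kept)
group_max <- ave(counts$TimesListed, counts$Group, FUN = max)
top <- counts[counts$TimesListed == group_max, ]

# Sort by group and reset row names for a tidy printout
top <- top[order(top$Group), ]
rownames(top) <- NULL
top
```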

For completeness, here is a solution using data.table.

library(data.table)

setDT(Players)

Players[, .(TimesListed = .N), by = .(Group, Players)][
  , .SD[which.max(TimesListed)], by = Group]
#    Group Players TimesListed
# 1:     A    Jhon           3
# 2:     B    Mike           2
# 3:     C     Max           3

The above solution returns, for each group, the first row with the maximum TimesListed. If we want to return all rows equal to the maximum (that is, keep ties), we can do the following. In this case, the two solutions give the same result.

Players[, .(TimesListed = .N), by = .(Group, Players)][
  , .SD[TimesListed == max(TimesListed)], by = Group]
#    Group Players TimesListed
# 1:     A    Jhon           3
# 2:     B    Mike           2
# 3:     C     Max           3
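The two variants only differ when a group has a tie. A small sketch with made-up data (the table tied below is hypothetical and not part of the question) illustrates the difference:

```r
library(data.table)

# Hypothetical data: Jhon and Charles are both listed twice in group A
tied <- data.table(
  Group   = c("A", "A", "A", "A"),
  Players = c("Jhon", "Jhon", "Charles", "Charles")
)

# Count each Group/Players pair
counts <- tied[, .(TimesListed = .N), by = .(Group, Players)]

# which.max() keeps only the first row that reaches the maximum
first_only <- counts[, .SD[which.max(TimesListed)], by = Group]

# Comparing against max() keeps every row that reaches the maximum
all_ties <- counts[, .SD[TimesListed == max(TimesListed)], by = Group]

first_only  # one row: Jhon
all_ties    # two rows: Jhon and Charles
```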
