Frequency count of two columns in R

Question

I have two columns in a data frame.

I want to count the frequency of each combination of values in the two columns and get the result in this format:

    y  m  Freq
 2010  1     2
 2010  2     2
 2010  3     1
 2011  1     1
 2011  2     1

Answer 1

If your data is a data frame df with columns y and m:

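library(plyr)
# Split df by each (y, m) combination and count the rows in each piece
counts <- ddply(df, .(y, m), nrow)
# The count column is named V1 by default, so rename the columns
names(counts) <- c("y", "m", "Freq")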

Answer 2

I haven't seen a dplyr answer yet. The code is rather simple.

library(dplyr)
rename(count(df, y, m), Freq = n)
# Source: local data frame [5 x 3]
# Groups: V1 [?]
#
#       y     m  Freq
#   (int) (int) (int)
# 1  2010     1     2
# 2  2010     2     2
# 3  2010     3     1
# 4  2011     1     1
# 5  2011     2     1

Data:

df <- structure(list(y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 
2011L), m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)), .Names = c("y", "m"
), class = "data.frame", row.names = c(NA, -7L))
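
As a possible shortcut, recent dplyr releases let count() name its count column through the `name` argument, which would make the rename() step unnecessary. The sketch below assumes a dplyr version that supports `name` and rebuilds the same sample data so it can run on its own.

```r
library(dplyr)

# Same sample data as above, rebuilt so the snippet is self-contained
df <- data.frame(
  y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L),
  m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)
)

# count() groups by y and m; `name` sets the count column's name directly
counts <- count(df, y, m, name = "Freq")
counts
```

Combinations that never occur, such as y = 2011 with m = 3, are simply absent from the result.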

Answer 3

A more idiomatic data.table version of @ugh's answer would be:

library(data.table) # load package
df <- data.frame(y = c(rep(2010, 5), rep(2011, 2)), m = c(1, 1, 2, 2, 3, 1, 2)) # set up data
dt <- data.table(df) # convert to data.table
dt[, list(Freq = .N), by = list(y, m)] # .N is the row count per group; list() names the column directly

Answer 4

Using sqldf:

library(sqldf)
# table1 is the name of the data frame that holds the columns y and m
sqldf("SELECT y, m, COUNT(*) AS Freq
       FROM table1
       GROUP BY y, m")

Answer 5

If you have a very big data frame with many columns, or don't know the column names in advance, something like this might be useful:

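library(reshape2)
# table() cross-tabulates every column; melt() reshapes the table to long format
df_counts <- melt(table(df))
# Reuse the original column names; the last (count) column is named on the next line
names(df_counts) <- names(df)
colnames(df_counts)[ncol(df_counts)] <- "count"
df_counts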

     y m count
1 2010 1     2
2 2011 1     1
3 2010 2     2
4 2011 2     1
5 2010 3     1
6 2011 3     0
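
The same idea can be sketched in base R without reshape2. The helper below, with the hypothetical name count_combinations, works for any number of columns without knowing their names, and it drops the combinations that never occur (such as row 6 above). It relies on the `responseName` argument of as.data.frame() for tables.

```r
# Same sample data as in the other answers
df <- data.frame(
  y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L),
  m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)
)

# Hypothetical helper: count every combination of values across all columns
count_combinations <- function(data, count_name = "count") {
  # table() cross-tabulates all columns; responseName names the count column
  out <- as.data.frame(table(data), responseName = count_name,
                       stringsAsFactors = FALSE)
  # Keep only the combinations that actually occur in the data
  out <- out[out[[count_name]] > 0, , drop = FALSE]
  rownames(out) <- NULL
  out
}

df_counts <- count_combinations(df)
df_counts
```

Note that table() builds a cell for every possible combination, so this approach can use a lot of memory when there are many columns with many distinct values. The grouping columns come back as character vectors here because of stringsAsFactors = FALSE.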

Answer 6

Here is a simple base R solution using table() and as.data.frame():

df2 <- as.data.frame(table(df1))
# df2 
     y m Freq
1 2010 1    2
2 2011 1    1
3 2010 2    2
4 2011 2    1
5 2010 3    1
6 2011 3    0

df2[df2$Freq != 0, ]
# output
     y m Freq
1 2010 1    2
2 2011 1    1
3 2010 2    2
4 2011 2    1
5 2010 3    1

Data

df1 <- structure(list(y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 
                           2011L), m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)), .Names = c("y", "m"
                           ), class = "data.frame", row.names = c(NA, -7L))
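
One thing to watch with this approach is that as.data.frame(table(...)) returns y and m as factors, not integers. If numeric columns and the row order from the question are wanted, a conversion along these lines may help. This is a sketch that assumes the values are whole numbers.

```r
df1 <- data.frame(
  y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L),
  m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)
)

df2 <- as.data.frame(table(df1))
df2 <- df2[df2$Freq != 0, ]

# as.integer() on a factor returns the internal level codes,
# so convert through as.character() to recover the original values
df2$y <- as.integer(as.character(df2$y))
df2$m <- as.integer(as.character(df2$m))

# Sort by year, then month, and reset the row names
df2 <- df2[order(df2$y, df2$m), ]
rownames(df2) <- NULL
df2
```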

Answer 7

library(data.table)

oldformat <- data.table(oldformat)  ## your original data frame
newformat <- oldformat[, list(Freq = length(m)), by = list(y, m)]

Answer 8

Here is another approach that I found:

df<- structure(list(y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 
                           2011L), m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)), .Names = c("y", "m"
                           ), class = "data.frame", row.names = c(NA, -7L))

Two options:

aggregate(cbind(count = y) ~ m, 
          data = df, 
          FUN = function(x){NROW(x)})

or

aggregate(cbind(count = y) ~ m, 
          data = df, 
          FUN = length)
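
Both calls above group by m only, so they return one count per month and not one per (y, m) pair. A possible way to count both columns with aggregate() is sketched below. The helper column `n` is a hypothetical name introduced only so that the formula has something to count on its left-hand side.

```r
df <- data.frame(
  y = c(2010L, 2010L, 2010L, 2010L, 2010L, 2011L, 2011L),
  m = c(1L, 1L, 2L, 2L, 3L, 1L, 2L)
)

# Add a helper column of ones, then count its entries per (y, m) group
counts <- aggregate(n ~ y + m, data = transform(df, n = 1L), FUN = length)

# Rename the helper column to match the format asked for in the question
names(counts)[names(counts) == "n"] <- "Freq"

# Sort by year, then month
counts <- counts[order(counts$y, counts$m), ]
counts
```

Unlike the table() based answers, aggregate() only returns combinations that are present in the data, so no zero-count rows need to be filtered out.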

8 answers

Answer 1: 39 (accepted), 2012-06-04 10:40:28
Answer 2: 13, 2016-04-19 00:09:20
Answer 3: 8, 2015-05-25 13:40:02
Answer 4: 5, 2012-06-04 10:11:51
Answer 5: 4, 2012-06-04 14:23:47
Answer 6: 4, 2019-03-05 15:53:27
Answer 7: 3, 2013-01-04 23:12:11
Answer 8: -1, 2022-03-21 10:52:59
