Cut data by groups and create frequency table

I have a data frame df consisting of locations and hours.

### dummy data
set.seed(1)
location <- c("loc1", "loc2", "loc3")
locations <- sample(location, size=50, replace=TRUE)
hours <- runif(50, min=0, max=20)

df <- data.frame(locations, hours)

I can cut the data into hour blocks and create a frequency table of each block:

### cut data and create frequency table
c <- cut(df$hours, breaks=seq(0,20, by=1), include.lowest=TRUE)
t <- data.frame(table(c))
head(t)

      c Freq
1 [0,1]    0
2 (1,2]    4
3 (2,3]    2
4 (3,4]    0
5 (4,5]    4
6 (5,6]    2

But I can't get my head around grouping the data by locations first.

How do I use the locations variable to group the data to give an output like this?

  location    c Freq
1   loc1  [0,1]    x1
2   loc1  (1,2]    x2
3   loc1  (2,3]    x3
4   loc1  (3,4]    x4
5   loc1  (4,5]    x5
6   loc1  (5,6]    x6
    ...
    loc2  [0,1]    y1
    loc2  (1,2]    y2
    ...

You could try:

library(dplyr)
df %>% 
  mutate(hours = cut(hours, breaks=seq(0,20, by=1), include.lowest=TRUE)) %>%
  table() %>% data.frame() %>% arrange(locations, hours)

Which gives:

#  locations hours Freq
#1      loc1 [0,1]    0
#2      loc1 (1,2]    1
#3      loc1 (2,3]    1
#4      loc1 (3,4]    0
#5      loc1 (4,5]    0
#6      loc1 (5,6]    1

Alternatively, in base R, reusing the factor c created in the question:

t <- data.frame(table(df$locations, c))
head(t[order(t$Var1), ])


   Var1     c Freq
1  loc1 [0,1]    0
4  loc1 (1,2]    1
7  loc1 (2,3]    1
10 loc1 (3,4]    0
13 loc1 (4,5]    0
16 loc1 (5,6]    1

Or cbind them first. That is probably safer if you plan to work with the data later.
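
The cbind suggestion comes without code, so here is a minimal sketch of what it might look like, assuming the dummy data from the question. The names df2, hour_block and freq are illustrative and do not appear in the original posts.

```r
# Rebuild the question's dummy data so the sketch is self-contained
set.seed(1)
location  <- c("loc1", "loc2", "loc3")
locations <- sample(location, size = 50, replace = TRUE)
hours     <- runif(50, min = 0, max = 20)
df <- data.frame(locations, hours)

# Bind the hour block to the data frame as a new factor column
# (hour_block is a hypothetical column name chosen for this sketch)
df2 <- cbind(df, hour_block = cut(df$hours, breaks = seq(0, 20, by = 1),
                                  include.lowest = TRUE))

# Cross-tabulate location against hour block; converting the table
# to a data frame gives one row per combination with a Freq column
freq <- as.data.frame(table(locations = df2$locations,
                            hour_block = df2$hour_block))

# Sort so that all hour blocks of one location appear together
freq <- freq[order(freq$locations, freq$hour_block), ]
head(freq)
```

Keeping the block as a column means each row of df2 still carries its own location and hour block, so later filtering or reordering of the rows cannot misalign the two vectors, which may be what "safer" refers to.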
