How do I count the number of observations at given intervals in R?

I have data which includes variables for the hour, the minute, and the second of each observation. I want to count the number of observations before 3am, all observations before 6am, all observations before 9am, and so on. Any help on this would be hugely appreciated.

Example of the data:

day    hour    minute   second
01       17        10       03
01       17        14       20
01       17        25       27
01       17        32       39
01       17        33       40
01       17        34       10
01       17        34       14
01       17        34       16
01       17        34       21
01       17        34       23
01       17        34       25
01       17        34       31
01       17        34       36

I have about 300,000 observations like this.

hour  : int 17 17 17 17 17 17 17 17 17 17
minute: int 10 14 25 32 33 34 34 34 34 34
second: int 3 20 27 39 40 10 14 16 21 23

One approach is to create a new variable based on your binning criteria, then tabulate on that variable:

set.seed(1)
dat <- data.frame(hour = sample(0:23, 100, TRUE, prob = runif(24)),
                  minute = sample(0:59,100, TRUE, prob = runif(60)),
                  second = sample(0:59,100, TRUE, prob = runif(60)))

#Adjust bins accordingly
dat <- transform(dat, bin = ifelse(hour < 3,"Before 3",
                                   ifelse(hour < 6,"Before 6",
                                          ifelse(hour <9,"Before 9","Later in day"))))

as.data.frame(table(dat$bin))
          Var1 Freq
1     Before 3    7
2     Before 6   17
3     Before 9   19
4 Later in day   57

Depending on the number of bins you need, you may run into issues with the nested ifelse() statements, but that should give you a start. Update your question with more details if you get stuck.
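For many bins, base R's cut() may be easier to manage than nested ifelse() calls, and cumsum() turns the per-bin counts into the "everything before each cutoff" counts the question asks for. The following is only a sketch on made-up data; the 3-hour break points and the "Before N" labels are assumptions chosen to mirror the example above.

```r
# Hypothetical example data in the same shape as the question's
set.seed(1)
dat <- data.frame(hour   = sample(0:23, 100, TRUE),
                  minute = sample(0:59, 100, TRUE),
                  second = sample(0:59, 100, TRUE))

# Bin edges in hours; right = FALSE makes each bin [lower, upper)
breaks <- seq(0, 24, by = 3)
labels <- paste("Before", breaks[-1])

dat$bin <- cut(dat$hour, breaks = breaks, labels = labels, right = FALSE)

counts     <- table(dat$bin)   # observations within each 3-hour bin
cumulative <- cumsum(counts)   # all observations before each cutoff
cumulative
```

Changing the breaks vector is enough to get different intervals; the labels follow automatically.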

How about length(which(data$hour <= 2))? I used 2 o'clock here (hour <= 2 covers everything before 3am) to avoid having to deal with minutes and seconds in the first place. Then loop or apply over all the different hours you want to count.

If you need to restart your count every day, then make use of the data$day value similarly.
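One way the "loop or apply" step and the per-day restart could look is sketched below. The data frame is made up, and the cutoff hours (3, 6, ..., 24) and the names before and before_by_day are assumptions for illustration only.

```r
# Hypothetical data with a day column, shaped like the question's
set.seed(1)
data <- data.frame(day  = sample(1:3, 100, TRUE),
                   hour = sample(0:23, 100, TRUE))

cutoffs <- seq(3, 24, by = 3)   # 3am, 6am, ..., midnight

# Number of observations strictly before each cutoff hour
before <- sapply(cutoffs, function(h) sum(data$hour < h))
names(before) <- paste("Before", cutoffs)

# The same counts restarted for every day:
# one row per day, one column per cutoff
before_by_day <- t(sapply(split(data$hour, data$day),
                          function(h) sapply(cutoffs, function(k) sum(h < k))))
colnames(before_by_day) <- names(before)

before
before_by_day
```

Here sum(data$hour < h) does the same job as length(which(...)), because TRUE counts as 1 when summed.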

This approach gives you more flexibility if you decide you need different times. You can find n below any time point (not just hours). Because I'm lazy I made this work by treating everything as characters.

#1.  Create a fake data set as chase did
set.seed(1)
dat <- data.frame(hour = sample(0:23, 100, TRUE, prob = runif(24)),
                  minute = sample(0:59,100, TRUE, prob = runif(60)),
                  second = sample(0:59,100, TRUE, prob = runif(60)))

#2.  Create a function to turn your single digits double and everything into character 
dig <- function(x){ 
    ifelse(nchar(as.character(x))<2, paste("0", as.character(x), sep=""),
        as.character(x))
}

#3.  Use the dig function to make a character dataframe    
dat <- data.frame(sapply(dat, dig))

#4.  Paste hour, minute and second together and convert the result to a number (HHMMSS)
dat <- transform(dat, time=as.numeric(paste(hour, minute, second,sep="")))

#5.  function to compare that numeric time to the cut off time
n.obs <- function(var, hour='0', min='00', sec='00', pm=FALSE){
    hour <- if(pm) as.character(as.numeric(hour) + 12) else hour
    bench <- as.numeric(paste(hour, min, sec, sep=""))
    length(var[var<=bench])
}

#try it out
n.obs(dat$time, '2')
n.obs(dat$time, '2', pm=TRUE)
n.obs(dat$time, '14', pm=FALSE)  #same as above: 2 pm given as hour 14 with pm=FALSE
n.obs(dat$time, hour='14', min='30', pm=FALSE)
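The same comparison can also be done without any character handling by converting each observation to seconds since midnight. This is an alternative sketch, not the method used above; count_before and the secs column are hypothetical names, and the data is made up.

```r
# Hypothetical example data in the same shape as the question's
set.seed(1)
dat <- data.frame(hour   = sample(0:23, 100, TRUE),
                  minute = sample(0:59, 100, TRUE),
                  second = sample(0:59, 100, TRUE))

# Seconds since midnight for every observation
dat$secs <- dat$hour * 3600 + dat$minute * 60 + dat$second

# count_before is a hypothetical helper: number of observations
# at or before the given time of day (24-hour clock)
count_before <- function(secs, hour = 0, min = 0, sec = 0) {
  sum(secs <= hour * 3600 + min * 60 + sec)
}

count_before(dat$secs, hour = 2)
count_before(dat$secs, hour = 14, min = 30)
```

Like n.obs, this uses <= so an observation exactly at the cutoff is included; switch to < if the cutoff itself should be excluded.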
