Time series data: did an event occur at similar time points across multiple people?

I have time series data which records whether an event occurred (1) or not (0) during each second of an activity.

I need to identify chunks of time during which an event occurred for most people. I would like to describe this with a number indicating the strength of the "synchrony" between people.

An example of my dataset is here: [image]

In this example, we have 30 rows, where each row is one second of time. Generally, the blue boxes indicate a moderate level of synchrony between people and the red boxes indicate a higher level of synchrony between people. The spaces in-between the boxes would be a low, very low, or zero level of synchrony.

How do I analyse this?

I have looked into "synchrony analyses", time-lagged cross-correlations, and similar methods, but these do not quite answer my question. For example, this seemed promising but does not seem to scale up to hundreds of people.

I am open to suggestions and I am just not quite sure what is available or possible for this kind of data and question.

A few criteria:

  • Should be doable in R
  • I have hundreds of people in my dataset (i.e. hundreds of columns)
  • I do not have pre-defined time windows: I don't want to simply analyse every 5 seconds in discrete chunks
  • I would like a "rolling window" if possible: perhaps there could be a score for each second which considers values from the surrounding 5 or 10 seconds so that one could see how the synchrony increases and decreases over time.
  • Colour-coding is great but I need a statistical value

Please read the info at the top of the tag page, which provides guidance on posting R questions to SO. In particular, as mentioned there, questions should not use only images for input, since that means no one else can use the data without retyping it all. This time I have done it for you by creating some data in the Note at the end, which we shall use in place of the image shown in the question.

We assume that the rows are partitioned by runs of two or more consecutive all-zero rows. We call the subsets of the partition classes. A run of two or more all-zero rows is a low class. Any other class is medium if it contains an all-zero row (necessarily an isolated one) and high otherwise. We use 0, 1, 2 to represent low, medium and high. This rule seems consistent with the low/medium/high marked in the image in the question.

First determine which rows are all zero, giving is0. We can form a preliminary partition in which each consecutive run of all-zero rows is a class and each consecutive run of rows that are not all zero is a class. Then for each row we determine the length of the class it belongs to, giving len. Each all-zero row for which len > 1 (islow) is part of a low class and separates classes of the other types.

Now use islow to form the final groups g and determine for each such group whether it contains any all-zero rows, giving any0. We already know from islow which rows are low; of the rest, a row is medium if any0 is TRUE for its group and high otherwise.

The output shows the original data, the low/medium/high level as 0/1/2 and the grouping variable g.

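library(data.table) # for rleid()

# TRUE for rows in which every 0/1 column is zero (Filter drops the id column)
is0 <- rowSums(Filter(function(x) all(x %in% 0:1), DF)) == 0
# length of the run of equal is0 values that each row belongs to
len <- ave(is0, rleid(is0), FUN = length)
# low rows: all-zero rows in a run of two or more
islow <- is0 & len > 1
# final groups: consecutive runs of low / non-low rows
g <- rleid(islow)
# does the row's group contain any all-zero row?
any0 <- ave(is0, g, FUN = any)
data.frame(DF, level = ifelse(islow, 0, ifelse(any0, 1, 2)), g = g)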

   id a b level g
1   1 0 1     1 1
2   2 1 0     1 1
3   3 0 0     1 1
4   4 1 1     1 1
5   5 0 1     1 1
6   6 0 0     0 2
7   7 0 0     0 2
8   8 0 0     0 2
9   9 1 1     2 3
10 10 0 0     0 4
11 11 0 0     0 4
12 12 0 0     0 4
13 13 1 1     2 5
14 14 1 0     2 5
15 15 1 0     2 5
16 16 1 0     2 5
17 17 0 0     0 6
18 18 0 0     0 6
19 19 0 0     0 6
20 20 0 1     2 7
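
The steps above can be traced by hand on a smaller input. The following is a minimal sketch in base R, assuming a made-up is0 vector of seven rows; run_id() is a hypothetical helper standing in for data.table::rleid() so that the example needs no packages.

```r
# Hypothetical base-R stand-in for data.table::rleid(): numbers consecutive runs
run_id <- function(x) {
  r <- rle(x)
  rep(seq_along(r$lengths), r$lengths)
}

# Made-up example: TRUE marks an all-zero row
is0 <- c(FALSE, TRUE, FALSE, TRUE, TRUE, FALSE, FALSE)

len   <- ave(as.numeric(is0), run_id(is0), FUN = length)  # run lengths: 1 1 1 2 2 2 2
islow <- is0 & len > 1             # only rows 4-5 form a run of 2+ all-zero rows
g     <- run_id(islow)             # final groups: 1 1 1 2 2 3 3
any0  <- ave(is0, g, FUN = any)    # group 1 contains an isolated all-zero row
level <- ifelse(islow, 0, ifelse(any0, 1, 2))
level
# [1] 1 1 1 0 0 2 2
```

Row 2 is all zero but stands alone, so it does not split its neighbours; it only downgrades their group from high to medium.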

Note

set.seed(17)
n <- 20
DF <- data.frame(id = 1:n, 
  a = +(runif(n) > 0.75), 
  b = +(runif(n) > 0.75))
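
The question also asks for a per-second score from a rolling window, which the classification above does not provide. One possible sketch, not the method of this answer: take the proportion of people with an event in each second and smooth it with a centred moving average. The function name rolling_synchrony, the width argument and the toy matrix are assumptions made for illustration; the result is a descriptive score between 0 and 1, not a significance test.

```r
# Hypothetical helper: share of people with an event in each second,
# averaged over a centred window of `width` seconds (width must be odd).
rolling_synchrony <- function(m, width = 5) {
  stopifnot(width %% 2 == 1)
  prop <- rowMeans(m)                 # proportion of people with an event per second
  w <- rep(1 / width, width)          # equal weights for the moving average
  # centred moving average; the first and last (width - 1) / 2 values are NA
  as.numeric(stats::filter(prop, w, sides = 2))
}

# Made-up data: 30 seconds (rows) x 200 people (columns)
m <- matrix(0, nrow = 30, ncol = 200)
m[11:15, 1:160] <- 1                  # 80% of people have the event in seconds 11-15
m[3, 1:20] <- 1                       # 10% of people have the event in second 3

score <- rolling_synchrony(m, width = 5)
round(score[c(3, 12, 13, 20)], 2)
# [1] 0.02 0.64 0.80 0.00
```

Because the score is built from row means, the work grows only linearly with the number of columns, so hundreds of people should not be a problem. A wider window (for example width = 11) gives a smoother curve at the cost of more NA values at the edges.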
