I have a data set in which I'm trying to analyse how many times the same value appears consecutively. For example, based on the data below: "The value 1 appears 3 times in a row, from 1/1/2000 to 1/3/2000."
Example dataset:
date, value
1/1/2000,1
1/2/2000,1
1/3/2000,1
1/4/2000,3
1/5/2000,3
1/6/2000,1
1/7/2000,3
1/8/2000,3
1/9/2000,3
1/10/2000,3
How should the problem be approached in either R or Excel?
As mentioned, rle() will calculate run lengths. You can then use aggregate() to obtain the maximum run length for each grouping factor.
df <- structure(list(id = c("A", "A", "A", "B", "B"),
var = c("atc", "atc", "atc", "atc", "atc"),
val = c("aaa", "bbb", "ccc", "aaa", "eee")),
.Names = c("id","var", "val"), class = "data.frame",
row.names = c(NA, -5L))
# var and val are nonsense columns for padding
# How many times does each id appear sequentially?
df$run <- sequence(rle(df$id)$lengths)
df
  id var val run
1  A atc aaa   1
2  A atc bbb   2
3  A atc ccc   3
4  B atc aaa   1
5  B atc eee   2
aggregate(df, by = list(df$id), FUN = max)
  Group.1 id var val run
1       A  A atc ccc   3
2       B  B atc eee   2
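To connect this back to the data in the question, here is a minimal sketch that applies rle() to the date/value example and recovers the start and end date of every run. The names dat, runs, start_idx and end_idx are my own choices for illustration, and the dates are kept as plain character strings rather than parsed.

```r
# Example data from the question (dates kept as character strings)
dat <- data.frame(
  date  = c("1/1/2000", "1/2/2000", "1/3/2000", "1/4/2000", "1/5/2000",
            "1/6/2000", "1/7/2000", "1/8/2000", "1/9/2000", "1/10/2000"),
  value = c(1, 1, 1, 3, 3, 1, 3, 3, 3, 3),
  stringsAsFactors = FALSE
)

r <- rle(dat$value)                   # run values and run lengths
end_idx   <- cumsum(r$lengths)        # row index where each run ends
start_idx <- end_idx - r$lengths + 1  # row index where each run starts

# One row per run: the value, how long it ran, and its first and last date
runs <- data.frame(
  value  = r$values,
  length = r$lengths,
  start  = dat$date[start_idx],
  end    = dat$date[end_idx],
  stringsAsFactors = FALSE
)
runs

# Longest run for each distinct value
longest <- aggregate(length ~ value, data = runs, FUN = max)
longest
```

The first row of runs corresponds to the sentence in the question: value 1, length 3, from 1/1/2000 to 1/3/2000. Note that this assumes the rows are already sorted by date; if they are not, sort them first.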
In Excel, this can be done with an array formula.
Suppose your values are in column B, say in the range B2:B31, and the value you want to check for is in cell E3. You could then use the following formula:
=MAX(FREQUENCY(IF($B$2:$B$31=E3,ROW($B$2:$B$31)),IF($B$2:$B$31<>E3,ROW($B$2:$B$31))))
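Enter it as an array formula (that is, after typing it, press CTRL+SHIFT+ENTER instead of ENTER). FREQUENCY counts how many matching rows fall between consecutive non-matching rows, so MAX returns the length of the longest run of the value in E3.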
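If you want to cross-check the Excel result, the same "longest run of one specific value" can be sketched in R. Here value holds the column from the question and target plays the role of cell E3; both names are assumptions made for this illustration.

```r
value  <- c(1, 1, 1, 3, 3, 1, 3, 3, 3, 3)  # the value column from the question
target <- 3                                # the value to check, like cell E3

# Runs of TRUE mark consecutive rows equal to the target
r <- rle(value == target)

# Keep only the TRUE runs; c(0, ...) covers the case where the target never occurs
longest_run <- max(c(0, r$lengths[r$values]))
longest_run
```

For the example data this gives 4 for the value 3, and changing target to 1 gives 3.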
Hope this does the trick!