Marking outliers in time series data

Question

I have a df with thousands of entries of a particular lab value for patients, with each row representing one instance when the lab was taken. I am interested in looking at the change in this value over time after a surgery. If the value rises and falls back to baseline within an acute time period, I need to exclude the rise; however, if it rises and stays above baseline, I need to keep those values. I am able to mark whether the value rises past a certain threshold within a time period, but I'm unsure how to code whether it returns to baseline within a particular range of time. My ultimate goal is to use geom_smooth to trend the value over time by procedure type, but I need to exclude these outliers for my graphs to be correct. Any help would be very appreciated!

My data is organized like this:

lab Date	Lab value	study ID	Acutely Past threshold
1/1/2001	2	1	NA
4/1/2001	2.3	1	N
5/2/2002	2.3	1	N
4/8/2018	1	2	NA
4/9/2018	3.8	2	Y
4/15/2018	1	2	N
5/1/2016	1.0	3	NA
5/2/2016	1.2	3	N
4/1/1997	1.0	4	NA
4/4/1997	2.5	4	Y
5/5/1997	2.5	4	N

Answer 1

For future reference, when posting data it is better to use dput so that the example is reproducible. I think something like this might work. You need to identify the "episodes" in which the value went over the threshold. In this code, the output I think you're looking for is the "episode" column.

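library(dplyr)

# normal_level and threshold_you_want are placeholders you need to define yourself
df %>%
  group_by(id) %>%
  mutate(
    # 0 when at or below normal_level; otherwise the id of the run of consecutive elevated readings
    potential_episode_grp = (lab_value > normal_level) * data.table::rleid(lab_value > normal_level)
  ) %>%
  group_by(id, potential_episode_grp) %>%
  # episode = 1 for every row of an elevated run that crosses threshold_you_want at least once
  mutate(episode = as.integer(potential_episode_grp > 0 & any(lab_value > threshold_you_want))) %>%
  ungroup()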
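
The code above marks episodes but does not check how quickly the value comes back down. Here is a rough sketch of one way to add that time check, using the rows from the question. Several things here are my own assumptions and not from the original post: the column names (id, lab_date, lab_value), treating each patient's first reading as the baseline, a rise of more than 1 unit above baseline counting as "elevated", and a 14-day acute window. Adjust those to whatever definitions apply to your data.

```r
library(dplyr)

# Rows from the question; column names are assumed for illustration
df <- data.frame(
  id = c(1, 1, 1, 2, 2, 2, 3, 3, 4, 4, 4),
  lab_date = as.Date(
    c("1/1/2001", "4/1/2001", "5/2/2002", "4/8/2018", "4/9/2018", "4/15/2018",
      "5/1/2016", "5/2/2016", "4/1/1997", "4/4/1997", "5/5/1997"),
    format = "%m/%d/%Y"
  ),
  lab_value = c(2, 2.3, 2.3, 1, 3.8, 1, 1.0, 1.2, 1.0, 2.5, 2.5)
)

rise <- 1          # assumed: "elevated" means more than 1 unit above baseline
window_days <- 14  # assumed: length of the acute window in days

flagged <- df %>%
  arrange(id, lab_date) %>%
  group_by(id) %>%
  mutate(
    baseline  = first(lab_value),               # assumed: first reading is the baseline
    elevated  = lab_value > baseline + rise,
    # run id: increases by one each time `elevated` flips within a patient
    run       = cumsum(elevated != lag(elevated, default = first(elevated))),
    next_date = lead(lab_date)                  # date of the following reading, NA if none
  ) %>%
  group_by(id, run) %>%
  mutate(
    # first reading after the run ends; NA means no later reading was observed
    return_date = last(next_date),
    # transient: elevated run followed by a non-elevated reading within the window
    transient = elevated & !is.na(return_date) &
      as.numeric(return_date - first(lab_date), units = "days") <= window_days
  ) %>%
  ungroup()

# Rows to keep for plotting, e.g. with geom_smooth
plot_data <- filter(flagged, !transient)
```

With these assumptions, only the 4/9/2018 reading for study ID 2 is flagged as transient, because the value is back at baseline six days later. The rise for study ID 4 is kept because no later reading shows a return to baseline. Note that a run with no later reading at all is treated as "stays elevated", which may or may not be what you want.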
