I have an xts time series object with numeric values for the data.
str(dataTS)
An 'xts' object on 2014-02-14 14:27:00/2014-02-28 14:22:00 containing:
  Data: num [1:4032, 1] 51.8 44.5 41.2 48.6 46.7 ...
  Indexed by objects of class: [POSIXlt,POSIXt] TZ:
  xts Attributes:
 NULL
I want to find the data points that are more than (2 * sd) away from the mean. I would like to create a new dataset from them.
                        [,1]
2015-02-14 14:27:00   51.846
2015-02-14 14:32:00   44.508
2016-02-14 14:37:00   41.244
2015-02-14 14:42:00   48.568
2015-02-14 14:47:00   46.714
2015-02-14 14:52:00   44.986
2015-02-14 14:57:00   49.108
2015-02-14 15:02:00 1000.470
2015-02-14 15:07:00   53.404
2015-02-14 15:12:00   45.400
2015-02-14 15:17:00    3.216
2015-02-14 15:22:00  49.7204
This is the time series. I want to subset the outliers 3.216 and 1000.470.
You can scale your data to have zero mean and unit standard deviation. You can then directly identify individual observations that lie 2 or more standard deviations away from the mean.
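To make the scaling step concrete, here is a minimal sketch showing that base R's scale() with its default arguments gives the same result as subtracting the mean and dividing by the sample standard deviation. The vector x is a made-up toy input, not data from the question.

```r
# Hypothetical toy vector, used only for illustration
x <- c(2, 4, 4, 4, 5, 5, 7, 9)

# Manual z-scores: centre on the mean, divide by the sample sd
z_manual <- (x - mean(x)) / sd(x)

# scale() returns a one-column matrix, so flatten it to a plain vector
z_scale <- as.numeric(scale(x))

# Both approaches should agree up to floating point tolerance
all.equal(z_manual, z_scale)
```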
As an example, I randomly sample some data from a Cauchy distribution.
set.seed(2010);
smpl <- rcauchy(10, location = 4, scale = 3);
To illustrate, I store the sample data and scaled sample data in a data.frame; I also mark observations that are 2 or more standard deviations away from the mean.
library(tidyverse);
# Scale to zero mean and unit sd, then flag |z| >= 2 so that
# observations far below the mean are caught as well
df <- data.frame(Data = smpl) %>%
    mutate(
        Data.scaled = as.numeric(scale(Data)),
        deviation_greater_than_2sd = abs(Data.scaled) >= 2);
df;
# Data Data.scaled deviation_greater_than_2sd
#1 8.007951 -0.2639689 FALSE
#2 -34.072054 -0.5491882 FALSE
#3 465.099800 2.8342104 TRUE
#4 7.191778 -0.2695010 FALSE
#5 2.383882 -0.3020890 FALSE
#6 3.544079 -0.2942252 FALSE
#7 -7.002769 -0.3657119 FALSE
#8 4.384503 -0.2885287 FALSE
#9 15.722492 -0.2116796 FALSE
#10 4.268082 -0.2893179 FALSE
We can also visualise the distribution of Data.scaled:
ggplot(df, aes(Data.scaled)) + geom_histogram();
The "outlier" is 2.8 units of standard deviation away from the mean.
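The same idea can be applied to an xts object to build a new dataset of outliers. The following is a minimal sketch, not a definitive implementation. It assumes the xts package is installed, and it rebuilds a stand-in dataTS from the 12 values listed in the question on an assumed regular 5-minute index in UTC (the real object has 4032 rows and its own index).

```r
library(xts)

# Stand-in for the asker's dataTS: the 12 values from the question,
# placed on an assumed regular 5-minute grid (hypothetical index)
idx <- seq(as.POSIXct("2015-02-14 14:27:00", tz = "UTC"),
           by = "5 min", length.out = 12)
vals <- c(51.846, 44.508, 41.244, 48.568, 46.714, 44.986,
          49.108, 1000.470, 53.404, 45.400, 3.216, 49.7204)
dataTS <- xts(vals, order.by = idx)

# z-scores of the underlying values
z <- as.numeric(scale(coredata(dataTS)))

# New xts object holding only the rows at least 2 sd from the mean,
# in either direction
outliers <- dataTS[abs(z) >= 2]
outliers
```

On this 12-point excerpt only 1000.470 is selected. That single large value pulls the mean up to roughly 123 and inflates the standard deviation to roughly 277, so 3.216 ends up less than half a standard deviation below the mean. Whether 3.216 is flagged on the full series depends on the mean and standard deviation of all 4032 observations.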