[英]SPC - Control Charts by Group in R
I want to create a statistical process control chart for each Name in this dataframe and extract the rows that are out of control for each individual Name. 我想为该数据框中的每个名称创建一个统计过程控制图,并提取每个名称不受控制的行。
Below is the dataframe: 以下是数据框:
DATE <- as.Date(c('2016-06-18', '2016-06-19', '2016-06-20',
'2016-06-21', '2016-06-22', '2016-06-23',
'2016-06-24', '2016-06-25', '2016-06-26',
'2016-06-27', '2016-06-28', '2016-06-29',
'2016-06-30', '2016-06-18', '2016-06-19',
'2016-06-20', '2016-06-21', '2016-06-22',
'2016-06-23', '2016-06-24', '2016-06-25',
'2016-06-26', '2016-06-27', '2016-06-28',
'2016-06-29', '2016-06-30', '2016-06-18',
'2016-06-19', '2016-06-20', '2016-06-21',
'2016-06-22', '2016-06-23', '2016-06-24',
'2016-06-25', '2016-06-26', '2016-06-27',
'2016-06-28', '2016-06-29', '2016-06-30'))
Name <- c('A', 'A', 'A', 'A', 'A', 'A','A', 'A', 'A', 'A', 'A', 'A', 'A',
'B', 'B', 'B', 'B', 'B', 'B','B', 'B', 'B', 'B', 'B', 'B', 'B',
'C', 'C', 'C', 'C', 'C', 'C','C', 'C', 'C', 'C', 'C', 'C', 'C')
Revenue <- c(0.08, 0.03, 0.09, 2, 0.09, 0.29, 0.56, 0.23, 0.12, 0.76, 0.23, 0.45,0.32,
0.10, 0.14, 0.80, 0.3, 0.12, 0.75, 0.20, 0.09, 0.22, 0.11, 4, 0.30, 0.45,
0.19, 0.23, 0.56, 0.77, 0.9, 9, 0.38, 0.11, 0.98, 0.87, 0.09, 0.20, 0.65)
df<-data.frame(DATE, Name, Revenue)
dput(df)
structure(list(DATE = structure(c(16970, 16971, 16972, 16973,
16974, 16975, 16976, 16977, 16978, 16979, 16980, 16981, 16982,
16970, 16971, 16972, 16973, 16974, 16975, 16976, 16977, 16978,
16979, 16980, 16981, 16982, 16970, 16971, 16972, 16973, 16974,
16975, 16976, 16977, 16978, 16979, 16980, 16981, 16982), class = "Date"),
Name = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L, 3L), .Label = c("A",
"B", "C"), class = "factor"), Revenue = c(0.08, 0.03, 0.09,
2, 0.09, 0.29, 0.56, 0.23, 0.12, 0.76, 0.23, 0.45, 0.32,
0.1, 0.14, 0.8, 0.3, 0.12, 0.75, 0.2, 0.09, 0.22, 0.11, 4,
0.3, 0.45, 0.19, 0.23, 0.56, 0.77, 0.9, 9, 0.38, 0.11, 0.98,
0.87, 0.09, 0.2, 0.65)), .Names = c("DATE", "Name", "Revenue"),
row.names = c(NA, -39L), class = "data.frame")
df
> df
DATE Name Revenue
1 2016-06-18 A 0.08
2 2016-06-19 A 0.03
3 2016-06-20 A 0.09
4 2016-06-21 A 2.00
5 2016-06-22 A 0.09
6 2016-06-23 A 0.29
7 2016-06-24 A 0.56
8 2016-06-25 A 0.23
9 2016-06-26 A 0.12
10 2016-06-27 A 0.76
11 2016-06-28 A 0.23
12 2016-06-29 A 0.45
13 2016-06-30 A 0.32
14 2016-06-18 B 0.10
15 2016-06-19 B 0.14
16 2016-06-20 B 0.80
17 2016-06-21 B 0.30
18 2016-06-22 B 0.12
19 2016-06-23 B 0.75
20 2016-06-24 B 0.20
21 2016-06-25 B 0.09
22 2016-06-26 B 0.22
23 2016-06-27 B 0.11
24 2016-06-28 B 4.00
25 2016-06-29 B 0.30
26 2016-06-30 B 0.45
27 2016-06-18 C 0.19
28 2016-06-19 C 0.23
29 2016-06-20 C 0.56
30 2016-06-21 C 0.77
31 2016-06-22 C 0.90
32 2016-06-23 C 9.00
33 2016-06-24 C 0.38
34 2016-06-25 C 0.11
35 2016-06-26 C 0.98
36 2016-06-27 C 0.87
37 2016-06-28 C 0.09
38 2016-06-29 C 0.20
39 2016-06-30 C 0.65
I want to write a script that will create a control chart using the qcc
package for each Name and then callout the rows that are beyond the limits for the control chart of each name. 我想编写一个脚本,该脚本将使用
qcc
包为每个名称创建一个控制图 ,然后qcc
超出每个名称的控制图限制的行。
So I know how to make a normal individual control chart for each name separately which would be this. 所以我知道如何分别为每个名称制作一个普通的单独控制图。 Name: "A" for example:
名称:“ A”,例如:
library(dplyr)
library(qcc)
target <- c("A")
testing<-filter(df, Name %in% target)
library(qcc)
my.xmr.x <- qcc(testing$Revenue, type = "xbar.one", plot=TRUE)
beyond.limits(my.xmr.x)
Result: 结果:
Result is one data point beyond the limits being just for Name A : 结果是一个数据点超出了名称A的限制:
6
My goal is to write a script that will extract the rows for all names(not just the row number like 6). 我的目标是编写一个脚本,该脚本将提取所有名称的行(而不只是像6这样的行号)。
So here is what I want the output to look like in the end: 所以这就是我希望输出最终看起来像什么:
Out_of_Control_Rows_By_Name
DATE Name Revenue
6 2016-06-21 A 2.00
11 2016-06-28 B 4.00
6 2016-06-23 C 9.00
Any help would be great, thanks! 任何帮助将是巨大的,谢谢!
require(pacman) # you don't have to use this; it's my package manager
p_load(dplyr,qcc)
tmp <- df[0,]
for(i in unique(df$Name)){
testing<-filter(df, Name == i)
my.xmr.x <- qcc(testing$Revenue, type = "xbar.one", plot=TRUE)
beyond.limits(my.xmr.x)
df2 <- df[df$Name == i,]
tmp <- rbind(tmp,df2[beyond.limits(my.xmr.x),])
}
DATE Name Revenue 4 2016-06-21 A 2 24 2016-06-28 B 4 32 2016-06-23 C 9
声明:本站的技术帖子网页,遵循CC BY-SA 4.0协议,如果您需要转载,请注明本站网址或者原文地址。任何问题请咨询:yoyou2525@163.com.