
How to create a confusion matrix containing multiple judgments in R?

I've got a data set from two raters judging a set of videoclips on multiple (binary) criteria. I'd like to plot a confusion matrix to better understand their agreement/disagreement. But all the examples I've found so far are for cases where each judge rates only one criterion per clip. In my case, judges rate every criterion for each clip.

Say I have 4 binary criteria (A_Con..A_Mod), judged by two raters (A and B), for a set of videoclips (in this case 80):

> str(mydata)
'data.frame':   160 obs. of  6 variables:
 $ A_Con: int  0 0 0 0 0 0 0 0 0 0 ...
 $ A_Dom: int  0 0 0 1 0 0 0 0 0 0 ...
 $ A_Met: int  0 0 0 0 0 0 1 0 0 1 ...
 $ A_Mod: int  0 0 0 1 0 1 0 0 0 1 ...
 $ Rater: Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2 2 2 2 ...
 $ Clip : int  1 2 3 4 5 6 7 8 9 10 ...

I can melt this into:

> str(mymolten)
'data.frame':   640 obs. of  4 variables:
 $ Rater   : Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2 2 2 2 ...
 $ Clip    : int  1 2 3 4 5 6 7 8 9 10 ...
 $ variable: Factor w/ 4 levels "A_Con","A_Dom",..: 1 1 1 1 1 1 1 1 1 1 ...
 $ value   : int  0 0 0 0 0 0 0 0 0 0 ...

But I can't figure out how to cast it into a confusion matrix that would count the combinations (which are not nearly so perfect as this):

                        Rater B
              A_Con  A_Dom  A_Met  A_Mod
         A_Con  19      1      0      0
Rater A  A_Dom   1     20      0      0
         A_Met   0      0     20      5
         A_Mod   0      2      0     20

It seems like the table() function is the way to go, but how should I format the data?

This may not be the simplest solution, but you can keep only the positive judgments, separate the data for the two raters, and merge the resulting data.frames by clip.

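# Sample data: 80 clips, each judged by raters A and B on four binary criteria
set.seed(1)  # make the random sample reproducible
n <- 80
d0 <- data.frame(
  A_Con = round(runif(2*n)),
  A_Dom = round(runif(2*n)),
  A_Met = round(runif(2*n)),
  A_Mod = round(runif(2*n)),
  Rater = rep(c("A","B"), n),
  Clip = rep(1:n, each=2)
)

library(reshape2)
library(plyr)
# Long format: one row per rater, clip and criterion
d <- melt(d0, id.vars=c("Rater","Clip"))
# Keep only the criteria that were marked as present
d <- d[ d$value==1, ]
# Split by rater and keep just the clip and the criterion
A <- d[d$Rater=="A",]
B <- d[d$Rater=="B",]
A <- data.frame( Clip=A$Clip, A=A$variable )
B <- data.frame( Clip=B$Clip, B=B$variable )
# Pair every criterion marked by A with every criterion marked by B for the same clip
d <- merge(A, B, by="Clip", all=FALSE)
# Count the clips for each (A, B) combination, then spread the counts into a matrix
d <- ddply( d, c("A", "B"), summarize, n=length(Clip) )
dcast( d, A ~ B, value.var="n", fill=0 )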
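
As a possible alternative that stays closer to the table() idea raised in the question, here is a minimal base R sketch that needs no extra packages. It assumes the data has the same layout as mydata (one row per rater and clip, one 0/1 column per criterion). The sample data, the seed, and the names crit, long, pos, pairs and conf are made up for illustration.

```r
# Hypothetical sample data in the same layout as the question's `mydata`
set.seed(42)
n <- 80
crit <- c("A_Con", "A_Dom", "A_Met", "A_Mod")
d0 <- data.frame(
  matrix(rbinom(2 * n * length(crit), 1, 0.5), ncol = length(crit),
         dimnames = list(NULL, crit)),
  Rater = rep(c("A", "B"), n),
  Clip  = rep(1:n, each = 2)
)

# Long format: one row per rater, clip and criterion
long <- data.frame(
  Clip     = rep(d0$Clip, times = length(crit)),
  Rater    = rep(d0$Rater, times = length(crit)),
  variable = factor(rep(crit, each = nrow(d0)), levels = crit),
  value    = unlist(d0[crit], use.names = FALSE)
)

# Keep only the positive judgments and split them by rater
pos <- long[long$value == 1, ]
A <- pos[pos$Rater == "A", c("Clip", "variable")]
B <- pos[pos$Rater == "B", c("Clip", "variable")]

# Pair every criterion marked by A with every criterion marked by B per clip
pairs <- merge(A, B, by = "Clip", suffixes = c(".A", ".B"))

# Cross-tabulate the pairs: rows are rater A's criteria, columns rater B's
conf <- table("Rater A" = pairs$variable.A, "Rater B" = pairs$variable.B)
conf
```

Because variable is a factor with all four levels, table() keeps rows and columns for criteria that never co-occur and fills them with 0. If the rows for the two raters are already aligned by clip, crossprod() on the two 0/1 matrices (rater A's columns against rater B's) should give the same counts.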
