Suppose I have two vectors:
X1 <- c(44350, 38920, 37530, 42280, 37320, 36910, 35720, 31220, 33400, 40710, 43830, 37390, 32340, 30770, 35800, 40250, 31490, 40460, 33730, 35850, 35320, 37500, 35380, 40910, 29040, 33950)
X2 <- c(30390, 34170, 28910, 30660, 32510, 30540, 31990, 32380, 32110, 31260, 34670, 28240, 31840, 33350, 32150, 35640, 30730, 30280, 29420, 30990, 32880, 33280, 36960, 36990)
I am interested in counting all pairs (a, b), with a taken from X1 and b taken from X2, for which a < b.
How would I do this in R for any two vectors?
Now, suppose I wish to combine both vectors, shuffle them, split the result into two vectors of lengths length(X1) and length(X2), and then count as above, creating a randomized distribution to compare against the initial count.
How would this be done?
Count the pairs (a, b), with a in X1 and b in X2, where a < b:
(s <- sum(outer(X1,X2,`<`)))
# [1] 106
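To see what the outer() call is doing, here is a minimal sketch on two small made-up vectors. The names a, b, m and n_less are only for illustration and are not part of the question's data.

```r
# Toy vectors (hypothetical, chosen so the result is easy to check by hand)
a <- c(1, 5, 9)
b <- c(4, 6)

# outer() builds a length(a) x length(b) logical matrix;
# entry [i, j] is TRUE when a[i] < b[j]
m <- outer(a, b, `<`)

# Each TRUE is one qualifying pair, so sum() gives the pairwise count
n_less <- sum(m)
```

The matrix has one row per element of a and one column per element of b, so memory grows with length(a) * length(b); for vectors of the size in the question that is not a concern.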
Combine the two vectors:
X <- c(X1,X2)
Shuffle, resplit, and recount many times:
set.seed(1)
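r <- replicate(1000000, {
  # Assignments inside the braces are local to each replicate
  X <- sample(X)                 # shuffle the pooled values
  X1 <- head(X, length(X1))      # first length(X1) values
  X2 <- tail(X, length(X2))      # remaining length(X2) values
  sum(outer(X1, X2, `<`))        # pairwise count for this shuffle
})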
# One-sided p-value: fraction of shuffles whose count is strictly below s
(p <- 1 - sum(s <= r)/length(r))
# [1] 1e-05
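For "any two vectors", the steps above can be wrapped in a function. This is only a sketch: count_less, perm_count_test, n_perm and p_lower are hypothetical names, and the lower-tail proportion here counts shuffles at or below the observed value, which is one convention among several (the code above uses a strict inequality).

```r
# Count pairs (x[i], y[j]) with x[i] < y[j]
count_less <- function(x, y) sum(outer(x, y, `<`))

# Permutation sketch: shuffle the pooled values, resplit, recount
perm_count_test <- function(x, y, n_perm = 2000) {
  s <- count_less(x, y)          # observed count
  pooled <- c(x, y)
  n_x <- length(x)
  r <- replicate(n_perm, {
    idx <- sample(length(pooled))            # random permutation of positions
    first <- idx[seq_len(n_x)]               # positions assigned to group 1
    rest <- idx[-seq_len(n_x)]               # positions assigned to group 2
    count_less(pooled[first], pooled[rest])
  })
  # p_lower: fraction of shuffles with a count at or below the observed one
  list(observed = s, perm = r, p_lower = mean(r <= s))
}

# Usage on toy data (hypothetical values)
set.seed(42)
res <- perm_count_test(c(1, 2, 3), c(10, 11, 12, 13), n_perm = 200)
res$observed
```

Indexing by a shuffled position vector avoids overwriting x and y inside the loop, which keeps the split sizes explicit. Increase n_perm when the p-value of interest is very small.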
Note: It sounds like you might be looking for something like the Mann-Whitney test, which is available in R as wilcox.test.
> wilcox.test(X2,X1)

	Wilcoxon rank sum test

data:  X2 and X1
W = 106, p-value = 2.858e-05
alternative hypothesis: true location shift is not equal to 0
Compare W = 106 to s = 106 above.
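The agreement between W and s is not a coincidence: for wilcox.test(x, y), the reported W is the number of pairs in which the x value exceeds the y value, with tied pairs counting one half. A small check on made-up vectors without ties (a, b, s_toy and w_toy are illustrative names only):

```r
# Toy vectors with no tied values (hypothetical)
a <- c(1, 5, 9)
b <- c(4, 6)

# Pairwise count of a[i] < b[j], as in the answer
s_toy <- sum(outer(a, b, `<`))

# W from wilcox.test(b, a) counts pairs with b[j] > a[i];
# unname() drops the "W" label from the statistic
w_toy <- unname(wilcox.test(b, a)$statistic)
```

With ties present the two numbers can differ, because the pairwise count with `<` ignores tied pairs while W gives each of them a weight of one half.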