Compute weighted transition matrix in R

I found this question interesting: Transition matrix

Following the setup in that question, suppose I add a weight (xt2) to each row:

df <- data.frame(cusip = paste("A", 1:10, sep = ""), xt = c(1,2,3,2,3,5,2,4,5,5), xt1 = c(1,4,2,1,1,4,2,2,2,2), xt2 = 1:10)
df
   cusip xt xt1 xt2
1     A1  1   1   1
2     A2  2   4   2
3     A3  3   2   3
4     A4  2   1   4
5     A5  3   1   5
6     A6  5   4   6
7     A7  2   2   7
8     A8  4   2   8
9     A9  5   2   9
10   A10  5   2  10

Using the answer in that post, we get the transition matrix:

res <- with(df, table(xt, xt1))
res
    xt1
 xt  1 2 4
   1 1 0 0
   2 1 1 1
   3 1 1 0
   4 0 1 0
   5 0 2 1
result <- res/rowSums(res)
result
        xt1
 xt          1         2         4
   1 1.0000000 0.0000000 0.0000000
   2 0.3333333 0.3333333 0.3333333
   3 0.5000000 0.5000000 0.0000000
   4 0.0000000 1.0000000 0.0000000
   5 0.0000000 0.6666667 0.3333333

But what if I want to compute the transition matrix weighted by the xt2 column? That is, when generating res, instead of counting how often each change of state occurs, I want to sum the weights of the rows that make that change. For example, res[2,1] should be 4, and res[5,2] should be 9+10=19. The new res should therefore look like this:

    xt1
 xt   1  2  4
   1  1  0  0
   2  4  7  2
   3  5  3  0
   4  0  8  0
   5  0 19  6

Then result can be calculated with the same code as above. How can I get that res? Thank you.

PS: Or is there another way to "weight" the transition matrix?

We can use xtabs. With the formula method, the cross-classifying variables go on the right-hand side of ~ and the vector of counts (here, the weights) goes on the left-hand side. xtabs sums that vector within each combination of the classifying variables.

xtabs(xt2~xt+xt1, df)
#    xt1
#xt   1  2  4
#  1  1  0  0
#  2  4  7  2
#  3  5  3  0
#  4  0  8  0
#  5  0 19  6
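
To get from this weighted table to the weighted transition matrix the question asks for, each row still has to be divided by its row total. The following is a minimal sketch of that step in base R, assuming the same df as in the question; the names res_w and result_w are made up here for illustration.

```r
# Same data as in the question
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 5),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 2),
                 xt2 = 1:10)

# Weighted "counts": sum of xt2 for every (xt, xt1) combination
res_w <- xtabs(xt2 ~ xt + xt1, data = df)

# Divide every row by its row total (margin = 1 means rows),
# equivalent to res_w / rowSums(res_w)
result_w <- prop.table(res_w, margin = 1)

# Row "5" comes out as 0, 0.76, 0.24 (that is 19/25 and 6/25)
round(result_w, 3)
```

Each row of result_w sums to 1, so it can be read the same way as the unweighted result.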

Or with tapply: we group by 'xt' and 'xt1' and pass sum as FUN. Combinations that do not occur in the data come back as NA, which can be replaced with 0 if necessary.

with(df, tapply(xt2, list(xt, xt1), FUN=sum))
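
The NA replacement mentioned for tapply matters before normalizing, because rowSums would otherwise return NA for any row with a missing combination. A small sketch of one way to handle it, again assuming the question's df; res_t and result_t are illustrative names.

```r
# Same data as in the question
df <- data.frame(cusip = paste("A", 1:10, sep = ""),
                 xt  = c(1, 2, 3, 2, 3, 5, 2, 4, 5, 5),
                 xt1 = c(1, 4, 2, 1, 1, 4, 2, 2, 2, 2),
                 xt2 = 1:10)

# Sum the weights per (xt, xt1) pair; unseen pairs are NA
res_t <- with(df, tapply(xt2, list(xt = xt, xt1 = xt1), FUN = sum))

# Treat combinations that never occur as zero weight
res_t[is.na(res_t)] <- 0

# Row-normalize to obtain the weighted transition matrix
result_t <- res_t / rowSums(res_t)
result_t
```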

Or we can use acast from reshape2. It reshapes the data from 'long' to 'wide' format given a formula and the value.var column, with sum as the aggregation function.

library(reshape2)
acast(df, xt~xt1, value.var='xt2', sum)
