
How to calculate transition probabilities in R

I would like to count how often each change between values happens within a person across years (panel data), mimicking Stata's xttrans command. The step from the last observation of id 1 to the first observation of id 2 should not be included, since it is not a transition within one person.

df = data.frame(id=c(1,1,1,1,1,1,1,2,2,2,2,2,2,2), 
                year=seq(from=2003, to=2009, by=1), 
                health=c(3,1,2,2,5,1,1,1,2,3,2,1,1,2))


Here is a base R solution to calculate transition counts by id groups:

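# Reduce() sums the per-id tables, so this works for any number of ids
# (do.call(`+`, ...) only works when there are exactly two)
with(df, Reduce(`+`, tapply(health, id, function(x) {
  # Shared levels give every per-id table the same dimensions
  x <- factor(x, levels = min(health, na.rm = TRUE):max(health, na.rm = TRUE))
  # Rows: current value; columns: the value that follows it
  table(x[-length(x)], x[-1])
})))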

#    1 2 3 4 5
#  1 2 3 0 0 0
#  2 1 1 1 0 1
#  3 1 1 0 0 0
#  4 0 0 0 0 0
#  5 1 0 0 0 0
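
The table above holds counts, while the question title asks for probabilities. As a minimal sketch (not part of the original answer), the counts can be turned into row-wise proportions with `prop.table()`, so each row shows the share of transitions leaving that state. The names `lev`, `counts` and `probs` are introduced here for illustration only.

```r
# Same example data as in the question
df <- data.frame(id = rep(1:2, each = 7),
                 year = rep(2003:2009, times = 2),
                 health = c(3, 1, 2, 2, 5, 1, 1, 1, 2, 3, 2, 1, 1, 2))

# Common set of states so every per-id table has the same shape
lev <- min(df$health, na.rm = TRUE):max(df$health, na.rm = TRUE)

# Transition counts within each id, summed across ids
counts <- Reduce(`+`, lapply(split(df$health, df$id), function(x) {
  x <- factor(x, levels = lev)
  table(from = x[-length(x)], to = x[-1])
}))

# Row-wise proportions: each row is divided by its own total.
# States that never occur as a starting point (here: 4) have a row
# total of 0, so their row becomes NaN.
probs <- prop.table(counts, margin = 1)
round(probs, 2)
```

For this data, the row for state 1 is 0.4 (stay at 1) and 0.6 (move to 2). Multiplying `probs` by 100 gives row percentages, which is closer to how xttrans presents its output.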

Here is a tidyverse alternative that builds the same table of counts:

library(tidyverse)

# Calculate the last health status for each id
df <- df %>% 
         group_by(id) %>% 
         mutate(lastHealth=lag(health)) %>%  
         ungroup()
# Count number of existing transitions
transitions <- df %>% 
                  group_by(health, lastHealth) %>%  
                  summarise(N=n()) %>% 
                  ungroup()
# Fill in the transition grid to include possible transitions that weren't observed
transitions <- transitions %>% 
                 complete(health=1:5, lastHealth=1:5, fill=list(N=0))
# Present the transitions in the required format
transitions %>% 
  pivot_wider(names_from="health", values_from="N", names_prefix="health") %>%
  filter(!is.na(lastHealth))
