
R line plot for repeated measures with line thickness reflecting number of observations

Question 1: Let's assume I have carried out a "before-after" experiment (repeated measures with two points in time) with 100 subjects. Each subject ticks a score on a numerical scale from 1 to 3 in the 'before' condition (at T1) and again, after some treatment has been applied, in the 'after' condition (at T2). The behavior of each subject can be described as a transition from the score at T1 to the score at T2, e.g. from 2 to 3, from 1 to 1, from 3 to 1, and so on. The Cartesian product of the two score sets gives 3 x 3 = 9 theoretically possible transition types. For each transition type, I calculated (in an external program) the number of observations. This gives the following data frame:

MyData1 <- data.frame(TransitionTypeID=seq(1:9), T1=c(1,1,1,2,2,2,3,3,3), T2=c(1,2,3,1,2,3,1,2,3), Count=c(2,14,0,18,12,8,23,12,11))
MyData1

For each score value (on the y-axis) I would like to plot a point and a line between T1 and T2 (on the x-axis), where the thickness of the line between T1 and T2 corresponds (in some way) to the observed count. The plot should simply visualize which transitions occur more often than others. Any hints?

Question 2: Some pre-calculation steps of the above example were carried out outside of R (in MS Access), but I believe there must be a way to reach the desired result from within R, i.e. starting from a data frame with one record per subject and point in time (two rows per subject, one for the score at T1 and one for T2, hence the 'long' format). In that case the data frame looks something like this:

MyData2 <- data.frame(SubjectID=seq(1:100), Condition = c(rep("T1",100), rep("T2",100)), Score=floor(runif(200,min=1, max=4)))

library(ggplot2)

ggplot(data = MyData2, aes(x = Condition, y = Score, group = SubjectID)) +  geom_line()

I get a nice plot showing the observed transitions, but the individual lines for each subject are simply plotted on top of each other, i.e. the thickness of the lines between T1 and T2 does not reflect the number of observations for each type of transition. Again, hints on how to achieve meaningful line thicknesses would be highly appreciated.

Here is a possible solution for Question 1.

MyData1 <- data.frame(TransitionTypeID=seq(1:9), 
                      T1=c(1,1,1,2,2,2,3,3,3), 
                      T2=c(1,2,3,1,2,3,1,2,3), 
                      Count=c(2,14,0,18,12,8,23,12,11))
MyData1

df <- data.frame(x=rep(c("T1","T2"), each=nrow(MyData1)),
                 y=c(MyData1$T1,MyData1$T2),
                 ID=rep(MyData1$TransitionTypeID,2),
                 cnt=rep(MyData1$Count,2)
)
df_lb <- data.frame(x=rep(c("T1","T2"), each=3),
                     y=rep(1:3,2),
                     hj=rep(c(2,-1),each=3))

library(ggplot2)
pal <- colorRampPalette(c("white","blue","red"))
ggplot(data=df, aes(x=x, y=y, group=ID, size=cnt, color=cnt)) +
  geom_line() +
  geom_point(show.legend=FALSE) +
  labs(x="", y="Score") +
  scale_color_gradientn(colours=pal(10)) +
  geom_text(data=df_lb, aes(x=x, y=y, label=y), size=7, inherit.aes=FALSE, hjust=df_lb$hj) +
  theme_void()


For question #1

You can get to your goal very tidily with geom_segment:

ggplot(data=MyData1,
       aes(x=1, y=T1, xend=2, yend=T2, size=Count)) +
  geom_segment()


I suspect you will need to change the zero counts to NA to make the 1 -> 3 transition go away. I would have expected a size of 0 to make a segment disappear, but apparently Hadley thinks otherwise.

Yep:

is.na(MyData1) <- MyData1 == 0
ggplot(data=MyData1, aes(x=1, y=T1, xend=2, yend=T2, size=Count)) + geom_segment()

The same code as above now delivers the correct plot.
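As an alternative to recoding zeros as NA, the unobserved transitions can simply be dropped before plotting. The sketch below assumes the same MyData1 as in the question; the name `observed` and the axis labels are illustrative choices, not part of the original answer. The zero-count segment shows up in the first place because the default size scale maps the smallest data value to a size of 1 rather than 0. Note also that `is.na(MyData1) <- MyData1 == 0` turns every zero cell in the data frame into NA, not only those in Count; that is harmless here because the other columns contain no zeros.

```r
library(ggplot2)

MyData1 <- data.frame(TransitionTypeID = 1:9,
                      T1 = c(1, 1, 1, 2, 2, 2, 3, 3, 3),
                      T2 = c(1, 2, 3, 1, 2, 3, 1, 2, 3),
                      Count = c(2, 14, 0, 18, 12, 8, 23, 12, 11))

# Keep only the transitions that were actually observed
observed <- subset(MyData1, Count > 0)

# Same geom_segment idea as above; the axis breaks and labels are assumptions
# added for readability. Recent ggplot2 versions prefer the `linewidth`
# aesthetic over `size` for lines and may emit a deprecation warning here.
p <- ggplot(observed, aes(x = 1, y = T1, xend = 2, yend = T2, size = Count)) +
  geom_segment() +
  scale_x_continuous(breaks = 1:2, labels = c("T1", "T2")) +
  scale_y_continuous(breaks = 1:3) +
  labs(x = NULL, y = "Score")
print(p)
```

Filtering leaves MyData1 itself untouched, which may be preferable if the zero counts are still needed elsewhere.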

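For Question 2, the transition counts can also be derived inside R from the long-format data and then plotted the same way. The following is a minimal sketch using base R only for the counting step. It assumes one T1 row and one T2 row per subject; the names `wide` and `trans`, the fixed seed, and the use of `sample()` to simulate scores are illustrative assumptions, not part of the original post.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the simulated scores are reproducible
MyData2 <- data.frame(SubjectID = rep(1:100, times = 2),
                      Condition = rep(c("T1", "T2"), each = 100),
                      Score     = sample(1:3, 200, replace = TRUE))

# One row per subject: pair each subject's T1 score with its T2 score
wide <- merge(subset(MyData2, Condition == "T1", select = c(SubjectID, Score)),
              subset(MyData2, Condition == "T2", select = c(SubjectID, Score)),
              by = "SubjectID", suffixes = c("_T1", "_T2"))

# Count subjects per (T1, T2) transition; unobserved transitions yield no row,
# so no zero-count segments need to be removed afterwards
trans <- aggregate(SubjectID ~ Score_T1 + Score_T2, data = wide, FUN = length)
names(trans) <- c("T1", "T2", "Count")

# Recent ggplot2 versions prefer `linewidth` over `size` for lines
p <- ggplot(trans, aes(x = 1, y = T1, xend = 2, yend = T2, size = Count)) +
  geom_segment() +
  scale_x_continuous(breaks = 1:2, labels = c("T1", "T2")) +
  scale_y_continuous(breaks = 1:3) +
  labs(x = NULL, y = "Score")
print(p)
```

The resulting `trans` has the same shape as MyData1 (minus the ID column), so either of the plotting approaches above can be applied to it unchanged.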
