
R - ggplot multiple regression lines for different columns in same chart

With data like below,

text = "
df1 = read.table(textConnection(text), sep=",", header = TRUE)

I need to plot regression lines for the column pairs below, with the R_... values on the x-axis and the S_... values on the y-axis:

  1. S_1700 vs. R_1700
  2. S_350 vs. R_350
  3. S_2950 vs. R_2950

For a single pair of variables, I could do something like this:

library(ggplot2)

ggplot(df1, aes(x=R_1700, y=S_1700)) +
  geom_point() + 
  geom_smooth(method=lm, se=FALSE, fullrange=TRUE)

I need help getting all three lines into a single plot, one line per group. The three groups would be 1700, 350 and 2950.


If you reorganize your data into a long format like the one below:

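# with the data.table package
library(data.table)
# melt() with patterns() needs a data.table; as.data.table() leaves df1 itself unchanged
df2 <- melt(as.data.table(df1), measure.vars = patterns('R_', 'S_'))
# relabel the group index 1, 2, 3 with the column suffixes (such as 1700)
df2[, variable := factor(variable, levels = 1:3,
    labels = tstrsplit(grep('R_', names(df1), value = TRUE), '_')[[2]])]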
# > df2
#     variable  value1 value2
# 1:     1700      NA     NA
# 2:     1700  -80.00     NA
# 3:     1700  -77.55     NA
# 4:     1700  -75.55     NA
# 5:     1700  -80.80     NA
# 6:     1700  -80.80     NA
# 7:     1700  -80.80     NA
# 8:     1700  -73.80     NA
# 9:     1700  -72.80   3.70

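# without data.table
# split the columns into one data frame per suffix
# (assumes the R_ column comes before the S_ column in each pair)
tmp <- split.default(df1, f = sapply(strsplit(names(df1), '_'), `[`, 2))
tmp <- lapply(tmp, function(dtf){
    names(dtf) <- c('value1', 'value2')
    dtf  # return the renamed data frame
})
df2 <- do.call(rbind, tmp)
df2$variable <- rep(names(tmp), each = nrow(df1))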

then you can easily visualize the data as desired:

ggplot(df2, aes(x = value1, y = value2, color = variable)) +
    geom_point() + 
    geom_smooth(method=lm, se=FALSE, fullrange=TRUE) +
    labs(x = 'R', y = 'S')
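
Since the original data is not shown above, here is a minimal self-contained sketch of the same reshape-then-plot idea on invented numbers. The data frame `toy` and the names `groups`, `long` and `p` are hypothetical stand-ins, not part of the original post. The reshape here pairs columns by name, so it does not depend on the column order.

```r
library(ggplot2)

# Hypothetical stand-in for df1: the values are made up for illustration only.
set.seed(1)
n <- 10
toy <- data.frame(
  R_1700 = rnorm(n, -78, 3),
  R_350  = rnorm(n, -70, 3),
  R_2950 = rnorm(n, -85, 3)
)
toy$S_1700 <-  0.5 * toy$R_1700 + rnorm(n)
toy$S_350  <- -0.3 * toy$R_350  + rnorm(n)
toy$S_2950 <-  0.1 * toy$R_2950 + rnorm(n)

# Group labels are the column suffixes: "1700", "350", "2950".
groups <- unique(sub("^[RS]_", "", names(toy)))

# Stack each R_/S_ pair into long format, one block of rows per group.
long <- do.call(rbind, lapply(groups, function(g) {
  data.frame(
    variable = g,
    value1   = toy[[paste0("R_", g)]],
    value2   = toy[[paste0("S_", g)]]
  )
}))

# One colour, and therefore one fitted line, per group.
p <- ggplot(long, aes(x = value1, y = value2, color = variable)) +
  geom_point() +
  geom_smooth(method = lm, formula = y ~ x, se = FALSE, fullrange = TRUE) +
  labs(x = "R", y = "S")
```

Printing `p` draws the chart. Mapping `color = variable` is what makes `geom_smooth()` fit a separate line for each group.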


tidyverse solution


library(tidyverse)

df1 %>% 
  pivot_longer(everything()) %>% # wide to long data format
  separate(name, c("key", "number"), sep = "_") %>% # split names like R_1700 into two columns
  group_by(number, key) %>% # group the values by number and key
  mutate(row = row_number()) %>% # unique row ID within each group
  pivot_wider(names_from = key, values_from = value) %>% # separate columns for R and S
  ggplot(aes(x = R, y = S, color = number, shape = number)) +
  geom_point() + 
  geom_smooth(method = lm, se = FALSE, fullrange = TRUE)
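
To check the plotted lines numerically, one option is to fit the same per-group linear models directly with `lm()`. The sketch below assumes a long-format data frame with columns `number`, `R` and `S`, as produced by the pipeline above. The data frame `long` and its values are invented for illustration.

```r
# Hypothetical long-format data; the numbers are invented.
# Group "350" follows S = 2 * R + 1, group "1700" follows S = -R + 1.
long <- data.frame(
  number = rep(c("350", "1700"), each = 4),
  R      = c(1, 2, 3, 4, 1, 2, 3, 4),
  S      = c(3, 5, 7, 9, 0, -1, -2, -3)
)

# Fit one linear model per group, the same kind of fit that
# geom_smooth(method = lm) draws for each colour.
fits <- lapply(split(long, long$number), function(d) lm(S ~ R, data = d))

# Collect intercepts and slopes in a matrix: one row per group.
coefs <- t(sapply(fits, coef))
print(coefs)
```

Each row of `coefs` holds the intercept and slope for one group, which can be compared against the lines in the chart.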

