Plot spectroscopic data (a matrix) with ggplot2 in R

I need to plot a spectroscopic data matrix whose rows are grouped by two factor variables, using the ggplot2 (or lattice) package in R, since both have faceting capabilities.

Consider a data frame DS that contains spectroscopic data (a matrix, DS$NIR) from the pls package:

library(pls)
data(gasoline)
DS <- gasoline

Let's add some grouping variables:

set.seed(0)
DS$Type <- as.factor(sample(c("Training set","Validation set","Others"),
                            nrow(DS),
                            replace = TRUE))

DS$Group <- cut(DS$octane,
                breaks = c(80,86,88,90),
                labels = c("Low","Medium","High"))

and look at the data:

str(DS)

'data.frame':   60 obs. of  4 variables:
 $ octane: num  85.3 85.2 88.5 83.4 87.9 ...
 $ NIR   : AsIs [1:60, 1:401] -0.050193 -0.044227 -0.046867 -0.046705 -0.050859 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr  "1" "2" "3" "4" ...
  .. ..$ : chr  "900 nm" "902 nm" "904 nm" "906 nm" ...
 $ Type  : Factor w/ 3 levels "Others","Training set",..: 1 2 3 3 1 2 1 1 3 3 ...
 $ Group : Factor w/ 3 levels "Low","Medium",..: 1 1 3 1 2 1 3 3 3 3 ...

I need to plot every row of DS$NIR as a separate line. X axis values can be extracted by:

x <- as.numeric(gsub(" nm", "", dimnames(DS$NIR)[[2]]))

The plot should meet these requirements:

  1. The line colors should depend on the levels of factor Group.
  2. The lines should be semi-transparent.
  3. Every color group (i.e. every level of factor Group) should have a solid opaque line that indicates the average (or median) of the group.
  4. Every level of factor Type should be plotted in a separate facet.

I found an example of how spectroscopic data is plotted, but it is too difficult for me to understand and adapt to my case.

You have two main problems with your data before plotting.

First, the NIR column is a whole matrix stored inside a single data frame column (class AsIs), which doesn't play nicely with other functions. Let's fix that by spreading it into ordinary columns:

DS <- cbind(DS, as.data.frame(unclass(DS$NIR)))
DS$NIR <- NULL

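Second, the data is now wide rather than long. Let's fix that with some dplyr and tidyr:

library(dplyr)
library(tidyr)
# Add a row id so each spectrum can be drawn as its own line,
# then reshape wide -> long and turn "900 nm" into the number 900
graphdat <- DS %>% mutate(row = row_number()) %>%
                   gather(nm, value, -octane, -Type, -Group, -row) %>%
                   mutate(nm = as.numeric(gsub(" nm", "", nm)))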
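In current tidyr releases gather() is superseded by pivot_longer(). Here is a minimal sketch of the same reshaping with pivot_longer(), assuming tidyr 1.0.0 or later. The data frame is rebuilt from gasoline so the block runs by itself, and selecting the wavelength columns with ends_with(" nm") is my choice rather than part of the original answer:

```r
library(pls)     # provides the gasoline data set
library(dplyr)
library(tidyr)

data(gasoline)
DS <- gasoline
set.seed(0)
DS$Type <- as.factor(sample(c("Training set", "Validation set", "Others"),
                            nrow(DS), replace = TRUE))
DS$Group <- cut(DS$octane, breaks = c(80, 86, 88, 90),
                labels = c("Low", "Medium", "High"))

# Spread the AsIs matrix column into ordinary numeric columns
DS <- cbind(DS, as.data.frame(unclass(DS$NIR)))
DS$NIR <- NULL

graphdat <- DS %>%
  mutate(row = row_number()) %>%
  # every column whose name ends in " nm" holds one wavelength
  pivot_longer(cols = ends_with(" nm"),
               names_to = "nm", values_to = "value") %>%
  # "900 nm" -> 900
  mutate(nm = as.numeric(sub(" nm", "", nm, fixed = TRUE)))
```

The rows come out in a different order than with gather(), which makes no difference for plotting.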

Now it's easy to plot:

library(ggplot2)
ggplot(graphdat, aes(x = nm, y = value, group = row, color = Group)) + 
    geom_line() +
    facet_grid(Type~.)
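The plot above does not yet cover requirements 2 and 3 from the question (semi-transparent lines and an opaque average per group). Here is a sketch of one way to add both, assuming ggplot2 3.4.0 or later (for the linewidth argument) and assuming the average is wanted per wavelength within each Type/Group combination. The alpha and linewidth values are arbitrary, and the long data is rebuilt with base R so the block runs by itself:

```r
library(pls)      # provides the gasoline data set
library(ggplot2)

data(gasoline)
DS <- gasoline
set.seed(0)
DS$Type <- as.factor(sample(c("Training set", "Validation set", "Others"),
                            nrow(DS), replace = TRUE))
DS$Group <- cut(DS$octane, breaks = c(80, 86, 88, 90),
                labels = c("Low", "Medium", "High"))

nir <- unclass(DS$NIR)                             # plain numeric matrix
nm  <- as.numeric(gsub(" nm", "", colnames(nir)))  # "900 nm" -> 900

# Long format: as.vector() walks the matrix column by column, so the
# row-level variables repeat as a block and each wavelength repeats nrow times
graphdat <- data.frame(
  row   = rep(seq_len(nrow(nir)), times = ncol(nir)),
  Type  = rep(DS$Type,  times = ncol(nir)),
  Group = rep(DS$Group, times = ncol(nir)),
  nm    = rep(nm, each = nrow(nir)),
  value = as.vector(nir)
)

p <- ggplot(graphdat, aes(x = nm, y = value, color = Group)) +
  # individual spectra: one semi-transparent line per row
  geom_line(aes(group = row), alpha = 0.3) +
  # mean of each Group at each wavelength, computed separately per facet
  stat_summary(aes(group = Group), fun = mean, geom = "line", linewidth = 1) +
  facet_grid(Type ~ .)
print(p)
```

Swapping fun = mean for fun = median should give the median line instead.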
