How to avoid gaps caused by missing values in matplot in R?

Question

I have a function that uses matplot to plot some data. The data structure looks like this:

test = data.frame(x = 1:10, a = 1:10, b = 11:20)
matplot(test[,-1])
matlines(test[,1], test[,-1])

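So far, so good. However, if the data set contains missing values, the resulting plot has gaps, and I would like to avoid them by connecting the edges of each gap.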

test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1])

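In the real situation this happens inside a function, the matrix is larger, and the number of rows, the number of columns, and the positions of the non-overlapping missing values may change between calls, so I would like a solution that handles this flexibly. I also need to use matlines.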

I was thinking of filling the gaps with interpolated data, but maybe there is a better solution.
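For reference, the interpolation idea mentioned above could be sketched in base R with approx(). The helper name fill_linear is hypothetical and not part of the question; it fills only interior gaps and leaves leading or trailing NAs untouched.

```r
test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA

# Hypothetical helper: linearly interpolate the NAs of one column y over x
fill_linear <- function(x, y) {
  ok <- !is.na(y)
  if (sum(ok) < 2) return(y)          # approx() needs at least two points
  approx(x[ok], y[ok], xout = x)$y    # values outside the observed range stay NA
}

# Apply the helper to every y column; the result is a numeric matrix
filled <- sapply(test[, -1], fill_linear, x = test$x)

matplot(test$x, filled, type = "n")   # set up the axes only
matlines(test$x, filled)              # draw the gap-free lines
```

Because the filled values lie on the straight segment between the two edges of a gap, the drawn line looks the same as one that simply connects the edges, but the matrix now contains values that were never observed.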

Answer 1

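I ran into this exact situation today, but I did not want to interpolate values; I just wanted the lines to "span the gaps", so to speak. I came up with a solution that, in my opinion, is more elegant than interpolating, so I thought I would post it even though the question is old.

The gaps arise because there are NAs between consecutive values. So my solution is to "shift" the column values so that there are no NA gaps. For example, a column consisting of c(1,2,NA,NA,5) becomes c(1,2,5,NA,NA). I do this with a function called shift_vec_na() inside an apply() loop. The x values need to be adjusted as well, so we use the same principle to turn the x values into a matrix, using the columns of the y matrix to determine which values to shift.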
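To make the shifting idea concrete, here is a minimal sketch on the single column from the example above. It uses order(is.na(v)) instead of the answer's shift_vec_na(), purely as an illustration; the variable names are made up for this example.

```r
v <- c(1, 2, NA, NA, 5)   # y values with a gap
x <- seq_along(v)         # matching x positions 1..5

# order() leaves ties in their original order, so this puts the
# non-NA positions first and the NA positions last
keep <- order(is.na(v))

v_shifted <- v[keep]
x_shifted <- x[keep]
x_shifted[is.na(v_shifted)] <- NA   # blank the x values that belong to NAs

print(v_shifted)
print(x_shifted)
```

After the shift, the point at x = 2 is immediately followed by the point at x = 5, so a line drawn through the shifted vectors joins them directly.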

Here is the code for the functions:

# x -> vector
# bool -> boolean vector; must be same length as x. The values of x where bool 
#   is TRUE will be 'shifted' to the front of the vector, and the back of the
#   vector will be all NA (i.e. the number of NAs in the resulting vector is
#   sum(!bool))
# returns the 'shifted' vector (will be the same length as x)
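shift_vec_na <- function(x, bool){
  n <- sum(bool)
  # c() avoids the ranges 1:n and (n + 1):length(x), which count backwards
  # when n is 0 or n equals length(x) (e.g. a column without any NA)
  return(c(x[bool], rep(NA, length(x) - n)))
}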

# x -> vector
# y -> matrix, where nrow(y) == length(x)
# returns a list of two elements ('x' and 'y') that contain the 'adjusted'
# values that can be used with 'matplot()'
adj_data_matplot <- function(x, y){
  y2 <- apply(y, 2, function(col_i){
    return(shift_vec_na(col_i, !is.na(col_i)))
  })
  
  x2 <- apply(y, 2, function(col_i){
    return(shift_vec_na(x, !is.na(col_i)))
  })
  return(list(x = x2, y = y2))
}

Then, using the example data:

test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA
lst <- adj_data_matplot(test[,1], test[,-1])

matplot(lst$x, lst$y, type = "b")
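Since the question asks for matlines specifically, the following self-contained sketch shows that the shifted matrices can be passed to matlines as well. It rebuilds the shifted x and y matrices with order(is.na(...)) rather than with the answer's functions, so the names idx, x2 and y2 are assumptions made for this example.

```r
test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA
x <- test$x
y <- as.matrix(test[, -1])

# Per column: row indices with the non-NA rows first
# (order() leaves ties in their original order)
idx <- apply(y, 2, function(col_i) order(is.na(col_i)))

# Reorder y and x column by column; x is blanked wherever y is NA
y2 <- sapply(seq_len(ncol(y)), function(j) y[idx[, j], j])
x2 <- sapply(seq_len(ncol(y)), function(j) {
  ifelse(is.na(y2[, j]), NA, x[idx[, j]])
})

matplot(x, y, type = "p", pch = 1)   # points at the observed values only
matlines(x2, y2, lty = 1)            # lines that bridge the gaps
```

Both x2 and y2 keep the dimensions of the original y matrix, which is what lets matlines pair them column by column.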

Answer 2

You can use the na.interpolation function from the imputeTS package (newer imputeTS releases name it na_interpolation):

test = data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] = NA
test$b[7] = NA
matplot(test[,-1])
matlines(test[,1], test[,-1])

library('imputeTS')

test <- na.interpolation(test, option = "linear")
matplot(test[,-1])
matlines(test[,1], test[,-1])

Answer 3

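I had the same problem today. In my context, I am not allowed to interpolate. Here is a minimal but sufficiently general working example of what I did. I hope it helps someone: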

mymatplot <- function(data, main=NULL, xlab=NULL, ylab=NULL,...){
    #graphical set up of the window
    plot.new()
    plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
    mtext(text = xlab,side = 1, line = 3)
    mtext(text = ylab,side = 2, line = 3)
    mtext(text = main,side = 3, line = 0)
    axis(1L)
    axis(2L)
    #plot the data
    for(i in 1:nrow(data)){
        nin.na <- !is.na(data[i,])
        lines(x=which(nin.na), y=data[i,nin.na], col = i,...)
    }
}

The core "trick" is in x=which(nin.na). It keeps the data points of each line aligned with their indices on the x-axis.
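As a small illustration of that trick, the sketch below applies it to a single made-up series (row_i is not from the answer) and shows which x positions and y values end up being passed to lines().

```r
row_i <- c(5, NA, NA, 8, 9)       # one series with a gap at positions 2 and 3
nin.na <- !is.na(row_i)           # TRUE where a value is present

x_kept <- which(nin.na)           # positions of the present values
y_kept <- row_i[nin.na]           # the present values themselves

plot(seq_along(row_i), row_i)     # points at their true x positions
lines(x = x_kept, y = y_kept)     # one unbroken line through those points
```

Because the NAs are dropped from x and y together, lines() never sees a missing value and therefore never breaks the line.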
The lines

plot.new()
plot.window(xlim=c(1,ncol(data)), ylim=range(data, na.rm=TRUE))
mtext(text = xlab,side = 1, line = 3)
mtext(text = ylab,side = 2, line = 3)
mtext(text = main,side = 3, line = 0)
axis(1L)
axis(2L)

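set up the graphical part of the window. range(data, na.rm=TRUE) sizes the plot so that it can contain all data points. mtext(...) is used to label the axes and to provide the main title. The axes themselves are drawn by the axis(...) commands.
The for-loop that follows plots the data.
The function head of mymatplot provides the ... argument so that typical plot parameters such as lty, lwd, cex, etc. can optionally be passed through; they are handed on to lines.
Finally, a word on the choice of colors: they are entirely up to your taste.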
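Note that mymatplot draws one line per row, whereas matplot and the question's data treat each column as a series. A column-oriented variant of the same idea might look like the sketch below; the function name matlines_skip_na and its return value are assumptions made for this example, and it adds lines to an existing plot in the way matlines does.

```r
# Hypothetical helper: like matlines(), but each column's line skips its NAs
matlines_skip_na <- function(x, y, col = seq_len(ncol(y)), ...) {
  y <- as.matrix(y)
  col <- rep_len(col, ncol(y))             # recycle colors over the columns
  for (j in seq_len(ncol(y))) {
    ok <- !is.na(y[, j])
    lines(x[ok], y[ok, j], col = col[j], ...)
  }
  invisible(colSums(!is.na(y)))            # number of points joined per column
}

test <- data.frame(x = 1:10, a = 1:10, b = 11:20)
test$a[3:4] <- NA
test$b[7] <- NA

matplot(test$x, as.matrix(test[, -1]), type = "p", pch = 1)
drawn <- matlines_skip_na(test$x, test[, -1], lty = 1, lwd = 2)
```

As in the answer, extra graphical parameters such as lty and lwd travel through ... to lines(), and no values are interpolated or reordered.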

3 answers

Answer 1
1 2021-11-04 19:51:01

Answer 2
0 accepted 2017-08-09 19:12:43

Answer 3
0 2022-04-21 09:51:35
