How to plot all the columns of a data frame in R

The data frame has n columns and I would like to get n plots, one plot for each column.

I'm a newbie and not fluent in R, but I found two solutions.

The first one works, but it does not print the column names (and I need them!):

data <- read.csv("sample.csv",header=T,sep=",")
for ( c in data ) plot( c, type="l" )

The ggplot2 package takes a little bit of learning, but the results look really nice: you get nice legends, plus many other nice features, all without having to write much code.

require(ggplot2)
require(reshape2)
df <- data.frame(time = 1:10,
                 a = cumsum(rnorm(10)),
                 b = cumsum(rnorm(10)),
                 c = cumsum(rnorm(10)))
df <- melt(df ,  id.vars = 'time', variable.name = 'series')

# plot on same grid, each series colored differently -- 
# good if the series have same scale
ggplot(df, aes(time,value)) + geom_line(aes(colour = series))

# or plot on different plots
ggplot(df, aes(time,value)) + geom_line() + facet_grid(series ~ .)


There is a very simple way to plot all columns from a data frame, using either separate panels or the same panel:

plot.ts(data)
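
As a minimal sketch of the two modes, assuming every column is numeric (the `demo` data frame below is made up for illustration): `plot.type = "multiple"` draws one panel per column and labels each panel with the column name, while `plot.type = "single"` overlays all columns on one set of axes. The multi-panel mode is documented to handle at most 10 series.

```r
set.seed(42)
# hypothetical example data: three numeric columns
demo <- data.frame(a = cumsum(rnorm(50)),
                   b = cumsum(rnorm(50)),
                   c = cumsum(rnorm(50)))

# one panel per column, each labelled with its column name
plot.ts(demo, plot.type = "multiple", main = "One panel per column")

# all columns on the same axes, distinguished by colour
plot.ts(demo, plot.type = "single", col = seq_along(demo),
        main = "All columns in one panel")
legend("topleft", legend = names(demo), col = seq_along(demo), lty = 1)
```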

You can jump through hoops and convert your solution to a lapply, sapply or apply call. (I see @jonw shows one way to do this.) Other than that, what you have already is perfectly acceptable code.
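
For instance, one hedged way to write the loop as an `lapply` call is to iterate over the column names rather than the columns themselves, so that each name is available for the plot title. The `sample_data` data frame is a made-up stand-in for the CSV in the question.

```r
# hypothetical stand-in for read.csv("sample.csv")
sample_data <- data.frame(x = c(1, 3, 2, 5), y = c(4, 1, 3, 2))

# iterate over names so each plot can be titled with its column name;
# invisible() suppresses the list of NULLs that lapply returns
invisible(lapply(names(sample_data), function(nm) {
  plot(sample_data[[nm]], type = "l", main = nm, ylab = nm)
}))
```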

If these are all time series or similar, then the following might be a suitable alternative, which plots each series in its own panel on a single plotting region. We use the zoo package as it handles ordered data like this very well indeed.

require(zoo)
set.seed(1)
## example data
dat <- data.frame(X = cumsum(rnorm(100)), Y = cumsum(rnorm(100)),
                  Z = cumsum(rnorm(100)))
## convert to multivariate zoo object
datz <- zoo(dat)
## plot it
plot(datz)

Which gives: [image: example of the zoo plotting function]

I'm surprised that no one mentioned matplot. It's pretty convenient if you don't need to plot each line on separate axes. Just one command:

matplot(y = data, type = 'l', lty = 1)

Use ?matplot to see all the options.

To add a legend, you can set a color palette and then pass it to both calls:

mypalette = rainbow(ncol(data))
matplot(y = data, type = 'l', lty = 1, col = mypalette)
legend(x = "topright", legend = colnames(data), lty = 1, lwd = 2, col = mypalette)

Using some of the tips above (especially thanks to @daroczig for the names(df)[i] form), this function draws a histogram for numeric variables and a bar chart for factor variables. A good start to exploring a data frame:

par(mfrow=c(3,3),mar=c(2,1,1,1)) #my example has 9 columns

dfplot <- function(df)
{
  for (i in seq_along(df)) {
    if (is.factor(df[, i])) {
      # bar chart of level counts for factor columns
      plot(df[, i], main = names(df)[i])
    } else {
      # histogram for everything else (expects numeric columns)
      hist(df[, i], main = names(df)[i])
    }
  }
}

Best wishes, Mat.

With lattice:

library(lattice)

df <- data.frame(time = 1:10,
                 a = cumsum(rnorm(10)),
                 b = cumsum(rnorm(10)),
                 c = cumsum(rnorm(10)))

form <- as.formula(paste(paste(names(df)[- 1],  collapse = ' + '),  
                         'time',  sep = '~'))

xyplot(form,  data = df,  type = 'b',  outer = TRUE)

Unfortunately, ggplot2 does not offer a way to do this (easily) without transforming your data into long format. You can try to fight it, but it will just be easier to do the data transformation. All the methods, including melt from reshape2, and gather and pivot_longer from tidyr, are covered here: Reshaping data.frame from wide to long format

Here's a simple example using pivot_longer:

> df <- data.frame(time = 1:5, a = 1:5, b = 3:7)
> df
  time a b
1    1 1 3
2    2 2 4
3    3 3 5
4    4 4 6
5    5 5 7

> df_wide <- df %>% pivot_longer(c(a, b), names_to = "colname", values_to = "val")
> df_wide
# A tibble: 10 x 3
    time colname   val
   <int> <chr>   <int>
 1     1 a           1
 2     1 b           3
 3     2 a           2
 4     2 b           4
 5     3 a           3
 6     3 b           5
 7     4 a           4
 8     4 b           6
 9     5 a           5
10     5 b           7

As you can see, pivot_longer puts the selected column names into the column specified by names_to (default "name"), and puts the values into the column specified by values_to (default "value"). If I'm OK with the default names, I can use df %>% pivot_longer(c("a", "b")).

Now you can plot as normal, e.g.:

ggplot(df_wide, aes(x = time, y = val, color = colname)) + geom_line()


You could specify the title with the main option (and the axis titles via xlab and ylab). E.g.:

plot(data[,i], main=names(data)[i])

And if you want to plot (and save) each variable of a data frame, you should open png, pdf or any other graphics device you need, and issue a dev.off() command after plotting. E.g.:

data <- read.csv("sample.csv",header=TRUE,sep=",")
for (i in seq_along(data)) {
    # one PDF file per column, named after the column
    pdf(paste('fileprefix_', names(data)[i], '.pdf', sep=''))
    plot(data[,i], ylab=names(data)[i], type="l")
    dev.off()
}
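
If one file per column is more than you need, a variation is to open the device once and let every plot become a page of a single PDF, which is how `pdf()` behaves by default. This is only a sketch with made-up data and a temporary output path, not part of the original answer.

```r
# hypothetical example data
demo <- data.frame(a = c(1, 4, 2, 6), b = c(5, 3, 4, 1))

out_file <- tempfile(fileext = ".pdf")  # placeholder output path
pdf(out_file)                           # open the device once
for (nm in names(demo)) {
  # each call to plot() starts a new page in the same file
  plot(demo[[nm]], type = "l", main = nm, ylab = nm)
}
dev.off()                               # close the device to finish the file
```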

Or draw all plots to the same image with the mfrow parameter of par(). E.g.: use par(mfrow=c(2,2)) to include the next 4 plots in the same "image".
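
A sketch of that approach, assuming made-up data: the grid size is derived from the number of columns so that every column gets a cell, and the previous graphics settings are restored afterwards.

```r
# hypothetical example data with five numeric columns
set.seed(1)
demo <- as.data.frame(matrix(rnorm(100), ncol = 5))

n      <- ncol(demo)
n_cols <- ceiling(sqrt(n))       # grid width
n_rows <- ceiling(n / n_cols)    # enough rows to fit every column

old_par <- par(mfrow = c(n_rows, n_cols), mar = c(3, 3, 2, 1))
for (nm in names(demo)) {
  plot(demo[[nm]], type = "l", main = nm)
}
par(old_par)                     # restore the previous layout
```

With five columns this yields a 2 x 3 grid, leaving one cell empty.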

I don't have R on this computer, but here is a crack at it. You can use par to display multiple plots in a window, or use it like this to prompt for a click before displaying the next page.

plotfun <- function(col) 
  plot(data[ , col], ylab = names(data[col]), type = "l")
par(ask = TRUE)
sapply(seq(1, length(data), 1), plotfun)

In case the column names in the .csv file are not valid R names, read the header row separately and use it for the labels:

data <- read.csv("sample.csv",sep=";",head=TRUE)
data2 <- read.csv("sample.csv",sep=";",head=FALSE,nrows=1)

for ( i in seq(1,length( data ),1) ) plot(data[,i],ylab=data2[1,i],type="l")
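
An alternative worth trying, sketched here under the assumption that the file has a header row: `read.csv` accepts `check.names = FALSE`, which keeps the header text exactly as written instead of converting it to syntactically valid names, so a second read is not needed. The temporary file and its column names are invented for the example.

```r
# write a small semicolon-separated file with awkward column names
csv_file <- tempfile(fileext = ".csv")
writeLines(c("max temp (C);2nd value", "1;5", "3;4", "2;6"), csv_file)

# check.names = FALSE keeps the headers as they appear in the file
raw <- read.csv(csv_file, sep = ";", header = TRUE, check.names = FALSE)

for (i in seq_along(raw)) {
  plot(raw[[i]], ylab = names(raw)[i], type = "l")
}
```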

This link helped me a lot with the same problem:

p = ggplot() + 
  geom_line(data = df_plot, aes(x = idx, y = col1), color = "blue") +
  geom_line(data = df_plot, aes(x = idx, y = col2), color = "red") 

print(p)

https://rpubs.com/euclid/343644
