One plot, multiple time series, from CSV files with ggplot2

Question

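I am very new to R and cannot figure out how to do this, although some similar but not quite identical questions have come up. I have several (~10) CSV files that look like this: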

time, value
0, 5
100, 4
200, 8
etc.

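That is, each file records times and values over a long period. I would like to plot them all on one graph in R using ggplot2, so that it looks something like this [link not preserved]. I have been trying various kinds of melting and merging, so far without success (although read.csv works fine and I can easily plot the files one by one). One thing I cannot figure out is whether to combine all the data before it reaches ggplot2, or somehow pass each data set to ggplot2 separately.

I should note that every data series shares exactly the same time points. By this I mean that if file 1 has values at times 100, 200, 300, ..., 1000, then so do all the other files. Ideally, though, I would like a solution that does not depend on this, because I can foresee cases where the times are on a similar scale but not identical, e.g. file 1 has times 99, 202, 302, 399, ... and file 2 has 101, 201, 398, 400, ...

Many thanks.

Edit: I can do this (clumsily) with the regular plot function like so, which may illustrate what I am trying to do: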

f1 = read.csv("file1.txt")
f2 = read.csv("file2.txt")
f3 = read.csv("file3.txt")
plot(f1$time,f1$value,type="l",col="red")
lines(f2$time, f2$value, type="l",col="blue" )
lines(f3$time, f3$value, type="l",col="green" )

Answer 1

I would break this into 4 tasks. That may also help when searching for an answer to each one.

1. Reading a few files automatically, without hardcoding the file names
2. Merging these data.frames, using a "left join"
3. Reshaping the data for ggplot2
4. Plotting a line graph


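# Define a "base" data.frame
# Adjust the range so it covers every time value in your files
# (the sample data in the question starts at 0, not 1)
max_time = 600
base_df <- data.frame(time=seq(1, max_time, 1))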

# Get the file names (anchored regex: only names ending in ".csv")
all_files = list.files(pattern='\\.csv$')

# This reads the csv files, check if you need to make changes in read.csv
all_data <- lapply(all_files, read.csv)

# This joins the files, using the "base" data.frame
# (stored as merged_values rather than ls, to avoid masking base::ls)
merged_values = do.call(cbind, lapply(all_data, function(y){
  df = merge(base_df, y, all.x=TRUE, by="time")
  df[,-1]
}))

# This would have the data in "wide" format
data = data.frame(time=base_df$time, merged_values)

# The plot
library(ggplot2)
library(reshape2)

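# na.rm=TRUE drops the NA rows created by the left join;
# otherwise geom_line() breaks the line at every missing value
mdf = melt(data, id.vars='time', na.rm=TRUE)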
ggplot(mdf, aes(time, value, color=variable, group=variable)) +
  geom_line() +
  theme_bw()
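
If the files do not share a time grid, a fixed `base_df` can silently drop rows whose times are not in `seq(1, max_time, 1)`. A minimal sketch of one alternative, assuming every file has a `time` and a `value` column: rename each `value` column, then fold the data frames together with a full outer join. The in-memory data frames and the `series_` names below are made up for illustration and stand in for the `read.csv` results.

```r
library(ggplot2)
library(reshape2)

# Hypothetical stand-ins for the data frames returned by read.csv
all_data <- list(
  data.frame(time = c(0, 100, 200), value = c(5, 4, 8)),
  data.frame(time = c(0, 101, 200), value = c(1, 2, 3))
)

# Give each value column a unique name so merge() does not clash
named <- lapply(seq_along(all_data), function(i) {
  setNames(all_data[[i]], c("time", paste0("series_", i)))
})

# Full outer join on time: every time point from every file is kept
wide <- Reduce(function(x, y) merge(x, y, by = "time", all = TRUE), named)

# Long format; na.rm = TRUE drops the NA rows so lines are not broken
long <- melt(wide, id.vars = "time", na.rm = TRUE)

p <- ggplot(long, aes(time, value, color = variable)) +
  geom_line() +
  theme_bw()
p
```

The join is only needed if you want the wide table for something else; for plotting alone, stacking the files row-wise gives the same picture with less work.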

Answer 2

# Creating fake data
fNames <- c("file1.txt", "file2.txt", "file3.txt")

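# row.names=FALSE keeps the files to just the time and value columns
write.csv(data.frame(time=c(1, 2, 4), value=runif(3)), file=fNames[1], row.names=FALSE)
write.csv(data.frame(time=c(3, 4), value=runif(2)), file=fNames[2], row.names=FALSE)
write.csv(data.frame(time=c(5), value=runif(1)), file=fNames[3], row.names=FALSE)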

Here is my attempt:

fNames <- c("file1.txt", "file2.txt", "file3.txt")

allData <- do.call(rbind, # Read the data and combine into single data frame
               lapply(fNames,
                      function(f){
                        cbind(file=f, read.csv(f))
                      }))
require(ggplot2)
ggplot(allData)+
  geom_line(aes(x=time, y=value, colour=file)) # This way all series have a legend!
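
Because the data frames are stacked row-wise rather than joined on `time`, this approach should not care whether the series share time points. A small sketch of that case, using in-memory data frames in place of files; the list `series` and its values are invented for illustration, with the times taken from the example in the question.

```r
library(ggplot2)

# Hypothetical data frames standing in for read.csv(f) results;
# note that the time points differ between the two series
series <- list(
  file1 = data.frame(time = c(99, 202, 302, 399), value = c(5, 4, 8, 6)),
  file2 = data.frame(time = c(101, 201, 398, 400), value = c(3, 7, 2, 9))
)

# Tag each data frame with its name, then stack them row-wise
allData <- do.call(rbind, lapply(names(series), function(n) {
  cbind(file = n, series[[n]])
}))

p <- ggplot(allData, aes(x = time, y = value, colour = file)) +
  geom_line()
p
```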

Answer 3

There are four ways to do this.

First

You can merge all the data into one data frame and then plot each line separately. Here is the code, using example data:

library(ggplot2)
library(reshape2)
data1 <- data.frame(time=1:200, series1=rnorm(200))
data2 <- data.frame(time=1:200, series2=rnorm(200))

mergeData <- merge(data1, data2, by="time", all=TRUE)

# Set fixed colours outside aes(); inside aes() the string "blue" is
# treated as a category label, not as the colour blue
g1 <- ggplot(mergeData, aes(time, series1)) + geom_line(color="blue") + ylab("")
g2 <- g1 + geom_line(aes(y=series2), color="red")
g2

Second

You can melt the merged data and then plot with a single ggplot call. Here is the code:

library(reshape2)
meltData <- melt(mergeData, id="time")
ggplot(meltData, aes(time, value, color=variable)) + geom_line()
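
reshape2 is a retired package, so it may not be installed everywhere. A rough base-R sketch of the same wide-to-long step, using `stack()`; the tiny `mergeData` here is invented for illustration and only mirrors the column names used above.

```r
# Hypothetical wide data with the same column names as above
mergeData <- data.frame(time = 1:3,
                        series1 = c(1, 2, 3),
                        series2 = c(4, 5, 6))

# stack() puts the series columns end to end and records the source column
stacked <- stack(mergeData[, c("series1", "series2")])

# Rebuild the long layout that melt() would produce
meltData <- data.frame(time = rep(mergeData$time, times = 2),
                       variable = stacked$ind,
                       value = stacked$values)
meltData
```

The result has the `time`, `variable` and `value` columns that the `ggplot()` call above expects.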

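Third

This is similar to your edit. The variable names should be the same in each data set, so that the second layer can inherit the aesthetics from the first.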

library(ggplot2)
data1 <- data.frame(time=1:200, series1=rnorm(200))
data2 <- data.frame(time=1:200, series1=rnorm(200))

# Fixed colours go outside aes(); the second layer inherits x and y
g1 <- ggplot(data1, aes(time, series1)) + geom_line(color="blue") + ylab("")
g2 <- g1 + geom_line(data=data2, color="red")
g2

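Fourth

This is the most general way to do the task, with the fewest assumptions. It does not assume that the variable names are the same in each data set, but it makes you write more code (and a wrong variable name in the code will raise an error).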

library(ggplot2)

data1 <- data.frame(id=1:200, series1=rnorm(200))
data2 <- data.frame(id=1:200, series2=rnorm(200))

# Each layer names its own data and columns; fixed colours go outside aes()
g1 <- ggplot() + geom_line(data=data1, aes(x=id, y=series1), color="red") +
       geom_line(data=data2, aes(x=id, y=series2), color="blue")
g1
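
With around ten files, writing one `geom_line()` per data set by hand gets tedious. A sketch of building the layers in a loop instead. It assumes, purely for illustration, a named list `datasets` in which each data frame has its x column first and its y column second; those names and that layout are not from the original answer.

```r
library(ggplot2)

# Hypothetical named list of data sets with differing column names
datasets <- list(
  a = data.frame(id = 1:200, series1 = rnorm(200)),
  b = data.frame(id = 1:200, series2 = rnorm(200))
)

g <- ggplot()
for (n in names(datasets)) {
  d <- datasets[[n]]
  # Standardise the column names so every layer can use the same aesthetics
  layer_df <- data.frame(x = d[[1]], y = d[[2]], series = n)
  g <- g + geom_line(data = layer_df, aes(x = x, y = y, color = series))
}
g
```

Mapping `series` inside `aes()` lets ggplot2 pick the colours and draw a legend keyed by the list names.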

3 answers

Answer 1: 3, 2016-02-11 04:45:08
Answer 2: 2, accepted, 2016-02-11 07:12:40
Answer 3: 0, 2016-02-11 04:28:13
