I am trying to plot the following data frame, where there are 3 different time series (identified by user0, user1, and user2). Each row has a user identifier, the date, and a value.
> df
userId date steps
1 user0 2016-03-24 794
2 user0 2016-03-25 562
3 user0 2016-03-26 682
4 user0 2016-03-27 722
5 user0 2016-03-28 883
6 user1 2016-03-24 3642
7 user1 2016-03-25 3776
8 user1 2016-03-26 3585
9 user1 2016-03-27 3585
10 user1 2016-03-28 3471
11 user2 2016-03-24 5959
12 user2 2016-03-25 5933
13 user2 2016-03-26 5802
14 user2 2016-03-27 6094
15 user2 2016-03-28 5903
> dput(df)
structure(list(userId = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L), .Label = c("user0", "user1",
"user2"), class = "factor"), date = structure(c(16884, 16885,
16886, 16887, 16888, 16884, 16885, 16886, 16887, 16888, 16884,
16885, 16886, 16887, 16888), class = "Date"), steps = c(794L,
562L, 682L, 722L, 883L, 3642L, 3776L, 3585L, 3585L, 3471L, 5959L,
5933L, 5802L, 6094L, 5903L)), .Names = c("userId", "date", "steps"
), row.names = c(NA, -15L), class = "data.frame")
I would like to plot all the time series (however many there are, as identified by the userId field) using different colors and with the date as the x-axis. I tried the following, but as you can see, the dates are repeated on the x-axis.
plot(df$steps, axes=F, xlab="", ylab="Steps", ylim=c(0,max(df$steps)))
axis(2)
axis(1, at = seq_along(df$date), labels = df$date, las = 2, cex.axis = 0.70)
box()
I looked at other postings, such as "Plot multiple lines (data series) each with unique color in R" and "Plotting multiple time series on the same plot using ggplot()", but they do not have my problem of the time variable being mixed in with the other data.
A solution using colored lines, both with and without ggplot, would be greatly appreciated.
Here is a base R version:
plot(0, 0, type = "n", xlim = range(df$date), ylim = c(0, max(df$steps)), axes = FALSE, xlab = "", ylab = "steps")
axis(2, las = 1)
# label each date once; df$date repeats every date once per user
axis(1, at = unique(df$date), labels = unique(df$date), las = 2, cex.axis = 0.70)
box()
cols <- c("red", "green", "blue")  # one color per user; extend this for more users
users <- unique(df$userId)
for (i in seq_along(users))
with(df[df$userId == users[i], ], lines(date, steps, col = cols[i]))
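The loop above hard-codes three colors. Since the question asks for "however many" users there are, here is a rough sketch of how the same idea might be generalized. It rebuilds the question's data so that it runs on its own; the names `users` and `cols`, the `rainbow()` palette, and the extra 15% of headroom for the legend are my own choices, not part of the answer above.

```r
# Rebuild the data frame from the question so the snippet is self-contained.
df <- data.frame(
  userId = factor(rep(c("user0", "user1", "user2"), each = 5)),
  date   = rep(seq(as.Date("2016-03-24"), by = "day", length.out = 5), times = 3),
  steps  = c(794L, 562L, 682L, 722L, 883L,
             3642L, 3776L, 3585L, 3585L, 3471L,
             5959L, 5933L, 5802L, 6094L, 5903L)
)

# One color per user, however many users there are (palette is an arbitrary choice).
users <- levels(droplevels(df$userId))
cols  <- setNames(rainbow(length(users)), users)

# Empty plot; leave some room above the data for a horizontal legend.
plot(range(df$date), c(0, max(df$steps) * 1.15), type = "n",
     xaxt = "n", xlab = "", ylab = "steps", las = 1)

# One tick per distinct date, so nothing is repeated on the x-axis.
axis.Date(1, at = unique(df$date), format = "%Y-%m-%d", las = 2, cex.axis = 0.7)

# Draw one line per user, sorted by date in case the rows are not in order.
for (u in users) {
  d <- df[df$userId == u, ]
  d <- d[order(d$date), ]
  lines(d$date, d$steps, col = cols[[u]])
}

legend("top", legend = users, col = cols, lty = 1, horiz = TRUE, bty = "n")
```

With many users `rainbow()` colors become hard to tell apart, so a different palette or line types may be worth trying.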
With ggplot:
library(ggplot2)
ggplot(df, aes(x = date, y = steps, colour = userId)) + geom_line()
An equivalent (but still pretty ugly) base R version takes a lot more work:
plot(0, type = 'n', axes = FALSE, xlab = 'date', ylab = 'steps',
xlim = c(min(df$date), max(df$date)),
ylim = c(min(df$steps) - 100, max(df$steps) + 100))
axis.Date(1, df$date, format = '%F') # `axis.Date` is helpful here
axis(2, seq(0, max(df$steps + 500), 500))
box()
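# the color index is taken from the digit in the userId ("user0" -> 1, "user1" -> 2, ...),
# which matches col = 1:3 in the legend; invisible() suppresses the printed list of NULLs
invisible(lapply(split(df, df$userId), function(x){lines(x$date, x$steps,
col = as.numeric(substr(x$userId, 5, 5)) + 1)}))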
# `paste` extra space to align legend correctly...*sigh*
legend('bottomright', paste(levels(df$userId), ' '), col = 1:3, lty = 1)
Note that it needs a good bit of fine-tuning.
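If reshaping the data is acceptable, another base R route is to spread the long data frame into one column per user and pass the result to `matplot()`. This is only a sketch: it assumes each user has one value per date, and the names `wide` and `dates` are made up for illustration.

```r
# Rebuild the data frame from the question so the snippet is self-contained.
df <- data.frame(
  userId = factor(rep(c("user0", "user1", "user2"), each = 5)),
  date   = rep(seq(as.Date("2016-03-24"), by = "day", length.out = 5), times = 3),
  steps  = c(794L, 562L, 682L, 722L, 883L,
             3642L, 3776L, 3585L, 3585L, 3471L,
             5959L, 5933L, 5802L, 6094L, 5903L)
)

# Long -> wide: one row per date, one column per user.
# ISO-formatted date strings sort chronologically, so the rows come out in date order.
wide  <- tapply(df$steps, list(as.character(df$date), df$userId), sum)
dates <- as.Date(rownames(wide))

# One line per column; draw the date axis by hand so each date is labelled once.
matplot(as.numeric(dates), wide, type = "l", lty = 1, col = seq_len(ncol(wide)),
        xaxt = "n", xlab = "", ylab = "steps")
axis.Date(1, at = dates, format = "%Y-%m-%d", las = 2, cex.axis = 0.7)
legend("right", legend = colnames(wide), col = seq_len(ncol(wide)), lty = 1, bty = "n")
```

A date that is missing for some user shows up as `NA` in `wide`, which leaves a gap in that user's line.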