Plot rates with drop-down lines in R

I haven't used Stack Overflow in a very long time, so please excuse me if this question is poorly laid out.

I have 31 x-values, which are the time stamps 1:45 PM to 2:15 PM inclusive, as well as the different rates that occurred at those time stamps.

time    date.1  date.2
1:45    1.0063  1.005
1:46    1.0067  1.00576
1:47    1.0059  1.00559
1:48    1.00559 1.00532
1:49    1.0062  1.00599
1:50    1.0063  1.00622
1:51    1.005   1.00622
1:52    1.00576 1.00612
1:53    1.0066  1.00611
1:54    1.00532 1.00605
1:55    1.00599 1.00559
1:56    1.00622 1.0062
1:57    1.00612 1.00567
1:58    1.00611 1.00578
1:59    1.00605 1.00589
2:00    1.00559 1.00599
2:01    1.0062  1.00611
2:02    1.00567 1.00612
2:03    1.00578 1.00603
2:04    1.00589 1.00599
2:05    1.00599 1.00598
2:06    1.0062  1.00652
2:07    1.0063  1.00642
2:08    1.005   1.00641
2:09    1.00602 1.00635
2:10    1.00603 1.00589
2:11    1.00611 1.0065
2:12    1.00612 1.00597
2:13    1.00603 1.00608
2:14    1.00599 1.00619
2:15    1.00598 1.00629

I essentially want to plot all these values on the same chart and have drop-down lines to the x-axis for each point. The reason is that I want to see visually where the rates sit most of the time at each individual time stamp.

Right now, I am using:

plot(x,y1,type = "h")
par(new=TRUE)
plot(x,y2, type = "h")

This doesn't work if I allow x to be something like:

x <- c("1:45", "1:46", "1:47", "1:48", "1:49", "1:50", "1:51", "1:52", "1:53", "1:54", "1:55", "1:56", "1:57", "1:58", "1:59", "2:00", "2:01", "2:02", "2:03", "2:04", "2:05", "2:06", "2:07", "2:08", "2:09", "2:10", "2:11", "2:12", "2:13", "2:14", "2:15")

I think this isn't working because R doesn't recognize these strings as numeric values, so it can't set an xlim on the plot?

I am also going to have several y series (probably 15+), which I plan to store in Excel. (I've just been copying from my clipboard for now.)

read.table("clipboard")

I would greatly appreciate anyone's help in showing me how to get the drop-down lines to work and how to make this a visually appealing plot!

EDIT:

Thank you both for your answers. I tried both of your suggestions and they both work perfectly. I understand why it might be a better idea to just look at the points rather than at drop-down lines, but I am still curious what a chart with drop-down lines would look like. I'd like them to be somewhat transparent, as was suggested. This is the code I think I might go with, simply because the for() loop will make things easier if there are a lot of variables.

library(scales)  # provides alpha(), used below for transparent points

setwd("/Users/null/Desktop")
data <- read.table("Workbook2.csv", header=TRUE, sep=",")

#this will fix the issue of trying to use strings
data$x <- seq_along(data$time)

#Now plot them. Start by making an empty plot space that covers the full range
# of y values in your data set; then make a better x-axis; then plot the points, using
# alpha() from 'scales' to make the points transparent so overlapping ones show up.
with(data, plot(x = x, y = date.1, type = "n", xaxt="n", pch=20, bg=alpha("black", 0.5), col=alpha("black", 0.5),
          ylab = "Rate", xlab = "Time", ylim = c(min(data[,2:3]), max(data[,2:3]))))
axis(1, at=seq(length(data$x)), labels=data$time, tick=FALSE, las=2)
for (i in 1:2) points(x = data$x, y = data[,paste0("date.", i)], pch=20, col=alpha("black", 0.5))

Perhaps I should also consider adding a line across all the data that shows the average rate across the variables at each time stamp?
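A rough sketch of what such an average line might look like, assuming the rate columns all share the `date.` prefix as in the table above. The four-row data frame is only a stand-in for the CSV, and `rate_cols` and `avg` are names made up for illustration; `adjustcolor()` from base R is used here in place of `alpha()` so that no extra package is needed.

```r
# Hypothetical stand-in for the CSV: the first rows of the data shown above
data <- data.frame(
  time = c("1:45", "1:46", "1:47", "1:48"),
  date.1 = c(1.0063, 1.0067, 1.0059, 1.00559),
  date.2 = c(1.005, 1.00576, 1.00559, 1.00532)
)
data$x <- seq_along(data$time)

# Select every rate column by its "date." prefix, however many there are
rate_cols <- grep("^date\\.", names(data), value = TRUE)

# Mean rate across all series at each time stamp
data$avg <- rowMeans(data[, rate_cols])

# Empty plot covering the full range of rates, with time labels on the x-axis
plot(data$x, data$avg, type = "n", xaxt = "n", xlab = "Time", ylab = "Rate",
     ylim = range(data[, rate_cols]))
axis(1, at = data$x, labels = data$time, tick = FALSE, las = 2)

# Semi-transparent points for each series
for (nm in rate_cols) {
  points(data$x, data[[nm]], pch = 20, col = adjustcolor("black", alpha.f = 0.5))
}

# Overlay the per-time-stamp mean as a line
lines(data$x, data$avg, col = "red", lwd = 2)
```

Selecting columns by prefix means the same code should keep working as more `date.N` columns are added.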

You're trying to plot numbers against strings, and that's why it doesn't work.

Suppose your data frame is called dz, then:

dz$time<-as.POSIXct(dz$time,format="%H:%M")

will convert your strings to actual times, from where you can plot:

library(ggplot2)
# Map color to a series label inside aes() so the legend names each series
ggplot(dz, aes(time)) + geom_point(aes(y = y1, color = "y1")) +
  geom_point(aes(y = y2, color = "y2"))
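Once the times are real time values, the drop-down lines from the question can also be sketched in base R. The example below is a minimal sketch, not the only way to do it: the four-row `dz` is a hypothetical stand-in built from the first rows of the question's data, and `yr` is a helper name introduced here. It uses `lines()` to add the second series instead of `par(new=TRUE)`, which would draw a second set of axes with its own scale.

```r
# Hypothetical stand-in for dz, using the first few rows of the question's data
dz <- data.frame(
  time = c("1:45", "1:46", "1:47", "1:48"),
  y1 = c(1.0063, 1.0067, 1.0059, 1.00559),
  y2 = c(1.005, 1.00576, 1.00559, 1.00532)
)
dz$time <- as.POSIXct(dz$time, format = "%H:%M")

# Shared y-range so both series fit on one set of axes
yr <- range(dz$y1, dz$y2)

# type = "h" draws a vertical line from each point toward y = 0; the line is
# clipped at the bottom of the plot region, so it appears to drop to the x-axis
plot(dz$time, dz$y1, type = "h", ylim = yr, xlab = "Time", ylab = "Rate",
     col = adjustcolor("blue", alpha.f = 0.5))

# lines() adds to the existing plot and reuses its axes
lines(dz$time, dz$y2, type = "h", col = adjustcolor("red", alpha.f = 0.5))

# Mark the actual values on top of the drop-down lines
points(dz$time, dz$y1, pch = 20, col = "blue")
points(dz$time, dz$y2, pch = 20, col = "red")
```

`adjustcolor()` is part of base R (grDevices), so the transparency here needs no extra package.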

If you want to visually determine the times at which the y1 and y2 variables changed drastically, then using geom_line instead of geom_point could be more appropriate.
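The geom_line variant could look roughly like the sketch below. It assumes the two series are first stacked into a long data frame; `long`, `series`, and `rate` are names introduced here for illustration, and the four-row `dz` is again a hypothetical stand-in for the real data.

```r
library(ggplot2)

# Hypothetical stand-in for dz, using the first few rows of the question's data
dz <- data.frame(
  time = c("1:45", "1:46", "1:47", "1:48"),
  y1 = c(1.0063, 1.0067, 1.0059, 1.00559),
  y2 = c(1.005, 1.00576, 1.00559, 1.00532)
)
dz$time <- as.POSIXct(dz$time, format = "%H:%M")

# Stack the two series into long format: one row per (time, series) pair
long <- data.frame(
  time = c(dz$time, dz$time),
  series = rep(c("y1", "y2"), each = nrow(dz)),
  rate = c(dz$y1, dz$y2)
)

# One line per series; the points mark the individual observations
p <- ggplot(long, aes(x = time, y = rate, color = series)) +
  geom_line() +
  geom_point()
print(p)
```

With the data in long format, adding more series only means adding rows, not more geom calls.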

If you want to see how rates cluster (or not) at each time slice, I think you'd do better to use points rather than vertical lines, ideally with transparent colors so you can see where they overlap. Here's one way to do that in base R, using a toy version of your data set with five y series:

library(scales)  # provides alpha(), used below for transparent points

# Make a toy data set with five y series
set.seed(1)
df <- data.frame(time = c(paste(rep(1, 15), seq(45, 59, 1), sep=":"), paste0(rep(2, 10), ":0",  seq(0, 9)), paste(rep(2, 6), seq(10, 15), sep=":")),
    y1 = rnorm(31, 1, 0.1), y2 = rnorm(31, 1, 0.1), y3 = rnorm(31, 1, 0.1), y4 = rnorm(31, 1, 0.1), y5 = rnorm(31, 1, 0.1), stringsAsFactors=FALSE)

# Make a sequence along your time series to use as your x values; as the previous
# answer said, you can't use raw strings as x values for plotting.
df$x <- seq_along(df$time)

# Now plot them. Start by making an empty plot space that covers the full range
# of y values in your data set; then make a better x-axis; then plot the points, using
# alpha() from 'scales' to make the points transparent so overlapping ones show up.
with(df, plot(x = x, y = y1, type = "n", xaxt="n", pch=20, bg=alpha("black", 0.5), col=alpha("black", 0.5),
    ylab = "y value", xlab = "time", ylim = c(min(df[,2:6]), max(df[,2:6]))))
axis(1, at=seq(length(df$x)), labels=df$time, tick=FALSE, las=2)
for (i in 1:5) points(x = df$x, y = df[,paste0("y", i)], pch=20, col=alpha("black", 0.5))

Here's what I get with that. It will be more effective with data that aren't randomly generated, but I think you get the idea: [image]

And if you want vertical lines to make it easier to tie the dots to the time stamps, add something like abline(v = df$x, col = "gray85", lwd=0.5) before you plot the points. Here's what that produces: [image]
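Putting the pieces in order, a minimal sketch might look like the following. It uses a smaller hypothetical toy data set (three series, six time stamps) and base R's `adjustcolor()` in place of `alpha()`; `ycols`, `yr`, and `bottom` are names introduced here. It also shows one possible way to get transparent drop-down lines with `segments()`, as an alternative to full-height guide lines.

```r
# Small hypothetical data set: three y series over six time stamps
set.seed(1)
df <- data.frame(time = paste(1, 45:50, sep = ":"),
                 y1 = rnorm(6, 1, 0.1), y2 = rnorm(6, 1, 0.1), y3 = rnorm(6, 1, 0.1))
df$x <- seq_along(df$time)
ycols <- c("y1", "y2", "y3")
yr <- range(df[, ycols])

# Empty plot space with a labelled time axis
plot(df$x, df$y1, type = "n", xaxt = "n", ylim = yr, xlab = "time", ylab = "y value")
axis(1, at = df$x, labels = df$time, tick = FALSE, las = 2)

# Light guide lines first, so the points are drawn on top of them
abline(v = df$x, col = "gray85", lwd = 0.5)

# Alternative to full-height guides: drop-down segments from each point to the
# bottom of the plot region (par("usr")[3] is the lower y limit in user coordinates)
bottom <- par("usr")[3]
for (nm in ycols) {
  segments(x0 = df$x, y0 = bottom, x1 = df$x, y1 = df[[nm]],
           col = adjustcolor("black", alpha.f = 0.2))
  points(df$x, df[[nm]], pch = 20, col = adjustcolor("black", alpha.f = 0.5))
}
```

Because the segments are semi-transparent, stretches where several series overlap come out darker, which gives a rough sense of where the values concentrate.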
