Plot multiple lines (data series) each with unique color in R
I am fairly new to R and I have the following questions:
I am trying to generate a plot in R which has multiple lines (data series). Each of these lines is a category and I want it to have a unique color.
Currently my code is set up this way:
First, I create an empty plot:
plot(1,type='n',xlim=c(1,10),ylim=c(0,max_y),xlab='ID', ylab='Frequency')
Then, for each of my categories, I plot lines in this empty plot using a "for" loop like so:
for (category in categories){
lines(data.frame.for.this.category, type='o', col=sample(rainbow(10)), lwd=2)
}
There are 8 categories here, so 8 lines are produced in the plot. As you can see, I am trying to sample a color from the rainbow() function to generate a color for each line.
However, when the plot is generated, I find that multiple lines have the same color. For instance, 3 of those 8 lines are green.
How do I make each of these 8 lines have a unique color?
Also, how do I reflect this uniqueness in the legend of the plot? I tried looking up the legend() function, but it was not clear which parameter I should use to show the unique color for each category.
Any help or suggestions would be much appreciated.
If your data is in wide format, matplot is made for this and is often forgotten about:
dat <- matrix(runif(40,1,20),ncol=4) # make data
matplot(dat, type = c("b"),pch=1,col = 1:4) #plot
legend("topleft", legend = 1:4, col=1:4, pch=1) # optional legend
There is also an added bonus for those unfamiliar with things like ggplot: most of the plotting parameters, such as pch, are the same in matplot() as in plot().
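As a rough sketch of the same idea with named columns, the example below assumes a small made-up data frame (wide, with columns catA to catC) in which each column is one category. Keeping the colors in a single vector and passing that vector to both matplot() and legend() is what keeps the legend in step with the lines.

```r
# Hypothetical wide-format data: one column per category, one row per x value.
set.seed(1)
wide <- data.frame(
  catA = cumsum(runif(10)),
  catB = cumsum(runif(10)),
  catC = cumsum(runif(10))
)

# One distinct color per column, reused for the lines and for the legend.
cols <- rainbow(ncol(wide))

# as.matrix() makes the conversion explicit; each column becomes one series.
matplot(as.matrix(wide), type = "b", pch = 1, lty = 1, col = cols,
        xlab = "ID", ylab = "Frequency")
legend("topleft", legend = colnames(wide), col = cols, pch = 1, lty = 1)
```

Because rainbow(n) returns n different colors, sizing the palette with ncol(wide) avoids repeats as long as the palette is indexed rather than sampled.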
If you would like a ggplot2 solution, you can do this if you can shape your data into this format (see example below):
library(ggplot2)
# dummy data
set.seed(45)
df <- data.frame(x=rep(1:5, 9), val=sample(1:100, 45),
variable=rep(paste0("category", 1:9), each=5))
# plot
ggplot(data = df, aes(x=x, y=val)) + geom_line(aes(colour=variable))
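If the data starts out in wide format, it has to be reshaped into that long layout first. The sketch below is one possible way to do it with base R's stack(); the data frame wide and its column names are invented for illustration, and packages such as tidyr offer alternatives that are not shown here.

```r
library(ggplot2)

# Hypothetical wide-format data: x plus one column per category.
wide <- data.frame(
  x = 1:5,
  category1 = c(3, 5, 2, 8, 6),
  category2 = c(7, 1, 4, 4, 9),
  category3 = c(2, 6, 6, 3, 5)
)

# stack() turns the category columns into two columns: values and ind.
value_cols <- c("category1", "category2", "category3")
long <- data.frame(
  x = rep(wide$x, times = length(value_cols)),
  stack(wide[, value_cols])
)
names(long) <- c("x", "val", "variable")

# Same plotting call as above, now on the reshaped data.
p <- ggplot(data = long, aes(x = x, y = val)) +
  geom_line(aes(colour = variable))
print(p)
```

ggplot2 assigns one color per level of variable and builds the legend from the same mapping, so no manual color bookkeeping is needed.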
You have the right general strategy for doing this using base graphics, but as was pointed out, you're essentially telling R to pick a random color from a set of 10 for each line. Given that, it's not surprising that you will occasionally get two lines with the same color. Here's an example using base graphics:
plot(0,0,xlim = c(-10,10),ylim = c(-10,10),type = "n")
cl <- rainbow(5)
for (i in 1:5){
lines(-10:10,runif(21,-10,10),col = cl[i],type = 'b')
}
Note the use of type = "n" to suppress all plotting in the original call that sets up the window, and the indexing of cl inside the for loop.
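To connect this back to the legend part of the question, here is a minimal sketch that assumes eight made-up category names and random stand-in data. The same color vector that is indexed inside the loop is handed to the col argument of legend(), which is the parameter that controls the legend colors.

```r
set.seed(42)

# Hypothetical category names; replace with the real ones.
categories <- paste0("category", 1:8)

# Exactly one color per category, looked up by name.
cl <- rainbow(length(categories))
names(cl) <- categories

plot(1, type = "n", xlim = c(1, 10), ylim = c(0, 100),
     xlab = "ID", ylab = "Frequency")

for (category in categories) {
  y <- sample(0:100, 10)  # stand-in for the data of this category
  lines(1:10, y, type = "o", col = cl[[category]], lwd = 2)
}

# legend() takes the labels and the matching colors in the same order.
legend("topright", legend = categories, col = cl, lwd = 2, pch = 1, cex = 0.7)
```

Indexing a fixed palette, instead of calling sample() in each iteration, is what rules out repeated colors.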
More than one line can be drawn on the same chart by using the lines() function:
# Create the data for the chart.
v <- c(7,12,28,3,41)
t <- c(14,7,6,19,3)
# Give the chart file a name (the extension matches the png device).
png(file = "line_chart_2_lines.png")
# Plot the line chart.
plot(v,type = "o",col = "red", xlab = "Month", ylab = "Rain fall",
main = "Rain fall chart")
lines(t, type = "o", col = "blue")
# Save the file.
dev.off()
I know this is an old post to answer, but just as I came across it while searching, someone else might land here as well.
By adding colour inside the ggplot function, I could get lines with different colors for each group present in the plot:
ggplot(data=Set6, aes(x=Semana, y=Net_Sales_in_pesos, group = Agencia_ID, colour = as.factor(Agencia_ID))) +
  geom_line()
Using @Arun's dummy data :) here is a lattice solution:
library(lattice)
xyplot(val~x,type=c('l','p'),groups= variable,data=df,auto.key=TRUE)
In addition to @joran's answer using the base plot function with a for loop, you can also use base plot with lapply:
plot(0,0,xlim = c(-10,10),ylim = c(-10,10),type = "n")
cl <- rainbow(5)
invisible(lapply(1:5, function(i) lines(-10:10,runif(21,-10,10),col = cl[i],type = 'b')))
The invisible function simply serves to prevent lapply from printing a list in your console (since all we want is the iteration provided by the function, not a list). As you can see, it produces the exact same result as the for loop approach.
So why use lapply?
Though lapply has been shown to perform faster/better than for in R (e.g., see here; though see here for an instance where it's not), in this case it performs roughly the same:
Upping the number of lines to 50000 for both the lapply and for approaches took my system 46.3 and 46.55 seconds, respectively.
lapply was slightly faster, but only negligibly so. This speed difference might come in handy with larger/more complex graphing, but let's be honest, 50000 lines is probably a pretty good ceiling... So the answer to "why lapply?": it's simply an alternative approach that works equally well. :)
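For anyone who wants to repeat the comparison, the sketch below shows one way such a timing could be set up with system.time(). The helper names draw_for and draw_lapply and the line count n are assumptions made for this example, and the numbers will differ by machine and graphics device, so it is not a reproduction of the figures above.

```r
# Hypothetical helpers wrapping the two approaches from this answer.
draw_for <- function(n, cl) {
  for (i in seq_len(n)) {
    lines(-10:10, runif(21, -10, 10), col = cl[i], type = "b")
  }
}

draw_lapply <- function(n, cl) {
  invisible(lapply(seq_len(n), function(i) {
    lines(-10:10, runif(21, -10, 10), col = cl[i], type = "b")
  }))
}

n <- 200            # assumed line count; raise it for a longer benchmark
cl <- rainbow(n)

pdf(NULL)           # null PDF device, so no file is written
plot(0, 0, xlim = c(-10, 10), ylim = c(-10, 10), type = "n")
t_for <- system.time(draw_for(n, cl))
t_lapply <- system.time(draw_lapply(n, cl))
invisible(dev.off())

# Elapsed seconds for each approach.
print(c(for_loop = t_for[["elapsed"]], lapply = t_lapply[["elapsed"]]))
```

With a loop body dominated by drawing calls, the choice between for and lapply is unlikely to matter much, which is consistent with the near-identical timings reported above.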
Here is some sample code that includes a legend, if that is of interest.
# First create an empty plot.
plot(1, type = 'n', xlim = c(xminp, xmaxp), ylim = c(0, 1),
xlab = "log transformed coverage", ylab = "frequency")
# Create a list of 22 colors to use for the lines.
cl <- rainbow(22)
# Now fill plot with the log transformed coverage data from the
# files one by one.
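# Record the color used for each file so the legend can reuse it.
plotcol <- character(length(data))
for (i in seq_along(data)) {
  lines(density(log(data[[i]]$coverage)), col = cl[i])
  plotcol[i] <- cl[i]
}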
legend("topright", legend = c(list.files()), col = plotcol, lwd = 1,
cex = 0.5)
Here is another way to add lines using plot():
First, use the par(new=T) option:
http://cran.r-project.org/doc/contrib/Lemon-kickstart/kr_addat.html
To color them differently you will need the col argument.
To avoid superfluous axis descriptions, use xaxt="n" and yaxt="n" for the second and further plots.
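A minimal sketch of this approach follows, using three invented series. One detail worth stating as an assumption of the sketch: every plot() call is given the same ylim (and the same x values), because each overlaid plot otherwise computes its own axis range and the lines would no longer share a scale.

```r
set.seed(7)
x <- 1:10

# Hypothetical data: three series on the same x values.
series <- list(
  a = runif(10, 0, 20),
  b = runif(10, 0, 20),
  c = runif(10, 0, 20)
)
cols <- c("red", "blue", "darkgreen")

# Shared y range so all overlaid plots use the same scale.
ylim <- range(unlist(series))

# First plot draws the axes and labels.
plot(x, series[[1]], type = "o", col = cols[1], ylim = ylim,
     xlab = "ID", ylab = "Frequency")

# Further plots are drawn on top, with axes and labels suppressed.
for (i in 2:length(series)) {
  par(new = TRUE)
  plot(x, series[[i]], type = "o", col = cols[i], ylim = ylim,
       xaxt = "n", yaxt = "n", xlab = "", ylab = "")
}

legend("topleft", legend = names(series), col = cols, lty = 1, pch = 1)
```

Compared with lines(), this costs a few extra arguments per call, so it is mainly useful when the later layers need plot()-only features.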
If the x-axis is a factor / discrete variable and you would like to keep the order of the variable (different values corresponding to different groups) to visualise the group effect, the following code would do:
library(ggplot2)
set.seed(45)
# dummy data
df <- data.frame(x=rep(letters[1:5], 9), val=sample(1:100, 45),
variable=rep(paste0("category", 1:9), each=5))
# This ensures that x-axis (which is a factor variable) will be ordered appropriately
df$x <- ordered(df$x, levels=letters[1:5])
ggplot(data = df, aes(x=x, y=val, group=variable, color=variable)) + geom_line() + geom_point() + ggtitle("Multiple lines with unique color")
Also note that adding group=variable removes the warning: "geom_path: Each group consists of only one observation. Do you need to adjust the group aesthetic?"