R - Creating Scatter Plot from Data Frame
I've got a data frame all that looks like this:
http://pastebin.com/Xc1HEYyH
Now I want to create a scatter plot with the column headings on the x-axis and the respective values as the data points. For example:
7| x
6| x x
5| x x x x
4| x x x
3| x x
2| x x
1|
---------------------------------------
STM STM STM PIC PIC PIC
cold normal hot cold normal hot
This should be easy, but I cannot figure out how.
Regards
The basic idea, if you want to plot using Hadley's ggplot2, is to get your data into the form:
x y
col_names values
This can be done with the melt function from Hadley's reshape2. Do ?melt to see the possible arguments. Here, since we want to melt the whole data.frame, we just need:
melt(all)
# this gives the data in format:
# variable value
# 1 STM_cold 6.0
# 2 STM_cold 6.0
# 3 STM_cold 5.9
# 4 STM_cold 6.1
# 5 STM_cold 5.5
# 6 STM_cold 5.6
Here, x will be the variable column and y will be the corresponding value column.
require(ggplot2)
require(reshape2)
ggplot(data = melt(all), aes(x=variable, y=value)) +
geom_point(aes(colour=variable))
If you don't want the colours, just remove aes(colour=variable) from inside geom_point so that it becomes geom_point().
Edit: I should probably mention here that you could also replace geom_point with geom_jitter, which will give you, well, jittered points.
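The geom_jitter variant could be sketched as follows. This is only an illustration: the small data frame is a stand-in for all built from a few values of the sample data, and the width and height settings are assumptions chosen so that points spread horizontally while the y values stay exact.

```r
library(ggplot2)
library(reshape2)

# Stand-in for the poster's data frame `all` (three columns, three rows only)
all <- data.frame(
  STM_cold   = c(6.0, 6.0, 5.9),
  STM_normal = c(6.6, 6.6, 6.7),
  PIC_cold   = c(0.9, 1.0, 0.3)
)

# Long format: one row per (column name, value) pair
long <- melt(all)

# Jitter only along x; height = 0 keeps the plotted values unchanged
p <- ggplot(data = long, aes(x = variable, y = value)) +
  geom_jitter(aes(colour = variable), width = 0.2, height = 0)

print(p)
```

Leaving height at its default would also nudge the points vertically, which makes the plotted values slightly inaccurate, so setting it to 0 is usually the safer choice for this kind of plot.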
Here are two options to consider. The first uses dotplot from the "lattice" package:
library(lattice)
dotplot(values ~ ind, data = stack(all))
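To see why the formula is values ~ ind, it may help to look at what stack returns. The sketch below uses a tiny made-up data frame (small is a hypothetical name, with a few values borrowed from the sample data) rather than the full all.

```r
# Tiny stand-in data frame; `small` is a hypothetical name
small <- data.frame(
  STM_cold = c(6.0, 6.0, NA),
  PIC_cold = c(0.9, 1.0, 0.3)
)

# stack() puts all values in one column and the source column name in another
long <- stack(small)

str(long)    # two columns: values (numeric) and ind (factor)
print(long)
```

The columns are named values and ind, which is what the dotplot formula refers to. It plays the same role here as melt from reshape2, using base R only.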
The second uses dotchart from base R's "graphics" package. To use the dotchart function, you need to wrap your data.frame in as.matrix:
dotchart(as.matrix(all), labels = "")
Note that the points in this graphic are not "jittered", but rather presented in the order they were recorded. That is to say, the lowest point is the first record, and the highest point is the last record. If you zoomed into the plot for this example, you would see that you have 16 very faint horizontal lines. Each line represents one row from each column. Thus, if you look at the dots for "STM_cold" or any of the other variables that have NA values, you'll see a few blank lines at the top where there was no data available.
This has its advantages, since it might show a trend over time if the values are recorded chronologically, but it might also be a disadvantage if there are too many rows in your source data frame.
A bit of a manual version using base R graphics, just for fun.
Get the data:
test <- read.table(text="STM_cold STM_normal STM_hot PIC_cold PIC_normal PIC_hot
6.0 6.6 6.3 0.9 1.9 3.2
6.0 6.6 6.5 1.0 2.0 3.2
5.9 6.7 6.5 0.3 1.8 3.2
6.1 6.8 6.6 0.2 1.8 3.8
5.5 6.7 6.2 0.5 1.9 3.3
5.6 6.5 6.5 0.2 1.9 3.5
5.4 6.8 6.5 0.2 1.8 3.7
5.3 6.5 6.2 0.2 2.0 3.5
5.3 6.7 6.5 0.1 1.7 3.6
5.7 6.7 6.5 0.3 1.7 3.6
NA NA NA 0.1 1.8 3.8
NA NA NA 0.2 2.1 4.1
NA NA NA 0.2 1.8 3.3
NA NA NA 0.8 1.7 3.5
NA NA NA 1.7 1.6 4.0
NA NA NA 0.1 1.7 3.7",header=TRUE)
Set up the basic plot:
plot(
NA,
ylim=c(0,max(test,na.rm=TRUE)+0.3),
xlim=c(1-0.1,ncol(test)+0.1),
xaxt="n",
ann=FALSE,
panel.first=grid()
)
axis(1,at=seq_along(test),labels=names(test),lwd=0,lwd.ticks=1)
Plot some points, with some x-axis jittering so they are not printed on top of one another.
invisible(
mapply(
points,
jitter(rep(seq_along(test),each=nrow(test))),
unlist(test),
col=rep(seq_along(test),each=nrow(test)),
pch=19
)
)
Result:
Here's an example using alpha transparency on the points and getting rid of the jitter, as discussed in the comments below with Ananda.
invisible(
mapply(
points,
rep(seq_along(test),each=nrow(test)),
unlist(test),
col=rgb(0,0,0,0.1),
pch=15,
cex=3
)
)
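As a side note, points is itself vectorised over its coordinates and over col, so the same kind of plot can probably be drawn without mapply. This is a sketch of that alternative, not the approach used above; demo is a hypothetical stand-in for test with only a few of its values.

```r
# Hypothetical stand-in for `test`, using a few values from the sample data
demo <- data.frame(
  STM_cold = c(6.0, 6.0, 5.9, NA),
  PIC_cold = c(0.9, 1.0, 0.3, 0.2),
  PIC_hot  = c(3.2, 3.2, 3.2, 3.8)
)

# One x position per column, repeated for every row; y is all values in column order
x <- rep(seq_along(demo), each = nrow(demo))
y <- unlist(demo)

# Empty plot with the column names as x-axis labels
plot(
  NA,
  ylim = c(0, max(demo, na.rm = TRUE) + 0.3),
  xlim = c(1 - 0.1, ncol(demo) + 0.1),
  xaxt = "n",
  ann = FALSE
)
axis(1, at = seq_along(demo), labels = names(demo), lwd = 0, lwd.ticks = 1)

# A single vectorised call; NA values are simply not drawn
points(jitter(x), y, col = x, pch = 19)
```

A single call also means jitter sees all x positions at once, which is how it is used in the mapply version as well.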