Can I plot values for subsets against each other in R?
I have data from a neurocognitive study. We measure the outcome with three slightly different surveys, each with the same range of possible points a participant can obtain. My data are in long format, i.e. I have three rows for every participant and the variables points and outcome. The variable outcome indicates which type of survey (scd_gb, scd_rb or scd_ab) was used to measure points in a given row.
id outcome points
1 scd_gb 20
1 scd_rb 15
1 scd_ab 3
2 scd_gb 6
2 scd_rb 18
2 scd_ab 15
I would like to create a scatter plot with scd_gb on the x axis and scd_rb & scd_ab on the y axis, each in a different color.
So I have two questions. First, can I plot subsets against each other, or do I have to transform the data into wide format? Second (in general), can I plot one variable against two others?
I tried the following code, which returns an error:
library(ggplot2)
ggplot(SCD_long , aes(x = points(subset(SCD_long, outcome %in% c("scd_gb"))),
y = points(subset(SCD_long, outcome %in% c("scd_rb" , "scd_ab"))))) +
geom_point(aes(color = outcome), alpha = .5)
Error: Aesthetics must be either length 1 or the same as the data (606): colour, x, y
In addition: Warning messages:
1: In data.matrix(x) : NAs introduced by coercion
2: In data.matrix(x) : NAs introduced by coercion
3: In data.matrix(x) : NAs introduced by coercion
4: In data.matrix(x) : NAs introduced by coercion
I think both questions can be solved by data wrangling, but I wonder whether I can get the required plot without changing the format of my data.
Yes, I think making your data wide would be a good approach. Here's one way to do that, which I've applied to a long-format version of a subset of the iris data frame.
like_your_data <- structure(list(points = c(5.1, 4.9, 4.7, 4.6, 7, 6.4, 6.9, 5.5,
6.3, 5.8, 7.1, 6.3), outcome = structure(c(1L, 1L, 1L, 1L, 2L,
2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("setosa", "versicolor",
"virginica"), class = "factor"), participant = c(1L, 2L, 3L,
4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -12L), .Names = c("points",
"outcome", "participant"))
First I make a version that contains only setosa (equivalent to your scd_gb). Then I join that to a version that excludes setosa. This has the effect of adding the values for the other surveys in one column and their survey type in another. This works well with ggplot, as we can map the survey type to color.
library(dplyr)
library(ggplot2)

like_your_data %>%
  filter(outcome == "setosa") %>% # Equiv to scd_gb
  left_join(like_your_data %>%
              filter(outcome != "setosa"), by = "participant") %>%
  ## Output at this point (first rows):
  # A tibble: 8 x 5
  #   points.x outcome.x participant points.y outcome.y
  #      <dbl> <fct>           <int>    <dbl> <fct>
  # 1      5.1 setosa              1      7   versicolor
  # 2      5.1 setosa              1      6.3 virginica
  # 3      4.9 setosa              2      6.4 versicolor
  ggplot(aes(points.x, points.y, color = outcome.y)) + geom_point()
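If a fully wide layout is preferred, a rough sketch along the following lines might also work. It is not part of the approach above: it assumes tidyr (version 1.0.0 or later, for pivot_wider) is installed, and it rebuilds only the six sample rows shown in the question under the name SCD_long, so the real data may need extra handling (for example, participants with a missing survey). It also touches on the second question: one variable can be plotted against two others by adding one geom_point layer per y column.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Assumption: only the six sample rows from the question, not the full data set
SCD_long <- tibble(
  id      = c(1, 1, 1, 2, 2, 2),
  outcome = c("scd_gb", "scd_rb", "scd_ab", "scd_gb", "scd_rb", "scd_ab"),
  points  = c(20, 15, 3, 6, 18, 15)
)

# One row per participant, one column per survey type
SCD_wide <- SCD_long %>%
  pivot_wider(names_from = outcome, values_from = points)

# Two point layers share the x axis; the constant strings inside aes()
# are mapped to the colour scale and become the legend labels
p <- ggplot(SCD_wide, aes(x = scd_gb)) +
  geom_point(aes(y = scd_rb, color = "scd_rb"), alpha = .5) +
  geom_point(aes(y = scd_ab, color = "scd_ab"), alpha = .5) +
  labs(y = "points", color = "outcome")

print(p)
```

The join shown earlier keeps the y values in a single column, which scales more naturally if further survey types are added; the wide version is mostly convenient when there are only two or three fixed columns to compare.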