Can I plot values of subsets against each other in R?

Question

I have data from a neurocognitive study. We measure the outcome with three slightly different surveys, each with the same range of possible points a participant can obtain. My data are in long format, i.e. I have three rows for every participant and the variables points and outcome. The variable outcome indicates which survey (scd_gb, scd_rb or scd_ab) was used to measure points in a given row.

    id outcome points
    1  scd_gb   20
    1  scd_rb   15
    1  scd_ab   3
    2  scd_gb   6
    2  scd_rb   18
    2  scd_ab   15

I would like to create a scatter plot with scd_gb on the x axis and scd_rb and scd_ab on the y axis, each in a different color.

So I have two questions. First, can I plot subsets against each other, or do I have to transform the data into wide format? Second (more generally), can I plot one variable against two others?

I tried the following code, which returns an error.

    library(ggplot2)
    ggplot(SCD_long , aes(x = points(subset(SCD_long, outcome %in% c("scd_gb"))), 
    y = points(subset(SCD_long, outcome %in% c("scd_rb" , "scd_ab"))))) +
        geom_point(aes(color = outcome), alpha = .5)   

    Error: Aesthetics must be either length 1 or the same as the data (606): colour, x, y
    In addition: Warning messages:
    1: In data.matrix(x) : NAs introduced by coercion
    2: In data.matrix(x) : NAs introduced by coercion
    3: In data.matrix(x) : NAs introduced by coercion
    4: In data.matrix(x) : NAs introduced by coercion

I think both questions can be solved by data wrangling, but I wonder whether I can get the required plot without changing the format of my data.

Answer 1

Yes, I think making your data wide would be a good approach. Here's one way to do that, which I've applied to a long-format subset of the iris data frame.

    like_your_data <- structure(list(points = c(5.1, 4.9, 4.7, 4.6, 7, 6.4, 6.9, 5.5, 
    6.3, 5.8, 7.1, 6.3), outcome = structure(c(1L, 1L, 1L, 1L, 2L, 
    2L, 2L, 2L, 3L, 3L, 3L, 3L), .Label = c("setosa", "versicolor", 
    "virginica"), class = "factor"), participant = c(1L, 2L, 3L, 
    4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L)), class = c("tbl_df", "tbl", 
    "data.frame"), row.names = c(NA, -12L), .Names = c("points", 
    "outcome", "participant"))

First I make a version that contains only setosa (equivalent to your scd_gb). Then I join that to a version that excludes setosa. This adds the values for the other groups in one column and their survey type in another. This works well with ggplot, as we can map the survey type to color.

    library(dplyr)    # filter(), left_join(), %>%
    library(ggplot2)

    like_your_data %>%
      filter(outcome == "setosa") %>% # Equiv to scd_gb
      left_join(like_your_data %>% 
                  filter(outcome != "setosa"), by = "participant") %>%
      ## Output at this point (first 3 of 8 rows shown):
      # A tibble: 8 x 5
      #   points.x outcome.x participant points.y outcome.y 
      #      <dbl> <fct>           <int>    <dbl> <fct>     
      # 1      5.1 setosa              1      7   versicolor
      # 2      5.1 setosa              1      6.3 virginica 
      # 3      4.9 setosa              2      6.4 versicolor
      ggplot(aes(points.x, points.y, color = outcome.y)) + geom_point()
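For reference, here is a minimal sketch of the same idea using the column names from the question (id, outcome, points). The six rows are copied from the question's example table. The object names x_part, plot_data, SCD_wide, p_join and p_wide are only illustrative, and the second route assumes tidyr 1.0.0 or later for pivot_wider(). It shows two routes: the self-join described above, and a literal wide reshape that plots one column against two others with one geom_point() layer each.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Example rows taken from the question; the real SCD_long has more participants
SCD_long <- data.frame(
  id      = c(1, 1, 1, 2, 2, 2),
  outcome = c("scd_gb", "scd_rb", "scd_ab", "scd_gb", "scd_rb", "scd_ab"),
  points  = c(20, 15, 3, 6, 18, 15)
)

# Route 1: self-join. Put the scd_gb score next to every other row
# of the same participant, keeping the data long for the y values.
x_part <- SCD_long %>%
  filter(outcome == "scd_gb") %>%
  select(id, points_gb = points)

plot_data <- SCD_long %>%
  filter(outcome != "scd_gb") %>%
  inner_join(x_part, by = "id")

p_join <- ggplot(plot_data, aes(x = points_gb, y = points, color = outcome)) +
  geom_point(alpha = .5)

# Route 2: reshape to wide (one column per survey), then add one
# point layer per y variable and label the colors by hand.
SCD_wide <- pivot_wider(SCD_long, names_from = outcome, values_from = points)

p_wide <- ggplot(SCD_wide, aes(x = scd_gb)) +
  geom_point(aes(y = scd_rb, color = "scd_rb"), alpha = .5) +
  geom_point(aes(y = scd_ab, color = "scd_ab"), alpha = .5) +
  labs(y = "points", color = "outcome")
```

Printing p_join or p_wide draws the plot. The join route scales more easily if further survey types are added, since color is mapped from the outcome column rather than typed in per layer.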

Answer 1 (0, accepted, 2018-10-19 00:42:52)
