Transpose a data frame in R when rows contain two values for the same variable

Question

I'm dealing with a dataframe that contains a variable named "Marker", which holds two values for each of the samples I collected. The dataframe is, for instance, as follows:

   Sample.File Sample.Name Marker value
1            a         a_1    xxx    16
2            a         a_1    xxx    18
3            a         a_1    yyy    16
4            a         a_1    yyy    20
5            a         a_1    zzz     9
6            a         a_1    zzz    13
7            b         b_1    xxx    10
8            b         b_1    xxx    10
9            b         b_1    yyy     6
10           b         b_1    yyy    12
11           b         b_1    zzz    14
12           b         b_1    zzz    14

which is produced by the following code:

data <- data.frame(
   Sample.File = as.factor(c("a", "a", "a", "a", "a", "a", "b", "b", "b", "b",
                             "b", "b")),
   Sample.Name = as.factor(c("a_1", "a_1", "a_1", "a_1", "a_1", "a_1", "b_1",
                             "b_1", "b_1", "b_1", "b_1", "b_1")),
   Marker = as.factor(c("xxx", "xxx", "yyy", "yyy", "zzz", "zzz", "xxx",
                        "xxx", "yyy", "yyy", "zzz", "zzz")),
   value = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

The new dataframe I'd like to work with should be obtained by transposing the current data while keeping the columns Sample.File and Sample.Name for all the collected samples. Furthermore, I'd like the new variables built from the "value" column to be labelled as follows: xxx & xxx.1, yyy & yyy.1, zzz & zzz.1.

The table I'd like to achieve looks like the following:

  Sample.File Sample.Name xxx xxx.1 yyy yyy.1 zzz zzz.1
1           a         a_1  16    18  16    20   9    13
2           b         b_1  10    10   6    12  14    14

I'd like to use code that does not require writing out the labels found in the "Marker" column (since I could end up with up to 100 different labels). I tried the following code but couldn't achieve my goal:

library(dplyr)
library(tidyr)
data %>%
  gather(Sample.File, Sample.Name) %>%
  spread(value)

Error: `var` must evaluate to a single number or a column name, not a double vector
Run `rlang::last_error()` to see where the error occurred.
In addition: Warning message:
attributes are not identical across measure variables;
they will be dropped

I'd be very grateful if anybody could help with this!

Answer 1

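Here is one way to do it. We can number the rows within each Sample.File / Sample.Name / Marker group, append that number to Marker so every value gets its own column name, and then convert the data to wide format. Note that the columns come out as xxx.0 & xxx.1 rather than xxx & xxx.1.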

library(dplyr)
library(tidyr)

data2 <- data %>%
  group_by_at(vars(-value)) %>%
  mutate(N = row_number() - 1) %>%
  unite(col = "Marker", Marker, N, sep = ".") %>%
  pivot_wider(names_from = "Marker", values_from = "value") %>%
  ungroup()
data2
# # A tibble: 2 x 8
#   Sample.File Sample.Name xxx.0 xxx.1 yyy.0 yyy.1 zzz.0 zzz.1
#   <fct>       <fct>       <int> <int> <int> <int> <int> <int>
# 1 a           a_1            16    18    16    20     9    13
# 2 b           b_1            10    10     6    12    14    14
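
If the exact names from the question matter (xxx & xxx.1 instead of xxx.0 & xxx.1), one possible variation is sketched below. It is not part of the original answer: it swaps the row_number()/unite() step for base R's make.unique(), which leaves the first occurrence of a name untouched and appends .1, .2, ... to repeats. The data is rebuilt here with rep() and plain character columns only to keep the example self-contained, and `wide` is just a name chosen for illustration.

```r
library(dplyr)
library(tidyr)

# Same values as in the question; columns are plain character here
# (an assumption made for brevity, the question uses factors).
data <- data.frame(
  Sample.File = rep(c("a", "b"), each = 6),
  Sample.Name = rep(c("a_1", "b_1"), each = 6),
  Marker      = rep(rep(c("xxx", "yyy", "zzz"), each = 2), times = 2),
  value       = c(16L, 18L, 16L, 20L, 9L, 13L, 10L, 10L, 6L, 12L, 14L, 14L)
)

wide <- data %>%
  # Work within each sample so repeats are detected per sample.
  group_by(Sample.File, Sample.Name) %>%
  # make.unique() keeps the first "xxx" and renames the second to "xxx.1".
  mutate(Marker = make.unique(as.character(Marker))) %>%
  ungroup() %>%
  # One column per (now unique) Marker label, filled from value.
  pivot_wider(names_from = Marker, values_from = value)

wide
```

No marker label is typed out by hand, so the same pipeline should carry over to many more labels, as long as each sample lists its markers in the same way.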

Answer 1 status: accepted (2020-02-20 14:21:41)
