How do I move each value in a data frame column into its own column?

Question

I am using R to construct and analyze a data set produced by a colleague's Python script. The script returns the following structure, where 13 is the number of samples and 3128 is the number of trait observations, each coded as a single digit (every digit after the sample name represents one column, and its value is the coding for that trait):

13 3128
>1062_0    0000000000[...]
>1066A_0    000001010[...]
>1067A_0    000002010[...]
>1067B_0    110013010[...]
>1067C_0    000024010[...]
>1067D_0    000024010[...]
>1084A_0    200100010[...]
>1084B_0    001005110[...]
>1084C_0    000000010[...]
>1086_0    0100002100[...]
>1087_0    3002040100[...]
>1088_0    0000060111[...]
>C105_0    0000050120[...]

I am trying to get these data into a data frame with 13 rows and 3,128 columns.

I have used the read.phylip function from phylotools to read in the file above, and I can get it into a data.frame:

library(phylotools)

SL_FFR_input <- read.phylip(fil = "matrix.phy")
SL_FFR_frame <- phy2dat(SL_FFR_input)

However, this results in a data frame with two columns: V1 holds the sample names, and V2 holds a single string of all the single-digit codings.

The frame I need is shown below, where the sample names form the row names and each value has its own column.

>1062_0     0 0 0 0 0 0 0 0 0[...]
>1066A_0    0 0 0 0 0 1 0 1 0[...]
>1067A_0    0 0 0 0 0 2 0 1 0[...]
>1067B_0    1 1 0 0 1 3 0 1 0[...]
>1067C_0    0 0 0 0 2 4 0 1 0[...]
>1067D_0    0 0 0 0 2 4 0 1 0[...]
>1084A_0    2 0 0 1 0 0 0 1 0[...]
>1084B_0    0 0 1 0 0 5 1 1 0[...]
>1084C_0    0 0 0 0 0 0 0 1 0[...]
>1086_0     0 1 0 0 0 0 2 1 0[...]
>1087_0     3 0 0 2 0 4 0 1 0[...]
>1088_0     0 0 0 0 0 6 0 1 1[...]
>C105_0     0 0 0 0 0 5 0 1 2[...]

It would be a huge help if someone could point me in the right direction!

Answer 1

I recommend dplyr + tidyr. It's possible to do this with strsplit and rbind, but it's ugly.

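library(dplyr)
library(tidyr)

# Example data: one name column and one column of digit strings
df1 <- data.frame(snames = c('a','b','c'),
                  digits = c('0000000000000',
                             '0000100000000',
                             '0000000001000'))

# Split digits after each of the first 12 characters into columns X1..X13
result <- df1 %>% separate(digits, paste0('X', 1:13), sep = 1:12)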

That splits the string after each of character positions 1 to 12 in the column and names the new columns X1 through X13.
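The split positions do not have to be typed by hand. As a minimal sketch, assuming every string in the column has the same length, the width can be read from the data with nchar. The toy frame below and its sample names are made up for illustration; only the column names V1 and V2 come from the question.

```r
library(tidyr)

# Toy stand-in for the two-column frame (V1 = sample name, V2 = digit string)
df <- data.frame(V1 = c("s1", "s2", "s3"),
                 V2 = c("00102", "11013", "00024"),
                 stringsAsFactors = FALSE)

# Number of single-digit traits per sample, taken from the first row
n <- nchar(df$V2[1])

# Every row must have the same width for fixed-position splitting to make sense
stopifnot(all(nchar(df$V2) == n))

# Split after positions 1..(n - 1); convert = TRUE turns the digits into integers
wide <- separate(df, V2, into = paste0("X", seq_len(n)),
                 sep = seq_len(n - 1), convert = TRUE)
```

With convert = TRUE the new columns come back as integers rather than character strings, which is probably what is wanted for coded traits.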

EDIT: for your case, change the 13 to 3128 and the 12 to 3127, and change "digits" to whatever your column is called.
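For comparison, here is a rough sketch of the strsplit and rbind route mentioned at the top of this answer, using base R only. It assumes a frame with columns V1 (sample names) and V2 (digit strings), as described in the question; the three rows are shortened toy data, and the X1, X2, ... column names are arbitrary.

```r
# Toy stand-in for the frame returned by phy2dat(); V1 and V2 are the
# column names described in the question
df <- data.frame(V1 = c("1062_0", "1066A_0", "1067B_0"),
                 V2 = c("000000", "000001", "110013"),
                 stringsAsFactors = FALSE)

# Split each string into single characters: a list with one vector per sample
chars <- strsplit(df$V2, split = "")

# Stack the vectors row-wise into a character matrix
mat <- do.call(rbind, chars)

# Convert the digits to integers while keeping the matrix shape
storage.mode(mat) <- "integer"

# Sample names become row names; each trait gets its own column
traits <- as.data.frame(mat)
rownames(traits) <- df$V1
colnames(traits) <- paste0("X", seq_len(ncol(traits)))
```

This version has no package dependencies and puts the sample names in the row names, which matches the layout asked for in the question. It relies on all strings having the same length; rbind would silently recycle shorter vectors otherwise.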

Accepted answer, 2015-10-12 03:34:56