
How can I move each value in a data frame column into its own column?

I am using R to construct and analyze a data set produced by a Python script that a colleague wrote. The script returns the structure below, where 13 is the number of samples and 3128 is the number of trait observations. Each trait is coded as a single digit, so every digit after the sample name represents one column, and its value is the coding for that trait:

13 3128
>1062_0    0000000000[...]
>1066A_0    000001010[...]
>1067A_0    000002010[...]
>1067B_0    110013010[...]
>1067C_0    000024010[...]
>1067D_0    000024010[...]
>1084A_0    200100010[...]
>1084B_0    001005110[...]
>1084C_0    000000010[...]
>1086_0    0100002100[...]
>1087_0    3002040100[...]
>1088_0    0000060111[...]
>C105_0    0000050120[...]

I am trying to get these data into a data frame with 13 rows and 3,128 columns.

I have used the read.phylip function from phylotools to read in the file above, and I can get it into a data.frame:

SL_FFR_input <- read.phylip(fil = "matrix.phy")
SL_FFR_frame <- phy2dat(SL_FFR_input)

However, this results in a data frame with two columns: V1 holds the sample names, and V2 holds a single string of all the single-digit codings.

The frame I actually need is shown below, where the sample names form the row names and each value has its own column.

>1062_0     0 0 0 0 0 0 0 0 0[...]
>1066A_0    0 0 0 0 0 1 0 1 0[...]
>1067A_0    0 0 0 0 0 2 0 1 0[...]
>1067B_0    1 1 0 0 1 3 0 1 0[...]
>1067C_0    0 0 0 0 2 4 0 1 0[...]
>1067D_0    0 0 0 0 2 4 0 1 0[...]
>1084A_0    2 0 0 1 0 0 0 1 0[...]
>1084B_0    0 0 1 0 0 5 1 1 0[...]
>1084C_0    0 0 0 0 0 0 0 1 0[...]
>1086_0     0 1 0 0 0 0 2 1 0[...]
>1087_0     3 0 0 2 0 4 0 1 0[...]
>1088_0     0 0 0 0 0 6 0 1 1[...]
>C105_0     0 0 0 0 0 5 0 1 2[...] 

It would be a huge help if someone could point me in the right direction!

I recommend dplyr + tidyr. It is possible to do this with strsplit and rbind, but it's ugly.

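library(dplyr)
library(tidyr)
# Toy data: one name column and one column of 13 single-digit codes per row
df1 <- data.frame(snames = c('a','b','c'),
                  digits = c('0000000000000',
                             '0000100000000',
                             '0000000001000'))
# Cut "digits" after character positions 1 to 12, giving 13 one-character columns
result <- df1 %>% separate(digits, paste0('X', 1:13), sep = 1:12)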

That will split the column at character positions 1:12 and name the new columns X1 through X13.
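By default, separate() leaves the new columns as character strings, and the sample names stay in an ordinary column. A minimal sketch of one way to get integer codes with the sample names as row names, reusing the toy df1 from the answer (the convert = TRUE argument and the row-name step are additions here, not part of the original answer):

```r
library(tidyr)

# Same toy data as in the answer above.
df1 <- data.frame(snames = c("a", "b", "c"),
                  digits = c("0000000000000",
                             "0000100000000",
                             "0000000001000"))

# convert = TRUE asks separate() to run type conversion on the new columns,
# so the single-digit strings become integers.
result <- separate(df1, digits, into = paste0("X", 1:13),
                   sep = 1:12, convert = TRUE)

# Move the sample names into the row names and drop the name column.
rownames(result) <- result$snames
result$snames <- NULL
```

After this, result has one row per sample and one integer column per trait, which matches the layout asked for in the question.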

EDIT: for your case, change the 13 to 3128 and the 12 to 3127, and change "digits" to whatever your column is named (V2 in the data frame you describe).
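For comparison, the strsplit and rbind route mentioned above might look roughly like the sketch below. It reads the number of traits from the strings themselves, so no widths need to be hard-coded. The small SL_FFR_frame built here is illustrative stand-in data using only the first nine digits shown in the question, and it assumes every string has the same length and contains digits only (any other character would become NA under as.integer).

```r
# Illustrative stand-in for the two-column phy2dat() output (V1 = names, V2 = codes).
SL_FFR_frame <- data.frame(V1 = c("1062_0", "1066A_0", "1067B_0"),
                           V2 = c("000000000", "000001010", "110013010"),
                           stringsAsFactors = FALSE)

# Assumption: all code strings share one length; stop early if they do not.
n_traits <- unique(nchar(SL_FFR_frame$V2))
stopifnot(length(n_traits) == 1)

# Split each string into single characters, convert to integers,
# and stack the per-sample vectors into a matrix (one row per sample).
chars <- strsplit(SL_FFR_frame$V2, split = "")
trait_matrix <- do.call(rbind, lapply(chars, as.integer))

# Sample names become row names; traits are named X1, X2, ...
dimnames(trait_matrix) <- list(SL_FFR_frame$V1, paste0("X", seq_len(n_traits)))
wide <- as.data.frame(trait_matrix)
```

With the real file, the same steps would apply to the full V2 column, and n_traits should come out as 3128 if the header line is accurate.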
