
Turn different sized rows into columns

I am reading in a data file with many rows, each of which can have a different length, like so:

dataFile <- read.table("file.txt", as.is=TRUE);

The rows can look like this:

1 5 2 6 2 1
2 6 24
2 6 1 5 2 7 982 24 6
25 2

I need the rows to be transformed into columns. I'll then use the columns for a violin plot like so:

names(dataCol)[1] <- "x";
jpeg("violinplot.jpg", width = 1000, height = 1000);
do.call(vioplot, c(dataCol))
dev.off()

I'm assuming there will be an empty string or placeholder in any column with fewer entries than the longest column. How can this be done?

Use the fill = TRUE argument in read.table. Then, to change rows to columns, use t to transpose. Using your data, this would look like...

df <- read.table( text = "1 5 2 6 2 1
2 6 24
2 6 1 5 2 7 982 24 6
25 2
" , header = FALSE , fill = TRUE )

df
#  V1 V2 V3 V4 V5 V6  V7 V8 V9
#1  1  5  2  6  2  1  NA NA NA
#2  2  6 24 NA NA NA  NA NA NA
#3  2  6  1  5  2  7 982 24  6
#4 25  2 NA NA NA NA  NA NA NA

t(df)
#   [,1] [,2] [,3] [,4]
#V1    1    2    2   25
#V2    5    6    6    2
#V3    2   24    1   NA
#V4    6   NA    5   NA
#V5    2   NA    2   NA
#V6    1   NA    7   NA
#V7   NA   NA  982   NA
#V8   NA   NA   24   NA
#V9   NA   NA    6   NA
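To feed the result into the plotting call from the question, the NA padding probably needs to be removed first. The following is a minimal sketch, assuming the goal is one numeric vector per original row; the name dataCol is taken from the question, and the list is left unnamed so its elements are passed positionally. The plotting call itself is shown only as a comment because it assumes the vioplot package is installed.

```r
df <- read.table(text = "1 5 2 6 2 1
2 6 24
2 6 1 5 2 7 982 24 6
25 2
", header = FALSE, fill = TRUE)

# Transpose so each original row becomes a column, then drop the NA padding.
# unname() removes the V1..V4 names so they are not treated as argument names.
dataCol <- unname(lapply(as.data.frame(t(df)), function(v) v[!is.na(v)]))

# Number of values kept for each original row
lengths(dataCol)

# Assumption: with vioplot installed, the list could then be passed on, e.g.
# do.call(vioplot::vioplot, dataCol)
```

Keeping the data as a list rather than a padded matrix lets each vector retain its own length.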

EDIT: apparently read.table has a fill=TRUE option, which is way easier than my answer.
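One thing to watch with fill = TRUE: read.table decides how many columns there are by looking at the first five lines of input, unless col.names is supplied and is longer. If the longest row comes later in the file, a sketch like the one below may help. It counts the fields on every line with count.fields and sizes the table from the maximum. The sample data and the names txt, n_cols and df_wide are made up for illustration.

```r
# Hypothetical input: the widest row is the sixth line
txt <- c("1 5", "2 6", "3 7", "4 8", "5 9", "1 2 3 4")

# Count the fields on every line, then take the maximum
con <- textConnection(txt)
n_cols <- max(count.fields(con))
close(con)

# Supplying col.names of that length makes read.table allocate enough columns
df_wide <- read.table(text = txt, header = FALSE, fill = TRUE,
                      col.names = paste0("V", seq_len(n_cols)))
dim(df_wide)
```

For a file on disk, count.fields can be given the file name directly instead of a text connection.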

I've never used vioplot before, and that seems like a weird way to make a function call (instead of something like vioplot(dataCol)), but I have worked with ragged arrays before, so I'll try that.

Have you read the data in yet? That tends to be the hardest part. The code below reads the above data from a file called temp.txt into a matrix called out2:

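file <- 'temp.txt'
# Read the whole file into a single string
dat <- readChar(file, file.info(file)$size)
# Split into lines, then split each line into its fields
split1 <- strsplit(dat, "\n")
split2 <- strsplit(split1[[1]], " ")
# Length of the longest row
n <- max(unlist(lapply(split2, length)))
out <- matrix(nrow = n, ncol = length(split2))
tFun <- function(i) {
    vect <- as.numeric(split2[[i]])
    length(vect) <- n   # pad with NA up to n
    # This assigns into a local copy of out; the value of the assignment
    # (vect) is what the function returns and what sapply collects
    out[, i] <- vect
}
out2 <- sapply(1:length(split2), tFun)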

I'll try to explain what I've done: the first step is to read in every character via readChar. You then split the text into lines, then split each line into its elements to get the list split2, where each element of the list is a row of the input file.

From there you create a blank matrix of the right size for your data, then iterate through the list, padding each row with NA up to the maximum length and assigning it to a column.

It's not pretty, but it works!
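For comparison, the same split-and-pad idea can be sketched a little more compactly with readLines. This is only an illustration under a few assumptions: the temporary file stands in for temp.txt so the snippet runs on its own, and fields are assumed to be separated by one or more spaces.

```r
# Hypothetical stand-in for temp.txt so the sketch is self-contained
file <- tempfile(fileext = ".txt")
writeLines(c("1 5 2 6 2 1", "2 6 24", "2 6 1 5 2 7 982 24 6", "25 2"), file)

# One character vector of fields per line of the file
rows <- strsplit(trimws(readLines(file)), " +")

# Length of the longest row
n <- max(lengths(rows))

# Pad each row with NA up to n; sapply binds the results together as columns
out2 <- sapply(rows, function(r) {
  v <- as.numeric(r)
  length(v) <- n
  v
})
dim(out2)
```

Each column of out2 then holds one row of the input file, with NA in the unused positions.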
