R stack data frame by rows

I am trying to stack a data frame. My data is:

set.seed(1)
x<-runif(5)
y<-runif(5)
dat<-cbind(x,y)
dat<-as.data.frame(dat)
dat
          x          y
1 0.2655087 0.89838968
2 0.3721239 0.94467527
3 0.5728534 0.66079779
4 0.9082078 0.62911404
5 0.2016819 0.06178627


stack(dat)

   values ind
1  0.26550866   x
2  0.37212390   x
3  0.57285336   x
4  0.90820779   x
5  0.20168193   x
6  0.89838968   y
7  0.94467527   y
8  0.66079779   y
9  0.62911404   y
10 0.06178627   y

However, this stacks by column, i.e. it takes the y column and puts it below x. What I want to do is stack it by row, like this:

0.2655087    x
0.89838968   y
0.3721239    x
0.94467527   y
0.5728534    x
0.66079779   y
0.9082078    x
0.62911404   y
0.2016819    x
0.06178627   y

How can this be done using stack?

Thanks

A base R method that exploits the column-major storage of matrices. The columns x and y are turned into a matrix, which is transposed and then unwrapped into a vector. Since we know the structure (ordering) of the resulting vector, we build the x/y names into a new variable:

data.frame(values=c(t(data.matrix(dat))), ind=I(rep(colnames(dat), nrow(dat))))

Which returns

       values ind
1  0.26550866   x
2  0.89838968   y
3  0.37212390   x
4  0.94467527   y
5  0.57285336   x
6  0.66079779   y
7  0.90820779   x
8  0.62911404   y
9  0.20168193   x
10 0.06178627   y
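
To see why transposing before flattening interleaves the rows, here is a minimal sketch on a toy matrix. The matrix m and its values are made up for illustration and are not part of the question's data.

```r
# A 2 x 3 matrix; R fills (and stores) it column by column.
m <- matrix(1:6, nrow = 2, dimnames = list(NULL, c("a", "b", "c")))

# Flattening walks down each column in turn.
by_col <- c(m)

# After t(), each original row has become a column, so flattening
# now walks the original matrix row by row.
by_row <- c(t(m))

print(by_col)
print(by_row)
```

The same reasoning applies to c(t(data.matrix(dat))): each row of dat becomes a column of the transposed matrix, so the x and y values of one row end up next to each other.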

I wrapped the x/y vector in I() to "insulate" it, so that data.frame() returns it as a character vector rather than converting it to a factor, which was the default before R 4.0.0. Passing stringsAsFactors=FALSE to data.frame would also return the x/y vector as a character type.
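
The difference between these options can be checked directly. This is a small sketch; the names labels, a, b and f are placeholders chosen for illustration.

```r
labels <- rep(c("x", "y"), 3)

# I() marks the vector as "AsIs", so data.frame() leaves it unconverted.
a <- data.frame(ind = I(labels))

# stringsAsFactors = FALSE keeps a plain character column.
b <- data.frame(ind = labels, stringsAsFactors = FALSE)

# stringsAsFactors = TRUE converts the column to a factor.
f <- data.frame(ind = labels, stringsAsFactors = TRUE)

# The I() column is still character data, but carries the AsIs class.
print(c(is.character(a$ind), inherits(a$ind, "AsIs")))
print(is.character(b$ind))
print(is.factor(f$ind))
```

If the extra AsIs class is unwanted, stringsAsFactors=FALSE is probably the cleaner choice.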

Why do you have to use stack()? This will do the trick:

# Creating your data frame
set.seed(1)
x<-runif(5)
y<-runif(5)
dat<-cbind(x,y)
dat<-as.data.frame(dat)
dat

# Stacking the data
dat2 <- rbind(data.frame("Value"=dat$x,"Ind"="x","Row"=seq(nrow(dat))),
      data.frame("Value"=dat$y,"Ind"="y","Row"=seq(nrow(dat))))

# Ordering the data
dat2 <- dat2[order(dat2$Row),setdiff(names(dat2),"Row")]
dat2

        Value Ind
1  0.26550866   x
6  0.89838968   y
2  0.37212390   x
7  0.94467527   y
3  0.57285336   x
8  0.66079779   y
4  0.90820779   x
9  0.62911404   y
5  0.20168193   x
10 0.06178627   y
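
If stack() itself is preferred, the same stack-then-reorder idea can be sketched as follows. This is one possible approach, not the method from the answer above; row_id and by_row are names introduced here for illustration. It relies on order() leaving tied values in their original order, so x stays ahead of y within each row.

```r
set.seed(1)
dat <- data.frame(x = runif(5), y = runif(5))

# stack() returns all x values followed by all y values.
s <- stack(dat)

# Original row number of every stacked value: 1..n, repeated once per column.
row_id <- rep(seq_len(nrow(dat)), times = ncol(dat))

# Sorting by row number interleaves the columns row by row.
by_row <- s[order(row_id), ]
rownames(by_row) <- NULL
print(by_row)
```

Because row_id is built from nrow(dat) and ncol(dat), this should also work for data frames with more than two columns.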

We could simply do:

data.frame(values=matrix(t(dat)), ind=colnames(dat))

#        values ind
# 1  0.26550866   x
# 2  0.89838968   y
# 3  0.37212390   x
# 4  0.94467527   y
# 5  0.57285336   x
# 6  0.66079779   y
# 7  0.90820779   x
# 8  0.62911404   y
# 9  0.20168193   x
# 10 0.06178627   y
