I have a data frame that looks somewhat like this:
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c('a', 'b', 'c'), 3)
> df
a b c a b c a b c
1 0 1 2 5 6 2 0 1 2
2 1 2 3 6 7 3 1 2 3
3 2 3 4 7 8 4 2 3 4
There are multiple columns that have the same name. I would like to rearrange the data frame so that the columns with the same names combine into their own supercolumn, so that there are only unique column names left, for example:
> df
a b c
1 0 1 2
2 1 2 3
3 2 3 4
4 5 6 2
5 6 7 3
6 7 8 4
7 0 1 2
8 1 2 3
9 2 3 4
Any thoughts on how to do this? Thanks in advance!
This will do the trick, I suppose.
Explanation
df[,names(df) == 'a'] will select all columns named a.
unlist will convert those columns into a single vector.
unname will remove the stray element names that unlist attaches to that vector.
unique(names(df)) will give you the unique column names in df.
sapply will apply the inline function to each value of unique(names(df)).
> df
a b c a b c a b c
1 0 1 2 5 6 2 0 1 2
2 1 2 3 6 7 3 1 2 3
3 2 3 4 7 8 4 2 3 4
> sapply(unique(names(df)), function(x) unname(unlist(df[,names(df)==x])))
a b c
[1,] 0 1 2
[2,] 1 2 3
[3,] 2 3 4
[4,] 5 6 2
[5,] 6 7 3
[6,] 7 8 4
[7,] 0 1 2
[8,] 1 2 3
[9,] 2 3 4
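Note that sapply returns a matrix here, not a data frame. A minimal sketch, assuming a data frame is wanted and that every name occurs the same number of times, wraps the same expression in a small helper. The name stack_same_named is hypothetical and not part of the answer above.

```r
# Example data from the question: nine columns, three distinct names
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c('a', 'b', 'c'), 3)

# Hypothetical helper: stack all columns sharing a name into one column.
# Assumes each name occurs equally often, so sapply can simplify to a matrix.
stack_same_named <- function(d) {
  nms <- unique(names(d))
  m <- sapply(nms, function(x) unname(unlist(d[, names(d) == x])))
  as.data.frame(m)
}

res <- stack_same_named(df)
res
```

Columns are selected with a logical vector rather than by name, so the duplicated names do not cause lookups to fall back to the first match.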
My version:
library(reshape)
as.data.frame(with(melt(df), split(value, variable)))
a b c
1 0 1 2
2 1 2 3
3 2 3 4
4 0 1 2
5 1 2 3
6 2 3 4
7 0 1 2
8 1 2 3
9 2 3 4
In the step using melt I transform the dataset:
> melt(df)
Using as id variables
variable value
1 a 0
2 a 1
3 a 2
4 b 1
5 b 2
6 b 3
7 c 2
8 c 3
9 c 4
10 a 0
11 a 1
12 a 2
13 b 1
14 b 2
15 b 3
16 c 2
17 c 3
18 c 4
19 a 0
20 a 1
21 a 2
22 b 1
23 b 2
24 b 3
25 c 2
26 c 3
27 c 4
Then I split up the value column for each unique level of variable using split:
$a
[1] 0 1 2 0 1 2 0 1 2
$b
[1] 1 2 3 1 2 3 1 2 3
$c
[1] 2 3 4 2 3 4 2 3 4
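Then this only needs an as.data.frame to become the data structure you need. Note, however, that in the output shown above rows 4 to 6 read 0 1 2, 1 2 3, 2 3 4 rather than the 5 6 2, 6 7 3, 7 8 4 of the original data, so compare the result against your own data before relying on it.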
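The split idea can also be applied to the columns themselves in base R, without reshaping first. The following is only a sketch under the assumption that every name occurs the same number of times; it uses split.default, which treats the data frame as a list of columns and groups them by position, so duplicated names are not looked up by name.

```r
# Example data from the question
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c('a', 'b', 'c'), 3)

# Group the columns by name: one small data frame per distinct name
groups <- split.default(df, names(df))

# Flatten each group into a single vector, then rebuild a data frame
res2 <- as.data.frame(lapply(groups, unlist, use.names = FALSE))
res2
```

Because split orders the groups by factor level, the resulting columns come out sorted by name, which may differ from their order of first appearance in other data.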
Use %in% and some unlisting:
zz <- lapply(unique(names(df)), function(x,y) as.vector(unlist(df[which(y %in% x)])),y=names(df))
names(zz) <- unique(names(df))
as.data.frame(zz)
a b c
1 0 1 2
2 1 2 3
3 2 3 4
4 5 6 2
5 6 7 3
6 7 8 4
7 0 1 2
8 1 2 3
9 2 3 4
I would sort the data.frame by column name, unlist, and use as.data.frame on a matrix:
A <- unique(names(df))[order(unique(names(df)))]
B <- matrix(unlist(df[, order(names(df))], use.names=FALSE), ncol = length(A))
B <- setNames(as.data.frame(B), A)
B
# a b c
# 1 0 1 2
# 2 1 2 3
# 3 2 3 4
# 4 5 6 2
# 5 6 7 3
# 6 7 8 4
# 7 0 1 2
# 8 1 2 3
# 9 2 3 4
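The matrix step relies on every name occurring equally often, since matrix fills column by column and each output column must receive the same number of values. A hedged sketch of the same approach with that assumption checked up front (the counts variable and the check are additions, not part of the answer above):

```r
# Example data from the question
df <- data.frame(0:2, 1:3, 2:4, 5:7, 6:8, 2:4, 0:2, 1:3, 2:4)
colnames(df) <- rep(c('a', 'b', 'c'), 3)

# Added check: every column name must appear the same number of times
counts <- table(names(df))
stopifnot(length(unique(as.vector(counts))) == 1)

# Same steps as above: sort columns by name, flatten, refill column-wise
A <- sort(unique(names(df)))
B <- matrix(unlist(df[, order(names(df))], use.names = FALSE), ncol = length(A))
B <- setNames(as.data.frame(B), A)
B
```

order keeps tied names in their original relative order, so the blocks for each name are stacked in the order the columns appear in df.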
I'm not at the computer now, so I can't test this, but this might work:
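# Stack same-named columns into one vector each, then bind the vectors side by side
do.call(cbind,
        lapply(setNames(nm = unique(names(df))),
               function(x) unlist(df[, names(df) == x], use.names = FALSE)))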