Condensing a data frame in R

Question

I have a simple question, and I really appreciate everyone's input; you have been a great help to my project. It is another question about data frames in R.

I have a data frame that looks something like this:

    C <- c("","","","","","","","A","B","D","A","B","D","A","B","D")
    D <- c(NA,NA,NA,2,NA,NA,1,1,4,2,2,5,2,1,4,2)
    G <- list(C=C,D=D)
    T <- as.data.frame(G)
    T
       C  D
    1    NA
    2    NA
    3    NA
    4     2
    5    NA
    6    NA
    7     1
    8  A  1
    9  B  4
    10 D  2
    11 A  2
    12 B  5
    13 D  2
    14 A  1
    15 B  4
    16 D  2

I would like to condense all the repeated characters into one, so that the result looks similar to this:

Of course, the data is all the same; it is just condensed, with new columns formed to hold it. I am sure there is an easy way to do this, but I haven't seen anything about it in the books I have looked through!

EDIT: I edited the example because it wasn't working with the answers so far. I wonder if the NAs, the blanks, and the unevenness caused by the blanks are contributing?

Answer 1

This seems to get the results you are looking for. I'm assuming it's OK to remove the NA values, since that matches the desired output you show.

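    # Drop the rows where D is NA
    T <- na.omit(T)
    # Number each occurrence of a value of C within its group (1, 2, 3, ...)
    T$ind <- ave(1:nrow(T), T$C, FUN = seq_along)
    # Base R: one row per value of C, one D column per occurrence
    reshape(T, direction = "wide", idvar = "C", timevar = "ind")
    #    C D.1 D.2 D.3
    # 4      2   1  NA
    # 8  A   1   2   1
    # 9  B   4   5   4
    # 10 D   2   2   2

    # Alternative with reshape2
    library(reshape2)
    dcast(T, C ~ ind, value.var = "D", fill = "")
    #   C 1 2 3
    # 1   2 1  
    # 2 A 1 2 1
    # 3 B 4 5 4
    # 4 D 2 2 2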
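
To see why the ind column makes the reshape work, here is a minimal sketch of what ave() with seq_along computes, using a small made-up vector (the names grp and occurrence are illustrative and not from the question). It numbers each repeat of a value within its group, and that counter is what ends up as the new column index.

```r
# Toy grouping vector; 'grp' and 'occurrence' are hypothetical names.
grp <- c("A", "B", "D", "A", "B", "D", "A")

# ave() applies FUN within each group and returns the results in the
# original row order, so seq_along() gives 1 for the first "A",
# 2 for the second "A", and so on.
occurrence <- ave(seq_along(grp), grp, FUN = seq_along)

print(data.frame(grp, occurrence))
```

Because the counter restarts for every distinct value, groups of different sizes simply get fewer filled columns after reshaping, which is where the NA in the first output row comes from.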

Answer 2

Here's a reshape solution:

    require(reshape)
    cast(T, C ~ ., function(x) x)

Answer 3

Changed T to my.df to avoid a bad habit (T is also R's shorthand for TRUE). This returns a list, which may not be what you want, but you can convert from there.

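    C <- c("A","B","D","A","B","D","A","B","D")
    D <- c(1,4,2,2,5,2,1,4,2)
    my.df <- data.frame(id=C,val=D)

    # Identity function: return each group's values unchanged
    ret <- function(x) x
    # Group val by id; the result is a list-like "by" object
    by.df <- by(my.df$val,INDICES=my.df$id,ret)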
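
One possible way to do that conversion, sketched with base R only: split() produces the same per-id grouping as a plain named list, and the groups can then be bound together as rows. This assumes every id has the same number of values, as in this example; the names groups, mat, and wide, and the val.N column names, are illustrative choices rather than part of the original answer.

```r
# Same example data as in the answer above.
my.df <- data.frame(id  = c("A","B","D","A","B","D","A","B","D"),
                    val = c(1,4,2,2,5,2,1,4,2))

# split() returns a named list with one numeric vector per id.
groups <- split(my.df$val, my.df$id)

# Bind the vectors as rows of a matrix (assumes equal group sizes).
mat <- do.call(rbind, groups)
colnames(mat) <- paste0("val.", seq_len(ncol(mat)))

# Move the ids from the row names into a regular column.
wide <- data.frame(id = rownames(mat), mat, row.names = NULL)
print(wide)
```

If the groups have unequal lengths, as in the question's data with blanks, rbind() would recycle values silently, so the reshape approach in Answer 1 is the safer choice there.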

Answer 1: 1 2014-07-14 18:01:45
Answer 2: 1 2011-07-10 00:06:01
Answer 3: 1 2011-07-10 00:15:26
