How to group a data frame based on a pre-specified column in R

Question

I have data that looks like this:

 library(zoo)
 dt <- read.csv("http://dpaste.com/1612639/plain/",header=FALSE,fill=FALSE,na.strings = "")
 dt <- na.locf(dt)


> dt
   V1  V2                V3                V4       V5
1 FOO yyy Unigene126925_All Unigene137063_All 0.238087
2 FOO yyy Unigene126925_All  Unigene24551_All 0.374231
3 FOO yyy Unigene126925_All  Unigene31835_All 0.367897
4 BAR xxx Unigene126925_All Unigene165366_All 0.247844
5 BAR xxx Unigene126925_All Unigene111784_All 0.344493

What I want to do is group the rows by V1, so that each group is a data frame holding the values of columns V3 through V5 from the data above. It should look like this:

Group FOO

     V1               V2                V3             
1 Unigene126925_All Unigene137063_All 0.238087
2 Unigene126925_All  Unigene24551_All 0.374231
3 Unigene126925_All  Unigene31835_All 0.367897

Group BAR

   V1               V2                V3   
1 Unigene126925_All Unigene165366_All 0.247844
2 Unigene126925_All Unigene111784_All 0.344493

How can I achieve that in R? Later, I will apply some function to each group's data frame.

Answer 1

Use split:

> split(dt[, 3:5], dt$V1)
$BAR
                 V3                V4       V5
4 Unigene126925_All Unigene165366_All 0.247844
5 Unigene126925_All Unigene111784_All 0.344493

$FOO
                 V3                V4       V5
1 Unigene126925_All Unigene137063_All 0.238087
2 Unigene126925_All  Unigene24551_All 0.374231
3 Unigene126925_All  Unigene31835_All 0.367897

You may now run some function over this list and combine the results back together with unsplit.
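The split / apply / unsplit round trip can be sketched as follows. This is only an illustration under stated assumptions: the data.frame() call rebuilds the five example rows by hand instead of reading the original URL, and add_share is a hypothetical per-group function, not something from the question.

```r
# Stand-in for the question's data: the five rows typed in by hand.
dt <- data.frame(
  V1 = c("FOO", "FOO", "FOO", "BAR", "BAR"),
  V2 = c("yyy", "yyy", "yyy", "xxx", "xxx"),
  V3 = "Unigene126925_All",
  V4 = c("Unigene137063_All", "Unigene24551_All", "Unigene31835_All",
         "Unigene165366_All", "Unigene111784_All"),
  V5 = c(0.238087, 0.374231, 0.367897, 0.247844, 0.344493),
  stringsAsFactors = FALSE
)

# One data frame per level of V1, keeping only columns V3..V5.
groups <- split(dt[, 3:5], dt$V1)

# Hypothetical per-group function: each row's share of its group's V5 total.
add_share <- function(g) {
  g$share <- g$V5 / sum(g$V5)
  g
}

# Apply the function to every group.
processed <- lapply(groups, add_share)

# Put the pieces back together in the original row order.
result <- unsplit(processed, dt$V1)
print(result)
```

unsplit needs the same grouping vector that was passed to split (here dt$V1), and the per-group function has to return a data frame with the same number of rows as its input for the reassembly to line up.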

Answer 2

Use dplyr; it's very intuitive.

library(dplyr)
dt %>%
  group_by(V1) %>%
  summarise(newvar = sum(V5))

Here sum(V5) is only an example; replace it with whatever function you want to apply to each group.

Answer 3

If I remember correctly (you may want to slice the first argument down to the columns you need, e.g. dropping "V1"):

split(dt, dt$V1)
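A minimal sketch of that slicing, assuming the five example rows are rebuilt by hand (the data.frame() call is a stand-in for the original CSV). Selecting V3 to V5 in the first argument keeps the grouping column out of each piece, while dt$V1 still decides the groups:

```r
# Stand-in for the question's data: the five rows typed in by hand.
dt <- data.frame(
  V1 = c("FOO", "FOO", "FOO", "BAR", "BAR"),
  V2 = c("yyy", "yyy", "yyy", "xxx", "xxx"),
  V3 = "Unigene126925_All",
  V4 = c("Unigene137063_All", "Unigene24551_All", "Unigene31835_All",
         "Unigene165366_All", "Unigene111784_All"),
  V5 = c(0.238087, 0.374231, 0.367897, 0.247844, 0.344493),
  stringsAsFactors = FALSE
)

# Slice the first argument down to the wanted columns before splitting.
groups <- split(dt[, c("V3", "V4", "V5")], dt$V1)

names(groups)        # group labels, one per level of V1
print(groups$FOO)    # the FOO group, without V1 and V2
```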

3 answers

Answer 1
3 accepted 2014-02-13 10:13:21

Answer 2
3 2014-02-13 10:34:32

Answer 3
1 2014-02-13 10:14:30
