
Grouping rows or columns of data in R

I'm trying to import some data into R, but I'm not having much luck grouping related rows of data together.

Example: There is a set of problems, such as {A, B, C, D}. Each problem has two variables of interest that are measured: "x" and "y". Each variable is summarised by some simple statistics: min, max, mean, stddev.

So, my input data has the form:

      Min  Max  Mean  StdDev
A
  x   3    10   6.6   2.1 
  y   2    5    3.2   1.7
B
  x   3    10   6.6   2.1 
  y   2    5    3.2   1.7
C
  x   3    10   6.6   2.1 
  y   2    5    3.2   1.7
D
  x   3    10   6.6   2.1 
  y   2    5    3.2   1.7

Is there any way to preserve the structure of this data in R? A similar problem is creating groups of columns (for example, flipping the table 90 degrees to the right).

You actually have several options, such as a data frame (relational table) or a list. The following code shows how to create a data frame and then split it into a list with the elements {x,y} or {A,B,C,D}:

> txt <- "      Min  Max  Mean  StdDev
+ A
+   x   3    10   6.6   2.1 
+   y   2    5    3.2   1.7
+ B
+   x   3    10   6.6   2.1 
+   y   2    5    3.2   1.7
+ C
+   x   3    10   6.6   2.1 
+   y   2    5    3.2   1.7
+ D
+   x   3    10   6.6   2.1 
+   y   2    5    3.2   1.7
+ "
> 
> data <- head(readLines(textConnection(txt)),-1)
> fields <- strsplit(sub("^[ ]+","",data[!nchar(data)==1]),"[ ]+")
> DF <- `names<-`(data.frame(rep(data[nchar(data)==1],each=2), ## letters
+                            do.call(rbind,fields[-1])),       ## data
+                 c("Letter","xy",fields[[1]]))                ## colnames
> split(DF,DF$xy)
$x
  Letter xy Min Max Mean StdDev
1      A  x   3  10  6.6    2.1
3      B  x   3  10  6.6    2.1
5      C  x   3  10  6.6    2.1
7      D  x   3  10  6.6    2.1

$y
  Letter xy Min Max Mean StdDev
2      A  y   2   5  3.2    1.7
4      B  y   2   5  3.2    1.7
6      C  y   2   5  3.2    1.7
8      D  y   2   5  3.2    1.7

> split(DF,DF$Letter)
$A
  Letter xy Min Max Mean StdDev
1      A  x   3  10  6.6    2.1
2      A  y   2   5  3.2    1.7

$B
  Letter xy Min Max Mean StdDev
3      B  x   3  10  6.6    2.1
4      B  y   2   5  3.2    1.7

$C
  Letter xy Min Max Mean StdDev
5      C  x   3  10  6.6    2.1
6      C  y   2   5  3.2    1.7

$D
  Letter xy Min Max Mean StdDev
7      D  x   3  10  6.6    2.1
8      D  y   2   5  3.2    1.7
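
A possible follow-up, offered as a minimal sketch rather than part of the original answer: the columns parsed with strsplit() hold text (or factors in older R versions), so they may need converting before any arithmetic. The sketch below rebuilds the same DF directly from the question's values, converts the statistics to numbers, builds a nested list so that a value can be reached by problem and then by variable, and uses base R's reshape() to get the "groups of columns" layout the question mentions. The names stat_cols, nested and wide are made up for illustration.

```r
# Rebuild the answer's data frame directly (values copied from the question).
# The statistics are deliberately stored as text, as the parsing step leaves them.
DF <- data.frame(
  Letter = rep(c("A", "B", "C", "D"), each = 2),
  xy     = rep(c("x", "y"), times = 4),
  Min    = rep(c("3", "2"), times = 4),
  Max    = rep(c("10", "5"), times = 4),
  Mean   = rep(c("6.6", "3.2"), times = 4),
  StdDev = rep(c("2.1", "1.7"), times = 4),
  stringsAsFactors = FALSE
)

# Convert the statistics to numbers; going through as.character() also
# handles the case where the columns are factors.
stat_cols <- c("Min", "Max", "Mean", "StdDev")
DF[stat_cols] <- lapply(DF[stat_cols], function(col) as.numeric(as.character(col)))

# Nested list: one element per problem, each split again by variable.
nested <- lapply(split(DF, DF$Letter), function(d) split(d[stat_cols], d$xy))
nested$B$y$Mean   # mean of variable "y" for problem B

# Column groups: one row per problem, one block of statistics per variable.
wide <- reshape(DF, idvar = "Letter", timevar = "xy", direction = "wide")
names(wide)
```

Whether the long data frame, the nested list or the wide table is the better fit depends on what is done with the data next; the long form is usually the easiest starting point because the other two can be derived from it.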


 