How to subset/split a table based on the values of one column in R?

Question

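The data can be found at: https : nlschools =0 , or as nlschools in the MASS library.

I want to split the table by the value of nlschools$SES, dividing it into SES <= 30, 30 < SES <= 40, and SES > 40, while keeping all columns.

I have tried cut with intervals such as 0:30, but the result was very confusing and did not contain the full set of columns.

I hope I have described what I am trying to achieve clearly enough.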

Answer 1

Try this:

indx <- with(nlschools,cut(SES, c(-Inf, 30, 40, Inf)))
lst <- split(nlschools, indx)

lapply(lst, head,2)
#$`(-Inf,30]`
#  lang   IQ class GS SES COMB
#1   46 15.0   180 29  23    0
#2   45 14.5   180 29  10    0

#$`(30,40]`
#  lang   IQ class GS SES COMB
#37   39 11.0  1082 25  33    1
#39   43 10.5  1280 31  33    1

#$`(40, Inf]`
#  lang IQ class GS SES COMB
#49   31  9  1280 31  50    1
#71   45 15  1880 28  50    0
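The break points passed to cut decide which side of each boundary a value falls on. By default the intervals are closed on the right, so SES == 30 goes to the first group and SES == 40 to the second. A minimal sketch on a small made-up data frame (the `toy` data and its values are hypothetical and are not taken from nlschools):

```r
# Hypothetical toy data; only the column name SES mirrors nlschools.
toy <- data.frame(id = 1:6, SES = c(10, 30, 31, 40, 41, 50))

# Default right = TRUE: each interval excludes its lower bound
# and includes its upper bound.
grp <- cut(toy$SES, breaks = c(-Inf, 30, 40, Inf))

# split() keeps every column and returns one data frame per interval.
parts <- split(toy, grp)

# Number of rows that ended up in each interval.
counts <- sapply(parts, nrow)
counts
```

Using -Inf and Inf as the outer breaks means no row can fall outside the intervals, which avoids the NA groups that a narrow range of breaks could produce.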

If you need them as separate datasets:

list2env(setNames(lst, c("sesLOW", "sesMED", "sesHIGH")), envir=.GlobalEnv)
# <environment: R_GlobalEnv>


head(sesLOW,3)
#  lang   IQ class GS SES COMB
#1   46 15.0   180 29  23    0
#2   45 14.5   180 29  10    0
#3   33  9.5   180 29  15    0

Checking the results against @Ujjwal's post:

identical(sesLOW, one)
#[1] TRUE

identical(sesMED, two)
#[1] TRUE

identical(sesHIGH, three)
#[1] TRUE

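However, doing all the analysis/calculations within the list is much easier than working with separate datasets. You can even save the list elements individually using lapply with write.table/write.csv.

Update

If you want to create a new column in each element of the list: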
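As a rough illustration of working inside the list, the sketch below computes one summary per group and then writes each element to its own CSV file. The `toy` data, the `ses_` file prefix, and the use of tempdir() are assumptions made for this example only.

```r
# Hypothetical toy data standing in for nlschools.
toy <- data.frame(lang = c(46, 45, 39, 43, 31, 45),
                  SES  = c(23, 10, 33, 33, 50, 50))

# Split into a named list of data frames, one per SES band.
lst <- split(toy, cut(toy$SES, c(-Inf, 30, 40, Inf),
                      labels = c("low", "med", "high")))

# One summary value per group, computed in a single call.
group_means <- sapply(lst, function(d) mean(d$lang))

# Save each list element to its own file in a temporary directory.
out_dir <- tempdir()
invisible(lapply(names(lst), function(n) {
  write.csv(lst[[n]],
            file.path(out_dir, paste0("ses_", n, ".csv")),
            row.names = FALSE)
}))
```

Because the list is named, the same names drive both the summary vector and the output file names, so nothing has to be repeated by hand for each group.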

names(lst) <- c("low","med", "high")#no need to rename the list elements though. You can directly use it as a vector in the `Map`
lst2 <- Map(function(x, y) {x[,"SEScat"] <- y;x }, lst, names(lst))
lapply(lst2, head,2)
#$low
#  lang   IQ class GS SES COMB SEScat
#1   46 15.0   180 29  23    0    low
#2   45 14.5   180 29  10    0    low

#$med
#  lang   IQ class GS SES COMB SEScat
#37   39 11.0  1082 25  33    1    med
#39   43 10.5  1280 31  33    1    med

#$high
#  lang IQ class GS SES COMB SEScat
#49   31  9  1280 31  50    1   high
#71   45 15  1880 28  50    0   high

Answer 2

Try:

one <- subset(nlschools, SES <= 30)
two <- subset(nlschools, SES > 30 & SES <= 40)
three <- subset(nlschools, SES > 40)
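One property worth checking with this approach is that the three conditions cover every row exactly once. The sketch below does that on a made-up data frame (the `toy` data is hypothetical); it also shows that subset() silently drops rows where the condition evaluates to NA.

```r
# Hypothetical toy data with one missing SES value.
toy <- data.frame(id = 1:6, SES = c(10, 30, 31, 40, 41, NA))

# Inside subset(), columns can be referred to by name directly.
one   <- subset(toy, SES <= 30)
two   <- subset(toy, SES > 30 & SES <= 40)
three <- subset(toy, SES > 40)

# Rows with a missing SES belong to none of the three subsets,
# so the pieces add up to the number of non-missing values.
n_parts   <- nrow(one) + nrow(two) + nrow(three)
n_present <- sum(!is.na(toy$SES))
n_parts == n_present
```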

Answer 3

In response to your comment to @akrun, try:

> ddf$SEScat = with(ddf, ifelse(SES<=30,'low', ifelse(SES<=40, 'med', 'high')))
> ll = split(ddf, ddf$SEScat)

> head(ll[[1]])
      X lang   IQ class GS SES COMB SEScat
49   49   31  9.0  1280 31  50    1   high
71   71   45 15.0  1880 28  50    0   high
82   82   47 12.0  1880 28  50    0   high
85   85   33 13.0  1880 28  50    0   high
90   90   31 10.5  1880 28  50    0   high
145 145   50 13.5  2680 21  45    0   high
> head(ll[[2]])
  X lang   IQ class GS SES COMB SEScat
1 1   46 15.0   180 29  23    0    low
2 2   45 14.5   180 29  10    0    low
3 3   33  9.5   180 29  15    0    low
4 4   46 11.0   180 29  23    0    low
5 5   20  8.0   180 29  10    0    low
6 6   30  9.5   180 29  10    0    low
> head(ll[[3]])
    X lang   IQ class GS SES COMB SEScat
37 37   39 11.0  1082 25  33    1    med
39 39   43 10.5  1280 31  33    1    med
40 40   25  8.5  1280 31  33    1    med
42 42   41 11.0  1280 31  37    1    med
45 45   21  9.5  1280 31  40    1    med
52 52   29  8.5  1280 31  40    1    med
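Note that in the output above the list elements come out in alphabetical order (high, low, med), because SEScat is a character column. A possible variation, sketched here on hypothetical `toy` data, is to build the category with cut and explicit labels, which yields a factor whose levels keep the low/med/high order:

```r
# Hypothetical toy data; only the column name SES mirrors the question.
toy <- data.frame(id = 1:6, SES = c(50, 23, 33, 40, 30, 45))

# cut() with labels replaces the nested ifelse() and returns a factor.
toy$SEScat <- cut(toy$SES, breaks = c(-Inf, 30, 40, Inf),
                  labels = c("low", "med", "high"))

# split() follows the factor level order rather than alphabetical order.
ll <- split(toy, toy$SEScat)
names(ll)
```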

3 answers

Answer 1
2 2014-10-07 11:57:13

Answer 2
1 Accepted 2014-10-07 11:55:19

Answer 3
1 2014-10-07 12:58:59
