
Split dataframe in R by row

I have a long data frame like this:

  Row  Conc   group
  1     2.5    A
  2     3.0    A
  3     4.6    B
  4     5.0    B
  5     3.2    C
  6     4.2    C
  7     5.3    D
  8     3.4    D

... ...

The actual data have hundreds of rows. I would like to split the data frame into two parts: groups A to C, and group D. I searched the web and found several solutions, but none of them apply to my case.

How to split a data frame?

For example, Case 1:

x = data.frame(num = 1:26, let = letters, LET = LETTERS)
set.seed(10)
split(x, sample(rep(1:2, 13)))

I don't want to split by an arbitrary number.

Case 2: Split by level/factor

data2 <- data[data$sum_points == 2500, ]

I don't want to split by a single factor either. Sometimes I want to combine many levels together.

Case 3: Select by row number

newdf <- mydf[1:3,]

The actual data have hundreds of rows. I don't know the row numbers. I just know the level I would like to split at.

It sounds like you want two data frames, where one has (A, B, C) in it and one has just D. In that case you could do:

Data1 <- subset(Data, group %in% c("A","B","C"))
Data2 <- subset(Data, group=="D")

Correct me if you were asking something different.
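If only the level to cut at is known, as the question mentions, one possible sketch is to compare each row's position among the sorted levels against the cut level. The names `lvls`, `cut_level` and `is_after` are illustrative, the sample data is the eight rows shown in the question, and the sketch assumes that alphabetical order of the levels is the intended order.

```r
# Sample data copied from the eight rows shown in the question
Data <- data.frame(
  Row   = 1:8,
  Conc  = c(2.5, 3.0, 4.6, 5.0, 3.2, 4.2, 5.3, 3.4),
  group = c("A", "A", "B", "B", "C", "C", "D", "D")
)

# Assumption: alphabetical order of the levels is the intended order
lvls <- sort(unique(as.character(Data$group)))
cut_level <- "D"  # hypothetical: first level that belongs to the second part

# TRUE for rows whose level sits at or after the cut level
is_after <- match(as.character(Data$group), lvls) >= match(cut_level, lvls)

parts <- split(Data, is_after)
Data1 <- parts[["FALSE"]]  # levels before the cut level (A, B, C)
Data2 <- parts[["TRUE"]]   # the cut level and everything after it (D)
```

With this approach only `cut_level` has to change when the split point moves; the levels on either side do not need to be listed by hand.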

For those who end up here through internet search engines time after time, the answer to the question in the title is:

x <- data.frame(num = 1:26, let = letters, LET = LETTERS)

split(x, sort(as.numeric(rownames(x))))

This assumes that your data frame has numerically ordered row names. split(x, rownames(x)) also works, but the pieces come back sorted by row name as text (1, 10, 11, ...) rather than in the original row order.
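The reordering happens because split() converts its grouping argument to a factor, and character row names are then sorted as text rather than as numbers. A small sketch of the difference, using `seq_len(nrow(x))` as an alternative that does not depend on the row names at all (the names `by_name` and `by_pos` are illustrative):

```r
x <- data.frame(num = 1:26, let = letters, LET = LETTERS)

# Character row names become factor levels sorted as text: "1", "10", "11", ...
by_name <- split(x, rownames(x))
head(names(by_name), 4)

# Splitting on the row position keeps the original row order
by_pos <- split(x, seq_len(nrow(x)))
head(names(by_pos), 4)
```

Both calls return a list of 26 one-row data frames; only the order of the list elements differs.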

You may consider using the recode() function from the "car" package.

# Load the library and make up some sample data
library(car)
set.seed(1)
dat <- data.frame(Row = 1:100,
                  Conc = runif(100, 0, 10),
                  group = sample(LETTERS[1:10], 100, replace = TRUE))

Currently, dat$group contains the upper case letters A to J. Imagine we wanted the following four groups:

  • "one" = A, B, C
  • "two" = D, E, J
  • "three" = F, I
  • "four" = G, H

Now, use recode() (note the semicolons and the nested quotes).

recodes <- recode(dat$group, 
                 'c("A", "B", "C") = "one"; 
                  c("D", "E", "J") = "two"; 
                  c("F", "I") = "three"; 
                  c("G", "H") = "four"')
split(dat, recodes)
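If loading the car package is not desirable, the same grouping can be sketched in base R with a named lookup vector. The names `lookup`, `new_group` and `pieces` are illustrative and not part of any package; the sample data repeats the one above.

```r
set.seed(1)
dat <- data.frame(Row = 1:100,
                  Conc = runif(100, 0, 10),
                  group = sample(LETTERS[1:10], 100, replace = TRUE))

# Hypothetical lookup: names are the original levels, values the new groups
lookup <- c(A = "one", B = "one", C = "one",
            D = "two", E = "two", J = "two",
            F = "three", I = "three",
            G = "four", H = "four")

# Index the lookup by each row's level, then split on the result
new_group <- unname(lookup[as.character(dat$group)])
pieces <- split(dat, new_group)
```

A level that is missing from `lookup` would map to NA, and split() drops rows whose group is NA, so every level should be covered.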

With base R, we can pass split() a logical condition on the column we want to split on (here df is the data frame from the question).

split(df, df$group == "D")

Output

$`FALSE`
  Row Conc group
1   1  2.5     A
2   2  3.0     A
3   3  4.6     B
4   4  5.0     B
5   5  3.2     C
6   6  4.2     C

$`TRUE`
  Row Conc group
7   7  5.3     D
8   8  3.4     D

If you wanted to split on multiple letters, you could do:

split(df, df$group %in% c("A", "D"))
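The pieces produced this way are named `FALSE` and `TRUE`, which can be hard to read later on. One possible sketch wraps the condition in factor() with explicit labels; `df` is rebuilt here from the question's sample rows, and the labels `rest` and `A_or_D` are made up for illustration.

```r
df <- data.frame(
  Row   = 1:8,
  Conc  = c(2.5, 3.0, 4.6, 5.0, 3.2, 4.2, 5.3, 3.4),
  group = c("A", "A", "B", "B", "C", "C", "D", "D")
)

# Label the logical condition so the list elements get descriptive names
in_set <- factor(df$group %in% c("A", "D"),
                 levels = c(FALSE, TRUE),
                 labels = c("rest", "A_or_D"))

out <- split(df, in_set)
names(out)
```

The elements can then be accessed as `out$rest` and `out$A_or_D` instead of with backticked logical names.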

Another option is to use group_split from dplyr, but you will need to create a grouping variable first for the split.

library(dplyr)

df %>% 
  mutate(spl = ifelse(group == "D", 1, 0)) %>% 
  group_split(spl, .keep = FALSE)

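Notice: technical posts on this site are licensed under CC BY-SA 4.0. If you repost this content, please credit this site's URL or the original address. For any questions, contact: yoyou2525@163.com.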

 