
dplyr: Split data_frame into two randomly

How can I randomly split a data_frame into two parts without creating an index? sample_n gets me one part, but how can I collect the other part?

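You can do an anti_join with the original as the x data frame and the sampled part as the y data frame. It returns the rows of x that have no match in y, so it only gives the exact complement when the by column (x here) uniquely identifies each row. A small example: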

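library(dplyr)

# data_frame() is deprecated in current dplyr/tibble; tibble() is its replacement
df <- data_frame(x=1:20,y=runif(20))
dfy <- df %>% sample_n(10, replace=FALSE)  # one random half
dfx <- anti_join(df, dfy, by="x")          # rows of df whose x is not in dfy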

This results in the following data frames:

> df
Source: local data frame [20 x 2]

    x          y
1   1 0.64147504
2   2 0.35766839
3   3 0.44875782
4   4 0.01905876
5   5 0.85655599
6   6 0.88191481
7   7 0.46532067
8   8 0.09831802
9   9 0.31158184
10 10 0.39504048
11 11 0.81358862
12 12 0.41702158
13 13 0.80441008
14 14 0.69928890
15 15 0.19040897
16 16 0.94120853
17 17 0.65289448
18 18 0.46844427
19 19 0.63177479
20 20 0.58288923

One half:

> dfx
Source: local data frame [10 x 2]

    x         y
1  19 0.6317748
2  17 0.6528945
3  16 0.9412085
4  15 0.1904090
5  14 0.6992889
6  11 0.8135886
7   7 0.4653207
8   6 0.8819148
9   5 0.8565560
10  3 0.4487578

The other half:

> dfy
Source: local data frame [10 x 2]

    x          y
1  18 0.46844427
2   8 0.09831802
3  12 0.41702158
4   4 0.01905876
5   2 0.35766839
6  10 0.39504048
7  13 0.80441008
8   9 0.31158184
9   1 0.64147504
10 20 0.58288923
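
To confirm that the two halves really partition the original, a quick check along these lines may help. This is only a sketch: the fixed seed and the use of tibble() in place of data_frame() are assumptions added for reproducibility and to avoid the deprecation warning, and are not part of the answer above.

```r
library(dplyr)

set.seed(42)  # assumption: fixed seed so the split can be reproduced

df  <- tibble(x = 1:20, y = runif(20))
dfy <- df %>% sample_n(10, replace = FALSE)  # one random half
dfx <- anti_join(df, dfy, by = "x")          # rows of df not in dfy

# stopifnot() raises an error if any of these conditions is FALSE.
stopifnot(
  nrow(semi_join(dfx, dfy, by = "x")) == 0,  # the halves share no rows
  nrow(dfx) + nrow(dfy) == nrow(df),         # no row was lost
  setequal(c(dfx$x, dfy$x), df$x)            # together they cover df
)
```

If x contained repeated values, the second check could fail, because anti_join drops every row of df whose x appears in dfy, not just the sampled ones.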
