Is there a Python equivalent to R's sample() function?

Question

I want to know if Python has an equivalent to the sample() function in R.

The sample() function takes a sample of the specified size from the elements of x, either with or without replacement.

The syntax is:

sample(x, size, replace = FALSE, prob = NULL)

(More information here)

Answer 1

I think numpy.random.choice(a, size=None, replace=True, p=None) may well be what you are looking for. Note that replace defaults to True here, whereas R's sample() defaults to replace = FALSE.

The p argument corresponds to the prob argument in the sample() function.
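To make the argument mapping concrete, here is a minimal sketch of a wrapper with R-like defaults. The name r_style_sample and its seed parameter are hypothetical, not part of NumPy, and it uses the newer numpy.random.default_rng generator rather than the legacy numpy.random.choice function named above. It also assumes you want R's behaviour of accepting weights that do not sum to 1, so it normalizes them before passing them to NumPy, which requires probabilities that sum to 1.

```python
import numpy as np


def r_style_sample(x, size=None, replace=False, prob=None, seed=None):
    """Hypothetical helper mirroring R's sample(x, size, replace, prob)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x)
    if size is None:
        size = len(x)  # like R: with no size, return a permutation of x
    if prob is not None:
        prob = np.asarray(prob, dtype=float)
        prob = prob / prob.sum()  # NumPy needs p to sum to 1
    return rng.choice(x, size=size, replace=replace, p=prob)


# Three distinct values, drawn without replacement (the R default).
draw = r_style_sample([10, 20, 30, 40, 50], size=3, seed=0)

# Five weighted draws with replacement; the weights are unnormalized on purpose.
weighted = r_style_sample(["a", "b", "c"], size=5, replace=True,
                          prob=[7, 2, 1], seed=0)
print(draw, weighted)
```

The results are random, so no particular output is shown; passing a seed only makes a run repeatable.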

Answer 2

In pandas (Python's closest analogue to R) there are the DataFrame.sample and Series.sample methods, which were both introduced in version 0.16.1.

For example:

>>> import pandas as pd
>>> df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})
>>> df
   a  b
0  1  6
1  2  7
2  3  8
3  4  9
4  5  0

Sampling 3 rows without replacement:

>>> df.sample(3)
   a  b
4  5  0
1  2  7
3  4  9

Sampling 4 rows from column 'a' with replacement, using column 'b' as the corresponding weights for the choices:

>>> df['a'].sample(4, replace=True, weights=df['b'])
3    4
0    1
0    1
2    3

These methods are almost identical to the R function, allowing you to sample a particular number of values, or a fraction of values, from your DataFrame/Series, with or without replacement. Note that the prob argument in R's sample() corresponds to weights in the pandas methods.
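As a rough sketch of the fraction and weights options mentioned above, the snippet below reuses the same df. The random_state values are arbitrary and are there only to make a run repeatable; the variable names subset and weighted are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3, 4, 5], 'b': [6, 7, 8, 9, 0]})

# frac=0.4 of 5 rows gives 2 rows, drawn without replacement.
subset = df.sample(frac=0.4, random_state=0)

# Weights do not have to sum to 1; pandas normalizes them.
# The last row has weight 0, so the value 5 can never be drawn.
weighted = df['a'].sample(4, replace=True, weights=df['b'], random_state=0)

print(subset)
print(weighted)
```

The n and frac arguments are mutually exclusive, so pass one or the other.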

Answer 3

I believe that the random module works. Specifically random.sample(), which samples without replacement; random.choices() covers sampling with replacement and with weights.

here
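A short sketch of how the standard library splits the job that R's sample() does in one call. The population and weights here are made-up illustration values, and a seeded random.Random instance is used only so that a run is repeatable.

```python
import random

rng = random.Random(42)  # seeded instance, independent of the global state
population = ['a', 'b', 'c', 'd', 'e']

# Like sample(x, 3): three distinct elements, no replacement.
without_replacement = rng.sample(population, k=3)

# Like sample(x, 8, replace = TRUE): elements may repeat.
with_replacement = rng.choices(population, k=8)

# Like sample(x, 8, replace = TRUE, prob = ...): relative weights,
# which do not need to sum to 1. 'e' has weight 0 and is never chosen.
weighted = rng.choices(population, weights=[5, 1, 1, 1, 0], k=8)

print(without_replacement, with_replacement, weighted)
```

There is no weighted sampling without replacement in the random module, so that combination still needs NumPy or pandas.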

Answer 4

Other answers here are great, but I'd like to mention an alternative from Scikit-Learn that we can also use for this; see this link.

Something like this:

import numpy as np
from sklearn.utils import resample

resample(np.arange(1, 100), n_samples=100, replace=True, random_state=2)

Gives you this:

[41 16 73 23 44 83 76  8 35 50 96 76 86 48 64 32 91 21 38 40 68  5 43 52
 39 34 59 68 70 89 69 47 71 96 84 32 67 81 53 77 51  5 91 64 80 50 40 47
  9 51 16  9 18 23 74 58 91 63 84 97 44 33 27  9 77 11 41 35 61 10 71 87
 71 20 57 83  2 69 41 82 62 71 98 19 85 91 88 23 44 53 75 73 91 92 97 17
 56 22 44 94]

4 answers

Answer 1
35 (accepted) 2015-12-03 22:15:13

Answer 2
9 2015-12-03 22:15:58

Answer 3
0 2015-12-03 22:17:10

Answer 4
0 2021-01-27 01:47:11
