NumPy array with a different standard deviation and mean per row

Question

I have a pandas DataFrame with two columns. They represent the mean and the standard deviation.

How can I perform vectorized sampling? I want to sample 1 observation per row.

import numpy as np
import pandas as pd

rng = np.random.RandomState(0)

#n_points = 4_000_000
n_points = 10
d_dimensions = 2

X = rng.random_sample((n_points, d_dimensions))

df = pd.DataFrame(X)
display(df.head())

# Naive version: one np.random.normal call per row
df['randomized'] = df.apply(lambda x: np.random.normal(x[0], x[1], 1), axis=1)
df.head()

It is very slow when the number of records increases.

Numpy array with different standard deviation per row

np.random.seed(444)
arr = np.random.normal(loc=0., scale=[1., 2., 3.], size=(1000, 3)).T
print(arr.mean(axis=1))
# [-0.06678394 -0.12606733 -0.04992722]
print(arr.std(axis=1))
# [0.99080274 2.03563299 3.01426507]

shows how to perform vectorized sampling with equal means. How can this be changed to support different means, just like my naive version using apply, but faster?

A:

np.random.normal(df[0], df[1], 1)

only returns a single scalar value, even though multiple means/standard deviations are specified.

Answer 1

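df['randomized'] = np.random.normal(df[0], df[1])

The important thing is not to specify the number of elements. With no size argument, the output shape follows from broadcasting the means and standard deviations, so you get one sample per row.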
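As a minimal sketch of this approach, the snippet below rebuilds a small frame like the one in the question and draws one sample per row without passing a size. The names `samples` and the column label `randomized` are illustrative choices, and the seeded `RandomState` is only there to make the run repeatable.

```python
import numpy as np
import pandas as pd

rng = np.random.RandomState(0)

# Column 0 is treated as the mean, column 1 as the standard deviation,
# mirroring the frame built in the question.
df = pd.DataFrame(rng.random_sample((10, 2)))

# No size argument: the result has the broadcast shape of loc and scale,
# which here is one value per row.
samples = rng.normal(loc=df[0].to_numpy(), scale=df[1].to_numpy())

df["randomized"] = samples
print(df.head())
```

Because the whole column is drawn in a single call, there is no per-row Python overhead as there is with `apply`.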

Answer 2

How about:

np.random.normal(df[0], df[1], len(df))

You can also specify how many runs to draw per specification (say 1000):

np.random.normal(df[0], df[1], (1000, len(df)))
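A quick way to check that each column of such a result really follows its own mean and standard deviation is to compare the empirical statistics with the inputs. This is only a sketch: the arrays `means` and `stds` are made-up values rather than data from the question.

```python
import numpy as np

rng = np.random.RandomState(0)

# Hypothetical per-row parameters, chosen only for illustration.
means = np.array([0.0, 5.0, -3.0])
stds = np.array([1.0, 2.0, 0.5])

n_draws = 1000

# The last axis of size must match the length of means/stds so that
# loc and scale broadcast across the 1000 repetitions.
draws = rng.normal(means, stds, size=(n_draws, len(means)))

col_means = draws.mean(axis=0)  # one empirical mean per parameter pair
col_stds = draws.std(axis=0)    # one empirical std per parameter pair

print(draws.shape)
print(col_means)
print(col_stds)
```

Each column of `draws` holds the repeated samples for one (mean, standard deviation) pair, so statistics are taken along axis 0.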

2 answers

Answer 1
1 accepted 2020-02-05 21:43:14

Answer 2
1 2020-02-05 21:43:17
