Python: Sampling from a discrete distribution defined in an n-dimensional array

Question

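Is there a function in Python that samples from an n-dimensional numpy array and returns the indices of each draw? If not, how would I go about defining such a function?

For example: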

>>> probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
>>> print(function(probabilities, draws=10))
 ([1,1],[0,2],[1,1],[1,0],[0,1],[0,1],[1,1],[0,0],[1,1],[0,1])

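I know this problem can be solved in many ways with a 1-D array. However, I will be dealing with large n-dimensional arrays and cannot afford to reshape them just to do a single draw.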

Answer 1

You can use np.unravel_index:

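import numpy as np

a = np.random.rand(3, 4, 5)
a /= a.sum()  # normalize so the entries sum to 1

def sample(a, n=1):
    a = np.asarray(a)
    # draw n flat indices, weighted by the flattened probabilities
    index = np.random.choice(a.size, size=n, p=a.ravel())
    # convert the flat indices back to one index array per dimension
    return np.unravel_index(index, a.shape)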

>>> sample(a, 4)
(array([2, 2, 0, 2]), array([0, 1, 3, 2]), array([2, 4, 2, 1]))

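This returns a tuple of arrays, one array per dimension of a, each with a length equal to the number of samples requested. If you would rather have a single array of shape (samples, dimensions), change the return statement to: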

return np.column_stack(np.unravel_index(index, a.shape))

Now:

>>> sample(a, 4)
array([[2, 0, 0],
       [2, 2, 4],
       [2, 0, 0],
       [1, 0, 4]])
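
One way to gain confidence in a sampler like this is to draw many samples and compare how often each cell comes up with its probability. The following is a minimal sketch of that check, not part of the original answer: the helper name `sample_indices`, the use of `np.random.default_rng`, the fixed seed, and the number of draws are all assumptions made for illustration.

```python
import numpy as np

def sample_indices(p, n=1, rng=None):
    """Draw n index tuples from the discrete distribution stored in p.

    Hypothetical helper; returns an integer array of shape (n, p.ndim).
    """
    p = np.asarray(p, dtype=float)
    rng = np.random.default_rng() if rng is None else rng
    # ravel() returns a view for a contiguous array, so flattening does not copy p
    flat = rng.choice(p.size, size=n, p=p.ravel())
    return np.column_stack(np.unravel_index(flat, p.shape))

rng = np.random.default_rng(0)  # fixed seed only to make the check repeatable
p = rng.random((3, 4, 5))
p /= p.sum()

draws = sample_indices(p, n=200_000, rng=rng)

# Count how often each cell was drawn and compare with its probability
counts = np.zeros(p.shape)
np.add.at(counts, tuple(draws.T), 1)
max_abs_error = np.abs(counts / len(draws) - p).max()
print(draws.shape, max_abs_error)
```

With this many draws the largest deviation should be small compared with the cell probabilities, which are about 1/60 on average here.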

Answer 2

If your array is contiguous in memory, you can change the array's shape in place:

probabilities = np.array([[.1, .2, .1], [.05, .5, .05]])
nrow, ncol = probabilities.shape
idx = np.arange( nrow * ncol ) # create 1D index

probabilities.shape = ( 6, ) # this is OK because your array is contiguous in memory

samples = np.random.choice( idx, 10, p=probabilities ) # sample in 1D
rowIndex = samples // ncol # convert to 2D: each row holds ncol entries
colIndex = samples % ncol

array([1, 0, 0, 0, 1, 1, 1, 1, 1, 0])
array([1, 1, 2, 0, 1, 1, 1, 1, 1, 1])
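
For a row-major (C-ordered) 2D array, the flat-to-2D conversion is a floor division and a remainder by the number of columns. The sketch below, which is an illustration rather than part of the original answer, checks that arithmetic against np.unravel_index and np.ravel_multi_index for the 2 x 3 shape used above.

```python
import numpy as np

nrow, ncol = 2, 3
flat = np.arange(nrow * ncol)          # every flat index 0..5

# Row-major (C order) layout: a row holds ncol consecutive elements
rows = flat // ncol
cols = flat % ncol

# np.unravel_index performs the same conversion for any number of dimensions
ref_rows, ref_cols = np.unravel_index(flat, (nrow, ncol))
assert (rows == ref_rows).all() and (cols == ref_cols).all()

# np.ravel_multi_index is the inverse mapping back to flat indices
assert (np.ravel_multi_index((rows, cols), (nrow, ncol)) == flat).all()
print(rows, cols)  # prints [0 0 0 1 1 1] [0 1 2 0 1 2]
```

The manual arithmetic only covers two dimensions; with more dimensions, np.unravel_index avoids writing the nested divisions by hand.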

Note that because the array is contiguous in memory, reshape also returns a view:

In [53]:

view = probabilities.reshape( 6, -1 )
view[ 0 ] = 9
probabilities[ 0, 0 ]
Out[53]:
9.0
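
Whether flattening is free depends on the memory layout, so it can be worth checking before relying on it for a large array. A minimal sketch, assuming a small example array (the names `a` and `t` are only for illustration), using the array's flags and np.shares_memory:

```python
import numpy as np

a = np.arange(12, dtype=float).reshape(3, 4)
print(a.flags['C_CONTIGUOUS'])            # True
print(np.shares_memory(a, a.ravel()))     # True: ravel() is a view here

t = a.T                                   # transposed view, not C-contiguous
print(t.flags['C_CONTIGUOUS'])            # False
print(np.shares_memory(t, t.ravel()))     # False: ravel() had to copy
```

For a non-contiguous array such as a transpose or a strided slice, flattening makes a copy, so the memory cost applies in that case.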

Answer 1
3, accepted, 2014-07-05 00:40:33

Answer 2
2, 2014-07-05 00:11:05
