numpy.random.normal different distribution: selecting values from distribution

I have a power-law distribution of energies and I want to pick n random energies based on the distribution. I tried doing this manually using random numbers, but it is too inefficient for what I want to do. I'm wondering whether there is a method in numpy (or elsewhere) that works like numpy.random.normal, except that instead of using a normal distribution, the distribution may be specified. So in my mind an example might look like this (similar to numpy.random.normal):

import numpy as np

# Energies from within which I want values drawn
eMin = 50.
eMax = 2500.

# Amount of energies to be drawn
n = 10000

photons = []

for i in range(n):

    # Method that I just made up which would work like random.normal,
    # i.e. return an energy on the distribution based on its probability,
    # but take a distribution other than a normal distribution
    photons.append(np.random.distro(eMin, eMax, lambda e: e**(-1.)))

print(photons)

Printing photons should give me a list of length 10000 populated by energies drawn from this distribution. If I were to histogram this, it would have much greater bin values at lower energies.

I am not sure if such a method exists, but it seems like it should. I hope it is clear what I want to do.

EDIT:

I have seen numpy.random.power, but my exponent is -1, so I don't think this will work.

Sampling from arbitrary PDFs well is actually quite hard. There are large and dense books just about how to efficiently and accurately sample from the standard families of distributions.

It looks like you could probably get by with a custom inversion method for the example that you gave.
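For the 1/E distribution in the question, such a custom inversion might look like the sketch below. It assumes the pdf is exactly proportional to 1/E between eMin and eMax, in which case the inverse of the cdf has the closed form E = eMin * (eMax/eMin)**u for uniform u. The function name `sample_inverse_power_law` and its parameters are made up for illustration; they are not part of numpy.

```python
import numpy as np

def sample_inverse_power_law(e_min, e_max, n, rng=None):
    """Draw n values from p(E) proportional to 1/E on [e_min, e_max].

    Hypothetical helper (not a numpy function). The cdf is
    ln(E / e_min) / ln(e_max / e_min), so its inverse is
    E = e_min * (e_max / e_min) ** u for u uniform on [0, 1).
    """
    rng = np.random.default_rng() if rng is None else rng
    u = rng.uniform(0.0, 1.0, size=n)      # uniform probabilities
    return e_min * (e_max / e_min) ** u    # push them through the inverse cdf

# Same numbers as in the question; the seed is only there for repeatability
photons = sample_inverse_power_law(50.0, 2500.0, 10000,
                                   rng=np.random.default_rng(0))
print(photons.min(), photons.max())
```

This is fully vectorised, so there is no Python-level loop over the n draws. For a different exponent the cdf and its inverse would have to be worked out again.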

If you want to sample from an arbitrary distribution, you need the inverse of the cumulative distribution function (the cdf, not the pdf).

You then sample a probability uniformly from the range [0,1] and feed it into the inverse of the cdf to get the corresponding value.
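As a worked example of this recipe, here is the derivation for the distribution in the question, assuming the pdf is exactly proportional to 1/E on the range from E_min to E_max (the symbols p, F and u are introduced here for the pdf, the cdf and the uniform draw):

```latex
\begin{aligned}
p(E) &= \frac{1}{E \,\ln(E_{\max}/E_{\min})}, \qquad E_{\min} \le E \le E_{\max} \\
F(E) &= \int_{E_{\min}}^{E} p(E')\,dE' = \frac{\ln(E/E_{\min})}{\ln(E_{\max}/E_{\min})} \\
u = F(E) \;&\Longrightarrow\; E = F^{-1}(u) = E_{\min}\left(\frac{E_{\max}}{E_{\min}}\right)^{u}, \qquad u \sim \mathrm{Uniform}(0,1)
\end{aligned}
```

With E_min = 50 and E_max = 2500, a draw of u = 0.5 maps to E = 50 * sqrt(50), roughly 353.6, which is the median of the distribution.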

It is often not possible to obtain the cdf from the pdf analytically. However, if you're happy to approximate the distribution, you can do so by calculating f(x) at regular intervals over its domain, then doing a cumsum over this vector to get an approximation of the cdf, and from this approximating the inverse.

Rough code snippet:

import matplotlib.pyplot as plt
import numpy as np
import scipy.interpolate

def f(x):
    """
    Substitute this function with your arbitrary distribution.
    It must be positive over the domain.
    """
    return 1.0 / x


# You should vary inputVals to cover the domain of f (for better accuracy you can
# be clever about the spacing of values as well). Here they are spaced
# logarithmically over the range from the question, 50 to 2500.
inputVals = np.logspace(np.log10(50.), np.log10(2500.), 10000)

# Everything else should just work
funcVals = f(inputVals)
cdf = np.zeros(len(funcVals))
# Left Riemann sum: pdf value times the width of each interval in x
cdf[1:] = np.cumsum(funcVals[:-1] * np.diff(inputVals))
cdf /= cdf[-1]

# You could also improve the approximation by choosing an appropriate interpolator
inverseCdf = scipy.interpolate.interp1d(cdf, inputVals)

# Grab 100k samples from the distribution
samples = inverseCdf(np.random.uniform(0, 1, size=100000))

plt.hist(samples, bins=500)
plt.show()
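The same idea can be wrapped into a function with roughly the call shape the question asked for. This is only a sketch: `sample_from_pdf`, its parameters and the `grid_size` default are hypothetical names chosen here, the cumulative integral uses the trapezoidal rule, and np.interp stands in for the interpolator.

```python
import numpy as np

def sample_from_pdf(pdf, x_min, x_max, n, grid_size=10000, rng=None):
    """Hypothetical helper: draw n samples from an unnormalised pdf on [x_min, x_max].

    pdf must accept a numpy array and be strictly positive on the interval.
    """
    rng = np.random.default_rng() if rng is None else rng
    x = np.linspace(x_min, x_max, grid_size)
    y = pdf(x)
    # Cumulative trapezoidal integral of the pdf, starting at 0
    cdf = np.concatenate(([0.0], np.cumsum(0.5 * (y[1:] + y[:-1]) * np.diff(x))))
    cdf /= cdf[-1]  # normalise so the cdf ends at 1
    # Interpolating x as a function of cdf evaluates the approximate inverse cdf
    u = rng.uniform(0.0, 1.0, size=n)
    return np.interp(u, cdf, x)

# Usage with the numbers from the question; the seed is only for repeatability
photons = sample_from_pdf(lambda e: e ** -1.0, 50.0, 2500.0, 10000,
                          rng=np.random.default_rng(0))
print(len(photons), photons.min(), photons.max())
```

The accuracy depends on how well the grid resolves the pdf, so a sharply peaked distribution may need a larger `grid_size` or non-uniform spacing.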

Why don't you use eval and put the distribution in a string?

>>> import numpy
>>> cmd = "numpy.random.normal(500)"
>>> eval(cmd)

You can manipulate the string as you wish to set the distribution.
