Repeating elements in list equal number of times
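I have some code that creates data, and I then want to sample from that data.
My code first creates a series of vectors whose spacings are drawn from an exponential distribution (z_exponential_layers), which then become new2.
I then take each new2 vector and check how many times dz fits into the interval between each pair of consecutive elements of new2.
For example, if new2 = [z1,z2,z3,...,zn], the second part of the code works out how many times dz fits into each of [z2-z1,z3-z2,...]. So if (z2-z1)/dz = 5 = repeats, I store vec += [np.random.normal(0,1)]*repeats in the list and then move on to the next interval.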
import numpy as np
import random
import matplotlib.pyplot as plt

z_max = 100
dz = .01
#Intensity of process
lam = 0.1
#Number of rays
rays = 10
k = int(z_max/dz)
print('k =',k)

#List to store ray z coordinate data
exponential_procs_lists = []
for ray_n in range(2,rays):
    process_length = 10000
    #Compute data for z layers coordinate
    z_exponential_layers = np.cumsum(np.random.exponential(lam,size = process_length))
    #Cutoff values that lie outside z_max
    cutoff = [x for x in z_exponential_layers if x < z_max]
    #Append 0 at the start of exponential z vector
    if min(cutoff) == 0:
        new1 = np.insert(cutoff,0,0)
    else:
        new1 = cutoff
    #Append z_max at end of exponential z vector
    new2 = np.insert(new1,len(new1),z_max)
    #Append exponential ray z vector to list
    exponential_procs_lists.append(new2)

#Create list that will store random numbers data for each ray
big_list = []
#Loop over every ray, check how many dz lie within each layer and assign random variable (k total times)
for list_n in exponential_procs_lists:
    #Create empty list to store random data for each ray
    vec = []
    #sum_repeats checks that there are k elements in each vector, since k = int(z_max/dz)
    sum_repeats = 0
    #Calculate the intervals between the layer coordinates
    list_n_diff = np.diff(list_n)
    for item in list_n_diff:
        #Calculate how many dz fit inside each interval
        repeats = int(item/dz)
        #Repeat random variable 'repeats' times. This ensures that if we sample x times and each
        #time we are in the same interval, that random variable is repeated
        vec += [np.random.normal(0,1)]*repeats
        #Update sum_repeats to check that there are k elements in the vector
        sum_repeats += repeats
    #Print to check sum_repeats equals k in each running of the whole calculation (we are in the first loop here)
    print('sum repeats =',sum_repeats)
    print('mean interval size =',np.mean(np.diff(new2)))
    #Append m(z) data to the main list, and repeat for each ray
    #Big list is a list of lists, so we must now transform it into a matrix form (np.array)
    big_list.append(vec)
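The problem is that when I run this code, the length of each vec containing the random variables is not equal to k, and it changes on every run. For example, one run gives: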
k = 10000
sum repeats = 9507
mean interval size = 0.0992551287849846
sum repeats = 9493
mean interval size = 0.0992551287849846
sum repeats = 9500
mean interval size = 0.0992551287849846
sum repeats = 9500
mean interval size = 0.0992551287849846
sum repeats = 9479
mean interval size = 0.0992551287849846
sum repeats = 9508
mean interval size = 0.0992551287849846
sum repeats = 9509
mean interval size = 0.0992551287849846
sum repeats = 9485
mean interval size = 0.0992551287849846
How can I make sure that the number of random elements in each vector equals k?
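For simplicity, assume dz = 1:
You are generating a list of random numbers (z_exponential_layers) that are always positive and each greater than the previous one (because of cumsum).
You then cut off (cutoff) every number above the upper bound (z_max).
The numbers in cutoff therefore lie in the range (0, z_max). The bounds are open because every exponential sample is greater than 0 and because the condition x < z_max is strict.
With that in mind, cutoff is [z_0, z_1, ..., z_n], where z_0 > 0, z_n < z_max, and z_(i+1) > z_i.
By using np.diff, you produce the vector of differences, each in the range (0, z_max), and the sum of all these values equals (z_n - z_0), which is necessarily less than z_max.
Add the fact that you are truncating the differences (by using int(item/dz)), and the inequality gets wider:
repeats = z_n - z_0 - round_down_loss < z_max.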
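The two sources of loss in the argument above can be checked on a toy case. This is a minimal sketch with made-up boundary values (chosen to be exactly representable in floating point) and a hypothetical helper `count_repeats`; none of these names come from the posted code. Note that the posted code also appends z_max to new2, so there the sum of differences is z_max - z_0, and what is lost is z_0 plus the truncation.

```python
import numpy as np

def count_repeats(boundaries, dz):
    """Total samples produced when each interval is truncated to whole dz steps."""
    return sum(int(item / dz) for item in np.diff(boundaries))

dz = 0.25
z_max = 2.0
k = int(z_max / dz)  # target number of samples

# Hypothetical layer boundaries ending at z_max, with and without the leading 0
with_zero = np.array([0.0, 0.625, 1.125, 2.0])
without_zero = with_zero[1:]

total_with_zero = count_repeats(with_zero, dz)        # only truncation loss
total_without_zero = count_repeats(without_zero, dz)  # truncation loss plus missing z_0
print(k, total_with_zero, total_without_zero)  # 8 7 5
```

Even with the leading 0 in place, the truncation alone drops one sample here. In the question's run there are roughly 1000 intervals per ray, so losing about half a dz per interval would be consistent with totals near 9500.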
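So, to get repeats = z_max, you first need to make z_0 = 0 (that is, insert 0 if min(cutoff) > 0). Your result will then be repeats = z_n - z_0 - round_down_loss = z_n - round_down_loss < z_max.
If you also remove the rounding, you get:
repeats = z_n, which is still less than z_max.
Only if your randomly generated increments (from np.random.exponential) become infinitesimally small do you reach it asymptotically:
repeats -> z_max
With this in mind, I propose the following corrections: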
if min(cutoff) > 0:  # instead of == 0
    new1 = np.insert(cutoff,0,0)
else:
    new1 = cutoff
...
for item in list_n_diff:
    #Calculate how many dz fit inside each interval
    repeats = item/dz  # remove int() because it truncates the values
    # note: repeats is now a float, so it cannot be used as-is in [x]*repeats
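Since list repetition needs an integer count, one possible way to get exactly k elements is to round the boundaries themselves onto the dz grid and take differences of the resulting indices, so the counts telescope to k. This is a sketch of a swapped-in technique, not the answer's method: the helper `sample_ray`, the use of `np.random.default_rng`, and `np.repeat` in place of list multiplication are all assumptions made for illustration.

```python
import numpy as np

def sample_ray(boundaries, dz, rng):
    """Piecewise-constant random values on a grid of step dz.

    Assumes boundaries is increasing, starts at 0 and ends at z_max.
    """
    # Snap every boundary to its nearest grid index; neighbouring indices give the counts
    grid_index = np.round(np.asarray(boundaries) / dz).astype(int)
    repeats = np.diff(grid_index)  # non-negative ints that sum to grid_index[-1] - grid_index[0]
    values = rng.normal(0.0, 1.0, size=len(repeats))  # one draw per layer
    return np.repeat(values, repeats)

rng = np.random.default_rng(0)
z_max, dz, lam = 100, 0.01, 0.1
k = int(round(z_max / dz))

layers = np.cumsum(rng.exponential(lam, size=10000))
layers = layers[layers < z_max]
# Always prepend 0 and append z_max so the grid spans the whole range
boundaries = np.concatenate(([0.0], layers, [z_max]))

vec = sample_ray(boundaries, dz, rng)
print(len(vec), k)
```

The trade-off is that a layer thinner than dz can receive zero samples, and each boundary is shifted by up to half a dz. Whether that is acceptable depends on how the sampled vectors are used afterwards.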