
Generate a dataset with only 6 decimal numbers

Is there a way to generate a (random) dataset filled with values that have 6 decimals and 1 digit before the decimal separator?

For example, like this:

 "A":[5.398811, 2.232098, 9.340909, 3.343434],
 "B":[6.436293,5.293756, 1.235937, 1.987384],
 "C": [3.572831, 3.826355, 3.827264, 3.257321]

I found that round(random.uniform(33.33, 66.66), 2) returns a random float with up to 2 decimal places. However, I don't want a dataframe filled with values of "up to" 2 decimal places, but one filled with values of exactly 6 decimal places. I would like to have about 1000 rows and 100 columns.

EDIT: It would also be nice not to have any 0's or 9's in any of the decimals, because I am looking into rounding decimals. When rounding 1.999999 to 5 decimals, one will get 2.00000, which is 2, and that will not give a reliable rounding result. I don't know to what extent that's actually feasible.
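
For illustration, the rounding effect described in the edit can be reproduced in plain Python. This is only a minimal sketch of the concern; the variable names are made up for the example:

```python
value = 1.999999
rounded = round(value, 5)   # the trailing 9s carry all the way up

print(rounded)              # 2.0
print(f"{rounded:.5f}")     # 2.00000
```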

You can use numpy.random.uniform for efficiency, then convert to a dictionary:

import numpy as np
np.random.seed(0)  # for reproducibility
col, row = (10, 20)  # (100, 1000) in your case
out = dict(enumerate(np.random.uniform(0, 10, size=col*row)
                       .round(6).reshape(row, col).tolist()))

print(out)

Output:

{0: [5.488135, 7.151894, 6.027634, 5.448832, 4.236548, 6.458941, 4.375872, 8.91773, 9.636628, 3.834415],
 1: [7.91725, 5.288949, 5.680446, 9.255966, 0.710361, 0.871293, 0.202184, 8.326198, 7.781568, 8.700121],
 2: [9.786183, 7.991586, 4.614794, 7.805292, 1.182744, 6.39921, 1.433533, 9.446689, 5.218483, 4.146619],
...
 19: [3.982211, 2.098437, 1.86193, 9.443724, 7.395508, 4.904588, 2.274146, 2.543565, 0.580292, 4.344166],
}

NB. Note that the numbers will have UP TO 6 decimal digits (e.g., 0.123400 will be shown as 0.1234); forcing exactly 6 digits would create a non-random bias.

Pure Python version (less efficient):
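
If the concern is only how the values are displayed, the trailing zeros mentioned in the note can be shown with string formatting while leaving the numbers themselves untouched. A minimal sketch (the sample value is made up for illustration):

```python
x = round(0.1234004, 6)

print(x)             # 0.1234   (the float itself carries no trailing zeros)
shown = f"{x:.6f}"   # format with exactly 6 decimals for display
print(shown)         # 0.123400
```

This changes only the text representation, so it does not introduce the bias that altering the digits would.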

import random
out = {i: [round(random.uniform(0, 10), 6) for j in range(100)]
       for i in range(1000)}
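
Both versions above produce a dictionary keyed by row index. If a mapping of column labels to lists is preferred, as in the question's example, one possible sketch goes through a DataFrame. The labels "c0", "c1", ... and the seed are placeholders chosen for this example, not part of the original answer:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seed chosen only for reproducibility
rows, cols = 1000, 100

# uniform values in [0, 10), rounded to at most 6 decimals
values = rng.uniform(0, 10, size=(rows, cols)).round(6)

# "c0", "c1", ... are placeholder column labels
df = pd.DataFrame(values, columns=[f"c{i}" for i in range(cols)])

# one list of 1000 values per column label
out = df.to_dict(orient="list")
```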

Exactly 6 digits

You can check whether the rounded number has a zero in the 6th decimal place and, in that case, add a small arbitrary amount. Here is an example, starting with the initial dataset:

np.random.seed(0) # for reproducibility
a = np.random.uniform(0, 10, size=20).round(6)

array([5.488135, 7.151894, 6.027634, 5.448832, 4.236548, 6.458941,
       4.375872, 8.91773 , 9.636628, 3.834415, 7.91725 , 5.288949,
       5.680446, 9.255966, 0.710361, 0.871293, 0.202184, 8.326198,
       7.781568, 8.700121])

With correction:

np.random.seed(0) # for reproducibility
a = np.random.uniform(0, 10, size=20).round(6)
# identify numbers whose 6th decimal is 0
# (np.rint avoids truncation errors when casting the scaled floats to int)
mask = np.rint(a*1e6).astype(int) % 10 == 0
# add a terminal 1
a[mask] += 1e-6
a

array([5.488135, 7.151894, 6.027634, 5.448832, 4.236548, 6.458941,
       4.375872, 8.917731, 9.636628, 3.834415, 7.917251, 5.288949,
       5.680446, 9.255966, 0.710361, 0.871293, 0.202184, 8.326198,
       7.781568, 8.700121])

This works by multiplying by 1e6, converting to integers, and taking the remainder of the division by 10 (shown here for the initial dataset, before the correction):

np.rint(a*1e6).astype(int) % 10

array([5, 4, 4, 2, 8, 1, 2, 0, 8, 5, 0, 9, 6, 6, 1, 3, 4, 8, 8, 1])

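Example with DataFrame

import numpy as np
import pandas as pd
np.random.seed(0)  # for reproducibility
col, row = (4, 5)  # (100, 1000) in your case
a = np.random.uniform(0, 10, size=col*row).round(6).reshape(row, col)
# identify numbers whose 6th decimal is 9 or 0
mask = (np.rint(a*1e6).astype(int) + 1) % 10 < 2
# shift them by 2e-6: a terminal 9 becomes 1 (with a carry), a terminal 0 becomes 2
a[mask] += 2e-6

df = pd.DataFrame(a)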

print(df)

Output:

          0         1         2         3
0  5.488135  7.151894  6.027634  5.448832
1  4.236548  6.458941  4.375872  8.917732
2  9.636628  3.834415  7.917252  5.288951
3  5.680446  9.255966  0.710361  0.871293
4  0.202184  8.326198  7.781568  8.700121
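
To confirm that no value still ends in 0 or 9, one option is to format each number with 6 decimals and inspect the last character. This is only a sketch of such a check; it rebuilds the same small array so that it can run on its own, and the names `texts` and `bad` are made up for the example:

```python
import numpy as np
import pandas as pd

np.random.seed(0)
a = np.random.uniform(0, 10, size=(5, 4)).round(6)

# same correction as above: shift values whose 6th decimal is 0 or 9
last = np.rint(a * 1e6).astype(int) % 10
a[(last == 0) | (last == 9)] += 2e-6
df = pd.DataFrame(a)

# format every value with exactly 6 decimals and count bad endings
texts = [f"{v:.6f}" for v in df.to_numpy().ravel()]
bad = sum(t[-1] in "09" for t in texts)
print(bad)
```

Note that this only checks the 6th decimal; the other decimal places can still contain 0 or 9.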

Maybe try generating integers from 1,000,000 to 9,999,999 and then dividing by 1,000,000. This ensures that the numbers always have one digit before the decimal separator and at most 6 decimals (exactly 6 whenever the last digit is not 0).

Regarding the second condition, you can run a check on the number by casting it to a string, something like:

import random
result = []
while len(result) < 10:  # 10 is an arbitrary sample size
    the_number = random.randint(1_000_000, 9_999_999) / 1_000_000
    if '9' in str(the_number):
        continue  # skip numbers containing a 9
    result.append(the_number)
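
As an alternative to rejecting numbers after the fact, the digits could be drawn directly from 1 to 8, so that neither 0 nor 9 can appear in any position. The following is a sketch under that assumption, not the method from the answer above; the helper name `digit_dataset` and its parameters are hypothetical:

```python
import numpy as np
import pandas as pd

def digit_dataset(rows, cols, seed=None):
    """Hypothetical helper: floats with 1 integer digit and 6 decimals,
    every digit drawn from 1..8."""
    rng = np.random.default_rng(seed)
    # 7 digits per value: 1 before the separator, 6 after (high bound is exclusive)
    digits = rng.integers(1, 9, size=(rows, cols, 7))
    # place values 10^6 ... 10^0 combine the digits into one 7-digit integer
    weights = 10 ** np.arange(6, -1, -1)
    as_int = (digits * weights).sum(axis=-1)
    return pd.DataFrame(as_int / 1_000_000)

df = digit_dataset(1000, 100, seed=0)
print(df.iloc[:3, :3])
```

The resulting values are no longer uniformly distributed over the interval, since every number containing a 0 or a 9 is excluded by construction, so this is only suitable when that bias is acceptable.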
