
C++ How to generate 10,000 UNIQUE random integers to store in a BST?

I am trying to generate 10,000 unique random integers in the range 1 to 20,000 to store in a BST, but I am not sure of the best way to do this.

I saw some good suggestions on how to do it with an array or a vector, but not for a BST. I have a contains method, but I don't believe it will work in this scenario, as it is used to search and report how many tries it took to find the desired number. Below is the closest I've gotten, but it doesn't like my == operator. Would it be better to use an array and just store the array in the BST? Or is there a better way to use the code below so that, while it's generating the numbers, it stores them right in the tree?

for (int i = 0; i < 10000; i++) 
{
    int random = rand() % 20000;
    tree1Ptr->add(random);
    for (int j = 0; j < i; j++) {
        if (tree1Ptr[j]==random) i--;
        }
    }

There are a couple of problems in your code, but let's go straight to the one that hurts.

What's the main problem?

From your code, it is obvious that tree1Ptr is a pointer. In principle, it should point to a node of the tree, which has two pointers, one to the left node and one to the right node.

So somewhere in your code, you should have:

tree1Ptr = new Node;   // or whatever the type of your node is called

However, in your inner loop, you are using it as if it were an array:

for (int i = 0; i < 10000; i++) 
{
    int random = rand() % 20000;
    tree1Ptr->add(random);
    for (int j = 0; j < i; j++) {
        if (tree1Ptr[j]==random)  //<============ OUCH !!
            i--;
    }
}

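The compiler won't complain about the indexing itself, because it's valid syntax: you can use array indexing on any pointer. But it's up to you to ensure that you do not go out of bounds (so here, that j remains < 1). The comparison is a different matter: tree1Ptr[j] is a tree object, not an int, so == will presumably only compile if your class defines a matching operator.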

Other remarks

By the way, the inner loop only needs to establish that you have to retry if the number is found. You can break out of the inner loop as soon as the number is found, instead of continuing to the end.

You should also seed your random number generator, so that the program does not produce the same sequence on every run.
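As a minimal sketch of the seeding advice, assuming C++11 <random> is available: the helper name draw_in_range is hypothetical and not part of the original code. It also covers a range detail, since rand() % 20000 yields 0 to 19,999 rather than 1 to 20,000. If you stay with rand(), the equivalent is a single call to std::srand(std::time(nullptr)) at program start.

```cpp
#include <cassert>
#include <random>

// Hypothetical helper: returns one integer drawn uniformly from [lo, hi].
// The engine is seeded once, from std::random_device, on first use.
int draw_in_range(int lo, int hi) {
    assert(lo <= hi);
    static std::mt19937 engine(std::random_device{}());
    std::uniform_int_distribution<int> dist(lo, hi);
    return dist(engine);
}
```

Calling draw_in_range(1, 20000) then replaces rand() % 20000 in the loop.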

How to solve it?

You really need to deepen your understanding of BSTs. Navigating the tree requires comparing against the value in the current node and, depending on the result, continuing with either the left or the right pointer, not using indexing. It would take too long to explain here, so maybe you should look for a tutorial.
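The navigation described above can be sketched roughly as follows. This is only an illustration under assumptions: Node, insert_unique, build_random_tree, count_nodes and destroy are hypothetical names, and your own tree class will look different. The point is that the insert itself reports whether the value was already present, so no separate inner loop or indexing is needed.

```cpp
#include <cassert>
#include <random>

// Hypothetical minimal node type; your own class will differ.
struct Node {
    int value;
    Node* left = nullptr;
    Node* right = nullptr;
    explicit Node(int v) : value(v) {}
};

// Walks down from the root, comparing against each node's value.
// Returns false, leaving the tree unchanged, if `value` is already present.
bool insert_unique(Node*& root, int value) {
    Node** link = &root;
    while (*link != nullptr) {
        if (value == (*link)->value) return false;  // duplicate found
        link = (value < (*link)->value) ? &(*link)->left : &(*link)->right;
    }
    *link = new Node(value);
    return true;
}

// Keeps drawing from [lo, hi] until `count` distinct values are in the tree.
Node* build_random_tree(int count, int lo, int hi, std::mt19937& engine) {
    assert(count >= 0 && count <= hi - lo + 1);  // otherwise the loop never ends
    std::uniform_int_distribution<int> dist(lo, hi);
    Node* root = nullptr;
    int inserted = 0;
    while (inserted < count) {
        if (insert_unique(root, dist(engine))) ++inserted;
    }
    return root;
}

// Counts the nodes in the tree.
int count_nodes(const Node* n) {
    return n ? 1 + count_nodes(n->left) + count_nodes(n->right) : 0;
}

// Frees every node.
void destroy(Node* n) {
    if (n) { destroy(n->left); destroy(n->right); delete n; }
}
```

With 10,000 values out of 20,000, roughly every other draw near the end is a duplicate, so the retry loop stays cheap; it would degrade if the count approached the size of the range.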

For a lot of unique 'random' numbers I usually use Format Preserving Encryption. Since encryption is one-to-one, you are guaranteed unique outputs as long as the inputs are unique. A different encryption key will generate a different set of outputs, i.e. a different permutation of the inputs. Simply encrypt 0, 1, 2, 3, 4, ... and the outputs are guaranteed unique.

You want numbers in the range [1 .. 20,000]. Unfortunately 20,000 needs 15 bits and most encryption schemes have an even number of bits: 16 bits in your case. That means you will need to cycle walk: re-encrypt the output if the number is too big, until you get a number in the desired range. Since your inputs only go up to 10,000 and you cycle walk any output above the range, you will still avoid duplicates.

The only standard cipher I know of which allows a 16 bit block size is the Hasty Pudding cipher. Alternatively it is easy enough to write your own simple Feistel cipher. Four rounds are enough if you do not want cryptographic security. For crypto level security you will need to use AES/FFX, which is NIST approved.
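To make the idea concrete, here is a rough sketch of a four-round balanced Feistel network on 16-bit blocks with cycle walking. Everything in it is an assumption for illustration: the names round_fn, feistel16, permute_below and unique_values are hypothetical, and the round function is a toy mixer with no cryptographic strength.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Toy round function: mixes an 8-bit half with a round key. Not secure.
static std::uint32_t round_fn(std::uint32_t half, std::uint32_t round_key) {
    std::uint32_t x = (half + round_key) * 0x9E3779B1u;
    x ^= x >> 15;
    return x & 0xFFu;
}

// Balanced 4-round Feistel network on 16-bit blocks (two 8-bit halves).
// For a fixed key this is a permutation of 0..65535.
std::uint32_t feistel16(std::uint32_t block, std::uint32_t key) {
    std::uint32_t left = (block >> 8) & 0xFFu;
    std::uint32_t right = block & 0xFFu;
    for (std::uint32_t round = 0; round < 4; ++round) {
        std::uint32_t next = left ^ round_fn(right, key + round);
        left = right;
        right = next;
    }
    return (left << 8) | right;
}

// Cycle walking: re-encrypt until the result falls inside [0, limit).
std::uint32_t permute_below(std::uint32_t index, std::uint32_t limit,
                            std::uint32_t key) {
    assert(index < limit && limit <= 65536u);
    std::uint32_t x = feistel16(index, key);
    while (x >= limit) x = feistel16(x, key);
    return x;
}

// Returns `count` distinct values in [1, limit] by encrypting 0, 1, 2, ...
std::vector<int> unique_values(std::uint32_t count, std::uint32_t limit,
                               std::uint32_t key) {
    assert(count <= limit);
    std::vector<int> out;
    out.reserve(count);
    for (std::uint32_t i = 0; i < count; ++i) {
        out.push_back(static_cast<int>(permute_below(i, limit, key)) + 1);
    }
    return out;
}
```

A call such as unique_values(10000, 20000, some_key) gives the 10,000 values to insert into the tree, and a different key gives a different selection. The sequence is only as random-looking as the round function allows, so treat this as a demonstration of the mechanism rather than a tuned generator.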

There are two ways to pick random unique numbers out of a sequence without checking against the numbers previously picked (i.e. those already in your BST).

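Use std::shuffle (formerly random_shuffle)

A simple way is to shuffle a sorted array of 1 ... 20,000 and simply pick the first 10,000 items:

#include <algorithm>
#include <numeric>
#include <random>
#include <vector>

int main() {
  // Fill with 1, 2, ..., 20000.
  std::vector<int> values(20000);
  std::iota(values.begin(), values.end(), 1);

  // std::random_shuffle was deprecated in C++14 and removed in C++17;
  // std::shuffle with an explicit engine replaces it.
  std::mt19937 rng(std::random_device{}());
  std::shuffle(values.begin(), values.end(), rng);

  // The first 10,000 entries are unique random picks from 1..20000.
  for (int i = 0; i < 10000; ++i) {
    // Insert values[i] into your BST
  }
}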

This method works well if the number of values to pick (10,000) is comparable to the total number of values (20,000), because the cost of shuffling is amortized over a larger result set.

Use uniform_int_distribution

If the number of values to pick is much smaller than the total number of values, an alternative can be used:

#include <chrono>
#include <random>
#include <vector>

// Use timed seed so every run produces different random picks.
std::default_random_engine reng(
    std::chrono::steady_clock::now().time_since_epoch().count());

int num_pick  = 1000;   // # of random numbers remaining to pick
int num_total = 20000;  // Total # of numbers to pick from

int cur_value = 1;  // Current prospective number to be picked
while (num_pick > 0) {
  // Probability to pick `cur_value` is num_pick / (num_total-cur_value+1)
  std::uniform_int_distribution<int> distrib(0, num_total-cur_value);

  if (distrib(reng) < num_pick) {
    bst.insert(cur_value);  // insert `cur_value` to your BST
    --num_pick;
  }
  ++cur_value;
}
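For reference, the same loop can be packaged as a standalone function that returns the picks instead of inserting them directly. This is a sketch under assumptions: the name sample_ascending is hypothetical, and std::mt19937 is swapped in for default_random_engine so the caller controls the seed.

```cpp
#include <cassert>
#include <random>
#include <vector>

// Selection sampling: visits 1..total in order and keeps each value with
// probability (picks still needed) / (values not yet visited).
// Returns exactly `count` distinct values in ascending order.
std::vector<int> sample_ascending(int count, int total, std::mt19937& engine) {
    assert(count >= 0 && count <= total);
    std::vector<int> picked;
    picked.reserve(count);
    int remaining = count;
    for (int value = 1; remaining > 0; ++value) {
        // dist yields one of (total - value + 1) equally likely outcomes.
        std::uniform_int_distribution<int> dist(0, total - value);
        if (dist(engine) < remaining) {
            picked.push_back(value);
            --remaining;
        }
    }
    return picked;
}
```

One thing to keep in mind: the values come out in ascending order, and inserting sorted values into a BST that does not rebalance itself produces a degenerate, list-like tree. If your tree is a plain BST, shuffling the returned vector before inserting avoids that.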
