
How best to compare Java sets of integers?

While learning Java I'm using the lottery, like many others, to sharpen my new skills. As an exercise I want to create all 13.9M six-ball combinations for a given random seed. I've managed to generate the lines for a given seed, but I'm not currently checking that they're unique, so I'm getting duplicates.

What I'd like advice on is which approach to take to check each generated line against the previously generated lines. I'm currently using a set to hold the 6 numbers in each line, and was wondering whether I should be comparing sets, or whether I should use a list or something else.

All advice appreciated :-)

You could use the containsAll() method, although note that it only tests whether one set contains every element of the other, not whether the two sets are equal. There are also ways of generating unique combinations directly rather than comparing everything. Please share some code.

Set<Integer> nums = new HashSet<Integer>(Arrays.asList(1, 2, 3, 4, 5));
Set<Integer> nums2 = new HashSet<Integer>(Arrays.asList(1, 2, 6, 4, 5));
System.out.println(nums.containsAll(nums2)); // prints false: 6 is not in nums
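The remark about generating unique combinations directly can be sketched as follows. This is a minimal illustration, not code from the answer: it assumes the balls are numbered 1..n and walks through every k-ball line in lexicographic order, so no line can appear twice and no comparison against earlier lines is needed. The class and method names are made up for the example.

```java
public class CombinationEnumerator {

    /** Counts the k-combinations of 1..n by visiting each one in lexicographic order. */
    static long countCombinations(int n, int k) {
        int[] line = new int[k];
        for (int i = 0; i < k; i++) {
            line[i] = i + 1;            // first line: 1, 2, ..., k
        }
        long count = 0;
        while (true) {
            count++;                    // 'line' holds a combination not visited before
            // Find the rightmost position that can still be incremented.
            int i = k - 1;
            while (i >= 0 && line[i] == n - k + i + 1) {
                i--;
            }
            if (i < 0) {
                return count;           // the last line (n-k+1 .. n) has been visited
            }
            line[i]++;
            // Reset everything to the right to the smallest valid values.
            for (int j = i + 1; j < k; j++) {
                line[j] = line[j - 1] + 1;
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(countCombinations(5, 3));   // 10
        System.out.println(countCombinations(49, 6));  // 13983816
    }
}
```

Replacing the `count++` line with whatever processing each line needs (printing, storing, scoring) gives every combination exactly once.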

With that many lines, I'd suggest adding each line to a Set (such as a HashSet). If you have your own class that represents a line, you will have to override the equals and hashCode methods in that class. Once you've done that, after each addition of a line to the Set you can check with Set.size() whether the set has grown. If it hasn't, the line was a duplicate. A hash-based lookup takes roughly constant time on average, so checking for duplicates this way is far cheaper than comparing each new line against every previous one.
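As a rough sketch of that suggestion, the class below overrides equals and hashCode so that a HashSet can detect repeated lines. `LotteryLine` and `addIfNew` are hypothetical names chosen for this example, and storing the balls sorted is an assumption made so that the drawing order does not affect equality.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

public class LotteryLineDemo {

    /** Hypothetical value class for one line; the numbers are stored sorted. */
    static final class LotteryLine {
        private final int[] balls;

        LotteryLine(int... balls) {
            this.balls = balls.clone();
            Arrays.sort(this.balls);    // drawing order must not affect equality
        }

        @Override
        public boolean equals(Object other) {
            return other instanceof LotteryLine
                    && Arrays.equals(balls, ((LotteryLine) other).balls);
        }

        @Override
        public int hashCode() {
            return Arrays.hashCode(balls);
        }
    }

    /** Adds the line and reports whether the set grew, i.e. the line was new. */
    static boolean addIfNew(Set<LotteryLine> seen, LotteryLine line) {
        int before = seen.size();
        seen.add(line);
        return seen.size() > before;
    }

    public static void main(String[] args) {
        Set<LotteryLine> seen = new HashSet<>();
        System.out.println(addIfNew(seen, new LotteryLine(3, 11, 24, 30, 41, 49))); // true
        System.out.println(addIfNew(seen, new LotteryLine(49, 41, 30, 24, 11, 3))); // false
    }
}
```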

(Reading up a bit on it: in a lottery, 6 numbers are chosen between 1 and 49, and a number cannot be repeated.)

If you wish to make sure a combination is unique by checking that it has not been used before, you can do something like:

Set<Set<Integer>> previousCombinations = new HashSet<Set<Integer>>();
...
// Note: 14 is listed twice below, so this particular set ends up holding only 5 numbers.
Set<Integer> newCombination = new HashSet<Integer>(Arrays.asList(22, 10, 1, 14, 45, 14));
if (!previousCombinations.add(newCombination)) {
    // it's a duplicate
} else {
    // it's not a duplicate
}
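Put together with a seeded generator, the same idea might look like the sketch below. It assumes java.util.Random as the source of numbers and uses made-up names (`drawLine`, `generate`); the inner set rejects repeated balls and the outer set rejects repeated lines.

```java
import java.util.LinkedHashSet;
import java.util.Random;
import java.util.Set;
import java.util.TreeSet;

public class UniqueLineGenerator {

    /** Draws 6 distinct numbers in 1..49; the set silently ignores repeated balls. */
    static Set<Integer> drawLine(Random random) {
        Set<Integer> line = new TreeSet<>();
        while (line.size() < 6) {
            line.add(1 + random.nextInt(49));
        }
        return line;
    }

    /** Keeps drawing until 'count' distinct lines have been collected. */
    static Set<Set<Integer>> generate(long seed, int count) {
        Random random = new Random(seed);
        Set<Set<Integer>> lines = new LinkedHashSet<>();
        while (lines.size() < count) {
            lines.add(drawLine(random));   // add() returns false for a duplicate line
        }
        return lines;
    }

    public static void main(String[] args) {
        Set<Set<Integer>> lines = generate(42L, 1000);
        System.out.println(lines.size());  // 1000
    }
}
```

Because duplicates become more and more likely as the set fills up, this retry loop slows down sharply when asked for close to all 13,983,816 lines.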

However, another solution, which does not require checking previously generated combinations and which you might find interesting, is as follows:

This will generate random combinations of lottery numbers that are guaranteed to be unique until all possible combinations are exhausted, without having to check all previously generated combinations. Note that the lotto numbers are returned sorted in ascending order, so to make them more realistic you can shuffle them.

This works by using:

1) A custom random number generator that returns every number in a custom range exactly once, in a random order, if you keep calling its getNextValue() method; once all numbers have been returned, the sequence loops.

2) An algorithm that, given a number between 0 and the total number of combinations minus one, returns the actual combination that corresponds to that number. For example, in our case 0 corresponds to [0, 1, 2, 3, 4, 5] and 13983815 corresponds to [43, 44, 45, 46, 47, 48].
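The first ingredient relies on the Hull-Dobell conditions for a linear congruential generator x -> (a*x + c) mod m: the sequence visits every value in 0..m-1 exactly once per cycle when c is coprime to m, a-1 is divisible by every prime factor of m, and a-1 is divisible by 4 whenever m is. The toy example below, with a deliberately tiny modulus and made-up names, illustrates this; it is a sketch of the principle, not the generator used in the answer.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;

public class FullPeriodLcgDemo {

    /** Returns the first m values of x -> (a*x + c) mod m, starting from 'seed'. */
    static List<Integer> cycle(long a, long c, long m, int seed) {
        List<Integer> values = new ArrayList<>();
        long x = seed;
        for (int i = 0; i < m; i++) {
            x = (a * x + c) % m;
            values.add((int) x);
        }
        return values;
    }

    public static void main(String[] args) {
        // m = 16, c = 3 (coprime to 16), a - 1 = 4 (divisible by 2 and by 4):
        // all conditions hold, so the 16 values are all different.
        List<Integer> good = cycle(5, 3, 16, 0);
        System.out.println(good);
        System.out.println(new HashSet<>(good).size());  // 16

        // a - 1 = 2 is not divisible by 4, so the cycle is shorter than 16.
        List<Integer> bad = cycle(3, 3, 16, 0);
        System.out.println(new HashSet<>(bad).size());   // 8
    }
}
```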

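  // Note: these methods belong inside a class and need
  // java.util.ArrayList, Arrays, HashSet, List and Set to be imported.
  public static void main(String[] args) throws Throwable {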
    for (int i = 0; i < 1000000; i++) {
      System.out.println(Arrays.toString(getRandomLottoNumbers()));
    }
  }

  private static final RandomFunction lottoFunction = new RandomFunction(choose(49,
      6));

  public static int[] getRandomLottoNumbers() {
    int[] combination = mthCombination(lottoFunction.getNextValue(), 49, 6);
    // mthCombination returns values in 0..48; shift them to 1..49
    for (int i = 0; i < combination.length; i++) {
        combination[i]++;
    }
    return combination;
  }

  // Based on http://en.wikipedia.org/wiki/Linear_congruential_generator
  public static final class RandomFunction {

    private final long a;

    private final long c;

    private final long m;

    private int curr = 0;

    public RandomFunction(int period) {
      m = period;
      List<Integer> primes = primeFactors(period);
      Set<Integer> uniquePrimes = new HashSet<Integer>(primes);

      long aMinusOne = 1;
      if (primes.size() >= 2 && primes.get(0) == 2 && primes.get(1) == 2) {
        aMinusOne = 2;
      }

      for (Integer prime : uniquePrimes) {
        aMinusOne *= prime;
      }

      // make 'a' random
      int rand = (int) (1 + (1000 * Math.random()));

      a = (aMinusOne * rand) + 1;

      int potentialC = 0;
      while (potentialC <= 1) {
        potentialC = 2 + (int) (period * Math.random());

        for (Integer prime : uniquePrimes) {
          while (potentialC % prime == 0) {
            potentialC /= prime;
          }
        }
      }
      c = potentialC;
      curr = (int) (period * Math.random());
    }

    public int getNextValue() {
      curr = (int) ((a * curr + c) % m);
      return curr;
    }
  }

  // Based on http://www.codeguru.com/cpp/cpp/algorithms/general/article.php/c16255/Linear-Search-based-algorithm-for-Mth-Lexicographic-ordering-of-Mathematical-Permutation-and-Combina.htm
  public static int[] mthCombination(int m, int n, int k) {
    if (k == 0) {
      return new int[0];
    }
    if (k == n) {
      int[] result = new int[k];
      for (int i = result.length - 1; i >= 0; i--) {
        result[i] = --k;
      }
      return result;
    }
    int subChoose = choose(n - 1, k - 1);
    if (m < subChoose) {
      int[] subResult = mthCombination(m, n - 1, k - 1);
      int[] result = new int[subResult.length + 1];
      for (int i = 0; i < subResult.length; i++) {
        result[i + 1] = subResult[i] + 1;
      }
      return result;
    } else {
      int[] result = mthCombination(m - subChoose, n - 1, k);
      for (int i = 0; i < result.length; i++) {
        result[i]++;
      }
      return result;
    }
  }

  public static int choose(int n, int k) {
    if (k < 0 || k > n) {
      return 0;
    }
    if (k > n / 2) {
      k = n - k;
    }

    long denominator = 1;
    long numerator = 1;
    for (int i = 1; i <= k; i++) {
      denominator *= i;
      numerator *= (n + 1 - i);
    }
    return (int) (numerator / denominator);
  }

  public static List<Integer> primeFactors(int number) {
    List<Integer> primeFactors = new ArrayList<Integer>();
    if (number < 1) {    
      return primeFactors;
    }

    while (number % 2 == 0) {
      primeFactors.add(2);
      number /= 2;
    }
    // all factors of 2 are gone, so only odd candidates need to be tried
    for (int i = 3; i * i <= number; i += 2) {
      while (number % i == 0) {
        primeFactors.add(i);
        number /= i;
      }
    }
    if (number != 1) {
      primeFactors.add(number);
    }

    return primeFactors;
  }

You can simply use the equals method of the Set interface:

Set<Integer> myFirstSetOfIntegers = Set.of(1,3,17);
Set<Integer> mySecondSetOfIntegers = Set.of(1,3,17);
boolean same = myFirstSetOfIntegers.equals(mySecondSetOfIntegers); // true

Check the javadoc description of Set.equals. It does exactly what you're looking for:

* Compares the specified object with this set for equality.  Returns
* {@code true} if the specified object is also a set, the two sets
* have the same size, and every member of the specified set is
* contained in this set (or equivalently, every member of this set is
* contained in the specified set).  This definition ensures that the
* equals method works properly across different implementations of the
* set interface.
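The contract quoted above can be seen in a small example. The names and numbers are illustrative only; the point is that equals returns true for a HashSet and a TreeSet with the same members, while containsAll alone would also accept a mere subset.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

public class SetEqualsDemo {

    /** Two lines are the same when they hold exactly the same numbers. */
    static boolean sameLine(Set<Integer> a, Set<Integer> b) {
        return a.equals(b);
    }

    public static void main(String[] args) {
        Set<Integer> hashed = new HashSet<>(Arrays.asList(5, 12, 23, 31, 40, 47));
        Set<Integer> sorted = new TreeSet<>(Arrays.asList(47, 40, 31, 23, 12, 5));
        Set<Integer> subset = new HashSet<>(Arrays.asList(5, 12, 23));

        System.out.println(sameLine(hashed, sorted));     // true: same members, different implementations
        System.out.println(hashed.containsAll(subset));   // true: but this is only a subset test
        System.out.println(sameLine(hashed, subset));     // false: sizes differ
    }
}
```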

I'd also suggest looking into other data structures if you're dealing with combinatorially complex algorithms. For instance, you could represent each 6-ball combination with a BitSet. BitSet::equals should be much faster for your use case than Set::equals.
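A minimal sketch of the BitSet idea, assuming ball numbers 1..49 are used directly as bit indices (`toBitSet` is a hypothetical helper, not part of the JDK):

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

public class BitSetLineDemo {

    /** Encodes a line as a BitSet with one bit set per ball number. */
    static BitSet toBitSet(int... balls) {
        BitSet bits = new BitSet(50);   // indices 1..49 are used, index 0 stays clear
        for (int ball : balls) {
            bits.set(ball);
        }
        return bits;
    }

    public static void main(String[] args) {
        BitSet first = toBitSet(3, 11, 24, 30, 41, 49);
        BitSet second = toBitSet(49, 41, 30, 24, 11, 3);
        System.out.println(first.equals(second));   // true: same bits, order of input is irrelevant
        System.out.println(first.cardinality());    // 6

        // BitSet also implements hashCode, so it works as a HashSet element
        // as long as it is not modified after being added.
        Set<BitSet> seen = new HashSet<>();
        System.out.println(seen.add(first));        // true
        System.out.println(seen.add(second));       // false: duplicate line
    }
}
```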
