Java: memory efficient storing of arrays of integers

Premise: This problem may already be well known, and I may be using the wrong wording; please point me elsewhere if that is the case.

Quick Problem Overview: I have to store a large number of arrays of integers in order to avoid duplication. I am doing the following:

LinkedList<int[]> ArraysAlreadyUsed;

Upon using an array, I add it to the list. Before using an array, I check whether it is already in the list. Since I need to use many high-dimensional arrays, I run into memory issues.

Question: What is a good (or the best) way of doing this that minimizes the amount of memory occupied? Is there a way to represent such arrays with a hash string? And would that be better?

It may make sense to create a wrapper that implements equals and hashCode so that you can place the arrays in a Set for O(1) contains / add. Something like:

import java.util.Arrays;

public class IntArray {
  private final int[] array; // not copied: the caller must not modify it afterwards
  private final int hash;

  public IntArray(int[] array) {
    this.array = array;
    this.hash = Arrays.hashCode(this.array); // cache hashcode for better performance
  }

  @Override
  public int hashCode() {
    return hash;
  }

  @Override
  public boolean equals(Object obj) {
    if (this == obj) return true;
    if (obj == null) return false;
    if (getClass() != obj.getClass()) return false;
    final IntArray other = (IntArray) obj;
    // compare contents, not references
    return Arrays.equals(this.array, other.array);
  }
}

You can then simply use a set:

Set<IntArray> arrays = new HashSet<>();

That will create a small overhead (my guesstimate is less than 20 bytes per wrapper) but will perform much better than your LinkedList.
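As a rough sketch of how the set-based approach could be used for the "have I used this array before?" check: Set.add returns false when an equal element is already present, so the lookup and the insertion can be a single call. The names IntArrayKey, SeenArrays and markUsed are made up for this illustration, and the wrapper assumes the caller does not modify an array after handing it over.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical wrapper in the spirit of the IntArray class above.
final class IntArrayKey {
  private final int[] values; // assumption: never modified after construction
  private final int hash;

  IntArrayKey(int[] values) {
    this.values = values;
    this.hash = Arrays.hashCode(values); // computed once, reused on every lookup
  }

  @Override
  public int hashCode() {
    return hash;
  }

  @Override
  public boolean equals(Object obj) {
    return obj instanceof IntArrayKey
        && Arrays.equals(values, ((IntArrayKey) obj).values);
  }
}

class SeenArrays {
  private final Set<IntArrayKey> seen = new HashSet<>();

  // Records the array and returns true if it is new; returns false if an
  // array with the same contents was recorded before.
  boolean markUsed(int[] array) {
    return seen.add(new IntArrayKey(array));
  }

  public static void main(String[] args) {
    SeenArrays tracker = new SeenArrays();
    System.out.println(tracker.markUsed(new int[] {1, 2, 3})); // true
    System.out.println(tracker.markUsed(new int[] {1, 2, 3})); // false
    System.out.println(tracker.markUsed(new int[] {3, 2, 1})); // true
  }
}
```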

If memory is your only concern then you could go for an int[][], but that will be more painful...

If you need to check whether an element is present in a data structure, the best solution is to use a Map, so use a HashMap.

Retrieval of elements happens in O(1) on average. In a list (LinkedList or ArrayList), the search happens in O(n).
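For comparison, here is a minimal sketch of what the O(n) list search looks like for arrays. One detail worth noting: List.contains relies on equals, and Java arrays inherit the identity-based Object.equals, so contains does not find an array with the same contents; the scan has to call Arrays.equals itself. The class and method names below are illustrative only.

```java
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

class ListLookup {
  // O(n) scan that compares the contents of each stored array with the candidate.
  static boolean containsArray(List<int[]> list, int[] candidate) {
    for (int[] stored : list) {
      if (Arrays.equals(stored, candidate)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    List<int[]> used = new LinkedList<>();
    used.add(new int[] {1, 2, 3});

    int[] sameContents = {1, 2, 3};
    System.out.println(used.contains(sameContents));       // false: compares references
    System.out.println(containsArray(used, sameContents)); // true: compares contents
  }
}
```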

A linked list is also a poor choice in terms of memory occupation. In fact, for each element you have a reference to the previous element and a reference to the next element.

Purely in terms of memory occupation, the best solution is to use an array of int (not an ArrayList) together with the index of the last inserted element.
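One possible reading of this suggestion, sketched under the assumption that all arrays have the same length: keep every array back to back in a single int[] and track how many have been inserted. This avoids per-array object headers and list nodes, but the lookup is a linear scan again. FlatIntArrayStore and its methods are hypothetical names for this illustration.

```java
import java.util.Arrays;

// Hypothetical store; assumes every array has exactly `dim` entries.
class FlatIntArrayStore {
  private final int dim;
  private int[] data; // all stored arrays, laid out one after another
  private int count;  // number of arrays stored so far

  FlatIntArrayStore(int dim, int initialCapacity) {
    this.dim = dim;
    this.data = new int[dim * Math.max(1, initialCapacity)];
  }

  void add(int[] array) {
    if (array.length != dim) {
      throw new IllegalArgumentException("expected length " + dim);
    }
    if ((count + 1) * dim > data.length) {
      data = Arrays.copyOf(data, data.length * 2); // grow geometrically
    }
    System.arraycopy(array, 0, data, count * dim, dim);
    count++;
  }

  // Linear scan over the stored rows: O(count * dim) in the worst case.
  boolean contains(int[] candidate) {
    if (candidate.length != dim) {
      return false;
    }
    for (int row = 0; row < count; row++) {
      if (rowEquals(row * dim, candidate)) {
        return true;
      }
    }
    return false;
  }

  private boolean rowEquals(int offset, int[] candidate) {
    for (int i = 0; i < dim; i++) {
      if (data[offset + i] != candidate[i]) {
        return false;
      }
    }
    return true;
  }
}
```

Whether the memory saved is worth giving up constant-time lookups depends on how many arrays are stored and how often they are searched.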

Using a BitSet instead of an int[] might reduce the memory footprint.
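A BitSet only helps if the data fits in single bits. The sketch below assumes that every entry is 0 or 1 and that all arrays have the same length, which the question does not state. BitSet compares by content in equals and hashCode, so it can go into a HashSet directly. The equal-length assumption matters because two BitSets are equal whenever the same bits are set, regardless of trailing zeros. The class and method names are illustrative.

```java
import java.util.BitSet;
import java.util.HashSet;
import java.util.Set;

class BitSetDedup {
  // Assumption: every entry is 0 or 1, so one bit per entry is enough.
  static BitSet toBitSet(int[] zeroOneArray) {
    BitSet bits = new BitSet(zeroOneArray.length);
    for (int i = 0; i < zeroOneArray.length; i++) {
      if (zeroOneArray[i] != 0) {
        bits.set(i);
      }
    }
    return bits;
  }

  public static void main(String[] args) {
    Set<BitSet> seen = new HashSet<>();
    System.out.println(seen.add(toBitSet(new int[] {1, 0, 1}))); // true
    System.out.println(seen.add(toBitSet(new int[] {1, 0, 1}))); // false
    System.out.println(seen.add(toBitSet(new int[] {0, 1, 1}))); // true
  }
}
```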
