简体   繁体   中英

Efficient implementation of multi-dimensional arrays in Java?

As far as I understand (from answers such as this ), java has no native multi-dimensional continuous memory arrays ( unlike C#, for example ).

While the jagged array syntax (arrays of arrays) might be good for most applications, I would still like to know what's the best practice if you do want the raw efficiency of a continuous-memory array (avoiding unneeded memory reads)

I could of course use a single-dimensional array that maps to a 2D one, but I prefer something more structured.

it's not difficult to do it manually:

int[] matrix = new int[ROWS * COLS];

int x_i_j = matrix[ i*COLS + j ];

now, is it really faster than java's multi dimension array?

int x_i_j = matrix[i][j];

for random access, maybe. for continuous access, probably not - matrix[i] is almost certainly in L1 cache, if not in register cache. in best scenario, matrix[i][j] requires one addition and one memory read; while matrix[i*COLS + j] may cost 2 additions, one multiply, one memory read. but who's counting?

It depends on your access pattern. Using this simple program , comparing an int[][] with a 2D mapped over a 1D int[] array treated as a matrix, a native Java 2D matrix is:

  1. 25% faster when the row is on the cache, ie: accessing by rows:
  2. 100% slower when the row is not in the cache, ie: accessing by colums:

ie:

// Case #1
for (y = 0; y < h; y++)
    for (x = 0; x < w; x++)
        // Access item[y][x]

// Case #2
for (x = 0; x < w; x++)
    for (y = 0; y < h; y++)
        // Access item[y][x]

The 1D matrix is calculated as:

public int get(int x, int y) {
    return this.m[y * width + x];
}

If you really want more structure with a continuous-memory array, wrap it in an object.

public class My2dArray<T> {

  int sizeX;

  private T[] items;

  public My2dArray(int x, int y) {
    sizeX = x;
    items = new T[x*y];
  }

  public T elementAt(int x, int y) {
    return items[x+y*sizeX];
  }

}

Not a perfect solution, and you probably already know it. So consider this confirmation of what you suspected to be true.

Java only provides certain constructs for organizing code, so eventually you'll have to reach for a class or interface. Since this also requires specific operations, you need a class.

The performance impacts include creating a JVM stack frame for each array access, and it would be ideal to avoid such a thing; however, a JVM stack frame is how the JVM implements it's scoping. Code organization requires appropriate scoping, so there's not really a way around that performance hit that I can imagine (without violating the spirit of "everything is an object").

The most efficient method of implementing multi-dimensional arrays is by utilizing one-dimensional arrays as multi-dimensional arrays. See this answer about mapping a 2D array into a 1D array.

// 2D data structure as 1D array
int[] array = new int[width * height];
// access the array 
array[x + y * width] = /*value*/;

I could of course use a single-dimensional array that maps to a 2D one, but I prefer something more structured.

If you want to access array in a more structured manner, create a class for it:

public class ArrayInt {

    private final int[] array;
    private final int width, height;

    public ArrayInt(int width, int height) {
        array = new int[width * height];
        this.width = width;
        this.height = height;
    }

    public int getWidth() {
        return width;
    }

    public int getHeight() {
        return height;
    }

    public int get(int x, int y) {
        return array[x + y * width];
    }

    public void set(int x, int y, int value) {
        array[x + y * width] = value;
    }

}

If you wanted arrays of objects, you could use generics and define class Array<T> , where T is the object stored in the array.

Performance-wise, this will, in most cases, be faster than a multi-dimensional array in Java. The reasons can be found in the answers to this question .

Let's say you have a 2D array int[][] a = new int[height][width] , so by convention you have the indices a[y][x] . Depending on how you represent the data and how you access them, the performance varies in a factor of 20 :

二维阵列访问的比较

The code:

public class ObjectArrayPerformance {
    public int width;
    public int height;
    public int m[];

    public ObjectArrayPerformance(int w, int h) {
            this.width = w;
            this.height = h;
            this.m = new int[w * h];
    }

    public int get(int x, int y) {
            return this.m[y * width + x];
    }

    public void set(int x, int y, int value) {
            this.m[y * width + x] = value;
    }

    public static void main (String[] args) {
            int w = 1000, h = 2000, passes = 400;

            int matrix[][] = new int[h][];

            for (int i = 0; i < h; ++i) {
                    matrix[i] = new int[w];
            }

            long start;
            long duration;

            System.out.println("duration[ms]\tmethod");

            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        for (int x = 0; x < w; x++) {
                                    matrix[y][x] = matrix[y][x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\t2D array, loop on x then y");

            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    matrix[y][x] = matrix[y][x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\t2D array, loop on y then x");

            //

            ObjectArrayPerformance mt = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    mt.set(x, y, mt.get(x, y) + 1);
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access trough getter/setter");

            //

            ObjectArrayPerformance mt2 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int x = 0; x < w; x++) {
                            for (int y = 0; y < h; y++) {
                                    mt2.m[y * w + x] = mt2.m[y * w + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop y then x");

            ObjectArrayPerformance mt3 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        for (int x = 0; x < w; x++) {
                                    mt3.m[y * w + x] = mt3.m[y * w + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop x then y");

            ObjectArrayPerformance mt4 = new ObjectArrayPerformance(w, h);
            start = System.currentTimeMillis();
            for (int z = 0; z < passes; z++) {
                    for (int y = 0; y < h; y++) {
                        int yIndex = y * w;
                        for (int x = 0; x < w; x++) {
                                    mt4.m[yIndex + x] = mt4.m[yIndex + x] + 1;
                            }
                    }
            }
            duration = System.currentTimeMillis() - start;
            System.out.println(duration+"\tmapped 1D array, access through computed indexes, loop x then y, yIndex optimized");
    }
}

We can conclude that linear access performance depends more on the way you process the array (lines then columns or the reverse?: performance gain = x10, much due to CPU caches) than the structure of the array itself (1D vs 2D : performance gain = x2).

If random access, the performance differences should be much lower, because the CPU caches have less effect.

Sample implementation, without a compiler. This is basically what C/C++ do behind the scenes when you access multidimensional arrays. You'll have to further define accessor behaviour when less than the actual dimensions are specified & so on. Overhead will be minimal and could be optimized further, but thats microoptimizing imho. Also, you never actually know what goes on under the hood after JIT kicks in.

class MultiDimentionalArray<T> {
//disclaimer: written within SO editor, might contain errors
    private T[] data;
    private int[] dimensions; //holds each dimensions' size

    public MultiDimensionalArray(int... dims) {
        dimensions = Arrays.copyOf(dims, dims.length);
        int size = 1;
        for(int dim : dims)
            size *= dim;
        data = new T[size];
    }

   public T access(int... dims) {
       int idx = 1;
       for(int i = 0; i < dims.length)
            idx += dims[i] * dimensions[i]; //size * offset
       return data[idx];
    }
}

If you cannot live without C constructs, there's always JNI.

Or you could develop your own Java-derived language (and VM and optimizing JIT compiler) that has a syntax for multidimensional continuous-memory arrays.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM