How to represent a graph with a million vertices in C?

Question

I have a graph laid out as a 1000x1000 grid, so there are one million vertices. From every vertex I can go only down or right, so almost every vertex has two outgoing edges, and every vertex holds one number.

I must find the largest path in this graph, but I don't know how to represent it. I tried a matrix, but I suppose it is just too big, because my program stops working if I declare int array[1000][1000]. In Pascal I have seen array [1..1000][1..1000], but in C it isn't working.

EDIT: Thanks, I forgot to use dynamic allocation on all of the arrays. That solves the problem. :)

EDIT2: Now there is another problem :/ With 100x100 vertices my program works, but with 1000x1000 vertices it crashes after a while. Here is the part of the code that isn't working. Change n = 1000; m = 1000; to n = 100; m = 100; and the program runs to the end.

#include<stdio.h>
#include<stdint.h>
int main()
{
  int64_t i,j,n,m,cukriky = 0,q = 0;
  n = 1000;
  m = 1000;
  int (*hall)[m],(*matrix)[m*n],(*d)[m*n];
  hall = (int (*)[m])malloc(n * m * sizeof(int));
  matrix = (int (*)[m*n])malloc(n*m * n*m * sizeof(int));
  d = (int (*)[m*n])malloc(n*m * n*m * sizeof(int));
  //while((c = getchar())!= '\n');
  for(i = 0; i < n; i++){
       for(j = 0; j < m; j++){
            hall[i][j] = -i;}
  }
  /*for(i = 0; i < n*m; i++){
       for(j = 0; j < n*m; j++)
            matica[i][j] = 0;
  }*/
  matrix[0][1] = hall[0][0]+hall[0][1];
  matrix[0][m] = hall[0][0]+hall[1][0];
  for(i = 2; i < m*n; i++){
       if(i % m != 0){
          matrix[i-1][i] = hall[q][i%m];
          printf("A %lld %lld matrix[%lld][%lld],hall[%lld][%lld], %d   %d\n",i,m,i-1,i,q,i%m, matrix[i-1][i],hall[q][i%m]);
       }
       if(i+m-1 < m*n){
         matrix[i-1][i+m-1] = hall[q+1][(i-1)%m];
         printf("B %lld %lld matrix[%lld][%lld],hall[%lld][%lld], %d   %d\n",i,m,i-1,i+m-1,q+1,(i-1)%m, matrix[i-1][i+m-1],hall[q+1][(i-1)%m]);
       }
       if(i % (m) == 0)
          q++;
  }
 printf("A");


}

It looks as if matrix[347][0] does not exist.

Answer 1

If most of the entries in your graph's adjacency matrix are zero, you can use a sparse matrix to represent the graph.

In numerical analysis, a sparse matrix is a matrix in which most of the elements are zero. By contrast, if most of the elements are nonzero, then the matrix is considered dense. The fraction of zero elements over the total number of elements in a matrix is called the sparsity; the fraction of nonzero elements is called the density.

Answer 2

You should use an adjacency list, in which you store only the neighbours of each node. Since your graph is sparse, the adjacency list will occupy far less memory than a full adjacency matrix.

You can even come up with a more "clever" encoding, based on the structure of the problem you're dealing with (thanks @M Oehm for the remark). For example, in a grid the neighbours of a vertex follow directly from its row and column, so no edges need to be stored at all. However, it is a good idea to learn about the various ways of representing graphs. Adjacency matrices are OK only for dense graphs. Otherwise, use a different encoding that will save you space.

Answer 3

I tried matrix, but it is just too big, I suppose, because my program will stop working

You should always allocate large amounts of data dynamically, by using malloc.

Desktop operating systems typically work like this: every process (your program) is given an amount of static memory (.data and .bss) and a stack (.stack). The static data is where your global and static variables go. The stack is where all local (automatic) variables go.

The stack in particular is relatively small: the default is typically around 1 MB on Windows and 8 MB on Linux. With a 4-byte int, int array[1000][1000] takes about 4 MB, so declaring it as a local variable can overflow the stack. If you allocate too many automatic variables, your process runs out of stack space.

Therefore, large chunks of memory such as your matrix need to be allocated on the heap, which is done through dynamic allocation. This is not necessarily because the size of the matrix is unknown at compile time, but because the heap is far less restricted than the stack. Its upper limit is set by the address space and by the memory (RAM plus swap) that the OS can give your process, so malloc can still fail and its return value should be checked.

(The reason it works in Pascal might be because you are using Object Pascal/Delphi, which probably allocates the data on the heap for you. For example, the whole Delphi VCL library uses only dynamic allocation, for the above mentioned reasons.)

Answer 4

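If you have a GPU with many cores, this may work fine even at higher complexity. Refer to the Nvidia CUDA C Programming Guide, and read about LAPACK and BLAZE-LIB for better handling of vertices.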

4 answers

solution1
5 2015-05-11 12:32:48

solution2
4 2015-05-11 12:28:50

solution3
1 2015-05-11 13:56:59

solution4
0 2015-05-11 12:43:59
