Dynamic memory and pointers arguments

Question

I have this two functions, meant to do the same thing - read every line from a file with only integers and store them on an array:

I call them like this on the main() function:

StoreinArray1(X, size, f);

StoreinArray2(X, size, f);

The First works but the Second doesn't.

First

int StoreinArray1(int X[], int *size, char *file)
{
    int i=0;
    FILE *f;
    f = fopen(file, "r");

    X = (int*) realloc (X, *size * sizeof(int));

for (i=0;i<*size;i++)
{
    fscanf(f, "%d", &X[i]);
}

return 1;
}

Second

int StoreinArray2(int X[], int *size, char *file)
{
  FILE *f;
  f = fopen(file, "r");
  if (f == NULL)
     return -1;  // failed opening
   *size = 0;
   while (!feof(f))
   {
     if (fscanf(f, "%d", &X[*size]) == 1)
       *size = *size + 1;
   }
   fclose(f);
   return 1;
}

For the First I used dynamic memory allocation and actually calculated size:

 X = malloc(0); 

 while ((ch = fgetc(f)) != EOF)
{
    if (ch == '\n')
    lines++;
}

size = &lines;

For the Second I can't do the same. Visual Studio Code crashes when I try.

So I tried to do *size = 0 and then StoreinArray2(X, size, f); but it didn't work either.

So my question is about the second function:

Is it calculating the size while it is scanning the file? Supposedly it isn't necessary to use dynamic memory allocation (my teacher said).

If so then how can I pass some "size" argument correctly? As a pointer or just a simple integer?

Thank you in advance!

Edit:

Here is the full First program:

#include <stdio.h>
#include <stdlib.h>

int main()
{
FILE *f;
int *size=0, *X, lines=1;
char *file = {"file.txt"};
char ch;

X = malloc(0);

f = fopen(file, "r");

while ((ch = fgetc(f)) != EOF)
{
    if (ch == '\n')
    lines++;
}

size = &lines;

  StoreinArray(X, size, file);

}

int StoreinArray(int X[], int *size, char *file)
{
int i=0;
FILE *f;
f = fopen(file, "r");

X = (int*) realloc (X, *size * sizeof(int));

for (i=0;i<*size;i++)
{
    fscanf(f, "%d", &X[i]);
}

for (i=0;i<*size;i++)
    printf("%d\n",X[i]);

return 1;
}

And the Second:

int main()
{
    int X[100];
    int *size;
  char *file = {"file.txt"};

  *size = 0;

  StoreinArray(X, size, file);
}
int StoreinArray(int X[], int *size, char *file)
{
  FILE *f;
  f = fopen(file, "r");
  if (f == NULL)
    return -1;
  *size = 0;
  while (!feof(f))
 {
    if (fscanf(f, "%d", &X[*size]) == 1)
      *size = *size + 1;
  }
  fclose(f);
  return 1;
}

In first I had to open the file in main to count the number of lines. I know I forgot fclose(f) and free(X) in main, but with those instructions VSC crashes.

  int StoreinArray (int X[], int *size, char *file)
    {
      FILE *f;
      int i=0;
      f = fopen(file, "r");
       if (f == NULL)
       return -1;
     *size = 0;
      while (!feof(f))
      {
        if (fscanf(f, "%d", &X[*size]) == 1)
        {
           *size = *size + 1;
           X = (int *) realloc (X , *size * sizeof(int));
        }
      }
      fclose(f);
      return 1;
      }

       int main() 
   {
        int *X, size=0;
        char *file = {"f.txt"};
        X=malloc(0);
        StoreinArray(X, &size, file);
        free(X); 
   }

Answer 1

The problem with the second version of your program is the declaration of size in main. Declare it as an int, not a pointer to an int. Your current program is crashing because you didn't allocate any space for size, and when StoreInArray tried to update it, you got an access violation. So, main should look like this:

int main()
{
    int X[100];
    int size;
    char *file = {"file.txt"};

    size = 0;

    StoreinArray(X, &size, file);
}

Answer 2

OK, I'll try to be through about it and explain all the things I can find.

First of all, we need to talk about variables, pointers and memory, because it seems that you don't have a very firm grasp of these concepts. Once that clicks, the rest should follow easily.

First up, simple variables. That part is easy, I think you more or less understand that.

int x; // This is an integer variable. It's not initialized, so its value could be anything
int meaning = 42; // This is another integer variable. Its value will be 42.
double pi = 3.14; // A variable with digits after the decimal point
char c = 15; // Another inte... well, yes, actually, char is also an integer.
char c2 = 'a'; // Nice, this also counts. It's converted to an integer behind the scenes.

Etc.

Similarly with arrays:

int arr[10]; // Array with 10 values. Uninitialized, so they contain garbage.
int arr2[3] = { 1, 2, 3 }; // Array with 3 values, all initialized
int arr3[] = {1, 2, 3, 4, 5}; // Array with 5 values.

Arrays are basically just a bunch of variables created at once. When you make an array, C needs to know the size, and the size must be a fixed number - you can't use another variable. There's a reason for this, but it's technical and I won't go into that.

Now about memory. Each of these variables will be stored somewhere in your computer's RAM. The precise location is unpredictable and can vary each time you run your program.

Now, RAM is like a huuuge array of bytes. There's byte number 0 , byte number 1 , etc. An int variable takes up 4 bytes, so it could, for example, end up in bytes number 120 , 121 , 122 and 123 .

All the bytes in a single variable (or in a single array) will be next to each other in RAM. Two different variables could end up in the opposite ends of your RAM, but the bytes in each of those variables will be together.

Now we come to the concept of a pointer. A pointer is basically just an integer variable. It contains the RAM-number of the first byte of some other variable. Let's look at an example:

int i = 42;
int *p = &i;

Suppose that the variable i got stored in the bytes number 200 ... 203 (that's 4 bytes). In those bytes we have the value 42 . Then let's suppose that the variable p got stored in the bytes number 300 ... 303 (that's another 4 bytes). Well, these 4 bytes will contain the value 200 , because that's the first byte of the i variable.

This is also what programmers mean when they say "(memory) address of <variable> " or "a pointer to <variable> . It's the number of the first byte in RAM of <variable> . Since all bytes of the same variable stick together, then by knowing the first byte (and knowing the type of the variable), you can figure out where the rest of <variable> is in memory.

Now let's add one more line in our example:

*p = 5;

What the computer does in this case is it takes the address which was stored in p , goes to that location in memory, treats the following 4 bytes as an integer, and puts the value 5 there. Since we had previously set p to "point" at the address of i , this has the same effect as simply setting the i variable itself.

Ok, did you get all of this? This is a bit tricky and usually takes a while to wrap your head around it. Feel free to re-read it as many times as necessary to understand it. You'll need it to move on.

Ready? Ok, let's talk a bit about stack and dynamic memory.

When the program starts, the OS automatically allocates a bit of memory for it, just to make it easier for it to start. It's like a big array of bytes, all together in memory. Nowadays it's usually about 1MB, but it can vary. This memory is called "The Stack". Why is it called that? Eh, I'll explain some other time.

Anyways, When your main() function starts, the OS goes like "Here you go my good fellow, a pointer to The Stack. It's all yours to use as you see fit! Have a nice day!"

And your main() function then uses it to store all the variables you make in it. So when you go like p = &i; then the address you get stored in p is somewhere within The Stack.

Now when main() calls another function, such as StoreinArray() , it also gives it a pointer to the stack and says "OK, here's a pointer to the stack. Careful, I've already used the first XXX bytes of it, but feel free to use the rest".

And then StoreinArray() uses the stack to put its variables there. And when StoreinArray() calls something else, it does the same, and on and on.

Now, there are a few things to note here:

This scheme is nice, because nobody needs to "allocate" or "deallocate" any memory. It's easier, it's faster.
However, when a function returns, all its variables are considered to be gone and that memory is fair game to anyone else later wanting it. So be careful about pointers pointing to it. They will only be valid while the function is running. When it returns - well, the pointer will still "work", but who knows when something will overwrite that memory... And if you write to that memory, who can tell what you will mess up? Many subtle bugs have been created this way.
The stack is fairly limited and can be exhausted. When all bytes in it are used up, your program WILL crash. The typical way this happens is either when you try to create a very large array, or you go into some sort of infinite loop where the function keeps calling itself over and over again. Try that. :)

So, for these cases you use "dynamic" memory. In C that mostly means malloc() . You tell malloc() how many bytes of memory you need, and malloc() finds a large enough unclaimed space in the RAM, marks it as used, and gives you a pointer to it. Well, that's a simplified view of things anyway.

The same approach works when you don't know beforehand how much memory you will need.

The downside is that you need to free() the memory when you're done with it, or you may run out of available memory, and then malloc() will fail. Be also aware that after you've freed the memory, all pointers to it should be considered invalid, because you're not the owner of that particular piece of memory anymore. Anything can happen if you keep messing with it.

Phew, that's a lot. OK, I need a break. I'll come back and analyze your programs a bit later. But if you've understood all of this, you should now be able to spot the mistakes in your programs. Try going through them, line by line, and narrating to yourself what each line does.

Many, many hours later:

OK, so let's take a look at your programs. The first one:

#include <stdio.h>
#include <stdlib.h>

int main()
{
FILE *f;
int *size=0, *X, lines=1;
char *file = {"file.txt"};
char ch;

X = malloc(0);

f = fopen(file, "r");

while ((ch = fgetc(f)) != EOF)
{
    if (ch == '\n')
    lines++;
}

size = &lines;

  StoreinArray(X, size, file);

}

int StoreinArray(int X[], int *size, char *file)
{
int i=0;
FILE *f;
f = fopen(file, "r");

X = (int*) realloc (X, *size * sizeof(int));

for (i=0;i<*size;i++)
{
    fscanf(f, "%d", &X[i]);
}

for (i=0;i<*size;i++)
    printf("%d\n",X[i]);

return 1;
}

There are two things that can be improved here. First - the size and lines variables. There's no need for both of them. Especially since you set size to point to lines anyway. Just keep lines and all will be fine. When you pass it to StoreinArray() , pass it as a simple integer. There's no need for a pointer.

Second, the X array. You're doing something odd with it and it seems like you're fumbling in the dark. There's no need for the malloc(0) and then later realloc(X, *size*sizeof(int) . Keep it simple - first count the lines, then allocate the memory (as much as needed). Also, keep the memory allocation in the main() method and just pass the final X to the StoreinArray . This way you can avoid another subtle bug - when inside the StoreinArray() function you execute the line X = (int*) realloc (X, *size * sizeof(int)); the value of X changes only inside the StoreinArray() function. When the function returns, the variable X in the main() function will still have its old value. You probably tried to work around this with the reallocate() dance, but that's not how it works. Even worse - after the realloc() , whatever value the X used to be, isn't a valid pointer anymore, because realloc() freed that old memory! If you had later tried to do anything with the X variable in the main() function, your program would have crashed.

Let's see how your program would look with the changes I proposed (plus a few more small cosmetic tweaks):

#include <stdio.h>
#include <stdlib.h>

int main()
{
    char *file = "file.txt";
    FILE *f = fopen(file, "r");
    int *X, lines=1;
    char ch;

    while ((ch = fgetc(f)) != EOF)
    {
        if (ch == '\n')
            lines++;
    }
    fclose(f);

    X = (int *)malloc(lines * sizeof(int));

    StoreinArray(X, lines, file);
}

void StoreinArray(int X[], int lines, char *file)
{
    int i=0;
    FILE *f = fopen(file, "r");

    for (i=0;i<lines;i++)
    {
        fscanf(f, "%d", &X[i]);
    }
    fclose(f);

    for (i=0;i<lines;i++)
        printf("%d\n",X[i]);
}

OK, now the second program.

int main()
{
    int X[100];
    int *size;
  char *file = {"file.txt"};

  *size = 0;

  StoreinArray(X, size, file);
}
int StoreinArray(int X[], int *size, char *file)
{
  FILE *f;
  f = fopen(file, "r");
  if (f == NULL)
    return -1;
  *size = 0;
  while (!feof(f))
 {
    if (fscanf(f, "%d", &X[*size]) == 1)
      *size++;
  }
  fclose(f);
  return 1;
}

Right off the bat, the size variable will crash your program. It's a pointer that isn't initialized so it points to some random place in memory. When a little lower you try to write to the memory it points to ( *size = 0 ), that will crash, because most likely you won't own that memory. Again, you really don't need a pointer here. In fact, you don't need the variable at all. If you need to know in the main program how many integers the StoreinArray() read, you can simply have it return it.

There's another subtle problem - since the size of X array is fixed, you cannot afford to read more than 100 integers from the file. If you do, you will go outside the array and your program will crash. Or worse - it won't crash, but you'll be overwriting the memory that belongs to some other variable. Weird things will happen. C is lenient, it doesn't check if you're going outside the allowed bounds - but if you do, all bets are off. I've spent many hours trying to find the cause of a program behaving weirdly, only to find out that some other code in some completely unrelated place had gone outside its array and wreaked havoc upon my variablers. This is VERY difficult to debug. Be VERY, VERY careful with loops and arrays in C.

In fact, this kind of bug - going outside an array - has it's own name: a "Buffer Overrun". It's a very common security exploit too. Many security vulnerabilities in large, popular programs are exactly this problem.

So, the best practice would be to tell StoreinArray() that it can store at most 100 integers in the X array. Let's do that:

#include <stdio.h>
#include <stdlib.h>

#define MAX_X 100

int main()
{
    int X[MAX_X];
    char *file = "file.txt";
    int lines;

    lines = StoreinArray(X, MAX_X, file);
}
int StoreinArray(int X[], int maxLines, char *file)
{
    FILE *f;
    int lines;

    f = fopen(file, "r");
    if (f == NULL)
        return -1;

    while (!feof(f))
    {
        if (fscanf(f, "%d", &X[lines]) == 1)
            lines++;
        if (lines == maxLines)
            break;
    }
    fclose(f);
    return lines;
}

So, there you are. This should work. Any more questions? :)

Dynamic memory and pointers arguments

Question

2 answers

solution1
1 2018-03-26 23:20:50

solution2
0 2018-03-27 09:18:42

Dynamic memory and pointers arguments

Question

2 answers

solution1 1 2018-03-26 23:20:50

solution2 0 2018-03-27 09:18:42

solution1
1 2018-03-26 23:20:50

solution2
0 2018-03-27 09:18:42