
Array length in C language

I am a former C# programmer, and there is something I can't understand about C (specifically, I am coding to the C99 standard).

I was taught that there is no way to know the length of an array in C and that I need to pass its length as a parameter to any function that uses it. Why is that? In C#, for example, we can type array_name.length

Also, in two-dimensional arrays, why do I have to specify the number of columns of the array? I mean, why does this work:

void test1 (int arr[][m])
{
}

while this doesn't:

void test2 (int arr[][])
{
}

in C# for example we can type array_name.length

I do not use C#, but if a subroutine can get the length of an array created elsewhere, then information about that length had to be stored in memory and passed along with the array. Something had to put that length in memory, and, when the array was passed as an argument, something had to include more information than just the location of the array. So C# is using memory and computing time.

A consequence of this is you do not have direct control of the computer. You cannot write a simpler more efficient program as long as something is passing extra information. It is necessarily wasteful. That is fine as long as you are writing programs in situations where plenty of resources are available.

C does not make this extra effort. When an array is passed, only its location is passed, and that is all you need to access its elements. If a particular subroutine needs its length, you can pass that manually—it is your choice to do it when you need to, but you also have the option not to waste resources when you do not need them. You can write more efficient programs.

in two dimensional arrays why do I have to specify the number of columns of the array?

If we know arr is an array of int , we know element arr[0] is at the start, arr[1] is right after that, arr[2] is after that, and so on. To use a one-dimensional array, the only thing we need to know is where it starts.

If we know arr is a two-dimensional array of int , we know arr[0][0] is at the start, arr[0][1] is after that, and so on, but we do not know where arr[1][0] is. It is after some number of elements arr[0][i] , but we do not know how many unless we know the second dimension. Therefore, in order to use a two-dimensional array, you must know the length of the second dimension. That is a logical requirement, not a choice.
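The address arithmetic described above can be sketched as follows. The helper names element_at and element_at_4 are hypothetical, and the first one assumes the elements are stored row by row in one contiguous block:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: read element (row, col) from a block of ints that
   holds a two-dimensional array row by row.
   `cols` is the length of the second dimension. */
int element_at(const int *base, size_t cols, size_t row, size_t col)
{
    assert(col < cols);             /* col must lie inside one row */
    return base[row * cols + col];  /* skip `row` full rows, then `col` ints */
}

/* The same lookup when the compiler knows that each row holds 4 ints.
   It performs the row * 4 + col arithmetic on our behalf. */
int element_at_4(int arr[][4], size_t row, size_t col)
{
    return arr[row][col];
}
```

Without the 4 in the second function (or cols in the first), there is no way to compute how far to skip to reach the next row, which is why int arr[][] is rejected.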

Supplement

Generally, a routine only needs to know which elements of an array it is supposed to use. It does not need to know how many elements there are in the array.

Situations in which a routine does not need to be given the length of an array include:

  • To calculate the length of a string in a buffer, a routine (like strlen ) only needs to examine each byte in the buffer until it finds a null byte. It does not need to know how big the entire buffer is. (Example: A program creates a buffer of 100 bytes. It reads bytes from the terminal until a new-line is found. The user types only 12 characters and then a new-line. The buffer is filled with 12 bytes and a null character. A subroutine examining the string only needs to work with 13 bytes, not 100.)
  • A routine might work on a fixed number of elements. For example, a subroutine to help with numerical integration might take three function values at one time, fit a curve to them, and return the area under the curve. The main routine might have an entire array of function values, and it repeatedly calls the subroutine to evaluate different points in the array, passing the subroutine a pointer to the location to work on. In each call, the subroutine only needs to know there are three values for it at the given address. It does not need to know how many are in the full array.
  • A routine might work on the same number of elements in multiple arrays. For example, a routine to perform a Discrete Fourier Transform might take a number of elements N to work on and four arrays: one for input of the real components, one for input of the imaginary components, one for output of the real components, and one for output of the imaginary components. For each of the arrays, the routine uses N elements. This number N only needs to be passed to the routine in one parameter. It would be wasteful to store it in multiple locations, one for each array.
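The fixed-number-of-elements case can be sketched like this. The answer does not say which curve is fitted, so Simpson's rule (a parabola through three equally spaced samples) is an assumption here, and the names simpson3 and integrate are hypothetical:

```c
#include <stddef.h>

/* Area under the parabola through three equally spaced samples
   y[0], y[1], y[2] with spacing h (Simpson's rule, assumed here).
   The routine only knows that three values live at the given address. */
double simpson3(const double *y, double h)
{
    return h / 3.0 * (y[0] + 4.0 * y[1] + y[2]);
}

/* The caller owns the full array and slides a three-value window over it.
   `count` is expected to be odd and at least 3. */
double integrate(const double *values, size_t count, double h)
{
    double area = 0.0;
    for (size_t i = 0; i + 2 < count; i += 2) {
        area += simpson3(values + i, h);  /* pointer into the middle of the array */
    }
    return area;
}
```

Only integrate needs the total count; simpson3 is never told how large the array is.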

Another consideration is that sometimes we pass only part of an array to a routine. If I have some string in a buffer, I might want a subroutine to work on only part of that string, perhaps just one word in a command that has been parsed. To do this, I can pass just a pointer to the start of that word and the length of the word to work on. In this case, the subroutine not only does not need to know the length of the array, it does not even need to know where the array starts. It only needs to know what it is asked to work on. It would be wasteful to pass any other information.
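As a minimal sketch of passing only part of a buffer, the hypothetical routine count_char below receives a start pointer and a length and never looks outside that range:

```c
#include <stddef.h>

/* Hypothetical helper: count how often `wanted` occurs in the `length`
   bytes beginning at `start`. It knows nothing about the buffer that
   those bytes belong to. */
size_t count_char(const char *start, size_t length, char wanted)
{
    size_t count = 0;
    for (size_t i = 0; i < length; i++) {
        if (start[i] == wanted) {
            count++;
        }
    }
    return count;
}
```

Given a buffer holding "copy alpha beta", a caller could pass the buffer address plus 5 and a length of 5 to have the routine examine only the word "alpha".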

In most programming languages, data types are abstractions : that is, if you ask for a list of numbers, it will create structures in memory for storing a list of numbers, and for keeping track of its capacity, how many elements are full, and perhaps whether the elements are "null" or contain values, etc.

C is a low-level language that doesn't deal in abstractions; it deals directly with physical memory. If you ask for space to put 5 integers, it allocates memory for 5 integers. You wanted it to keep track of the number "5" somewhere to remember that you allocated 5 integers? You didn't ask for that--you'll have to do that yourself.

In C an array passed as a parameter to a function is converted to a pointer to the first element of the array. The size of the array is not implicitly passed to the function. You, the programmer, are responsible for passing the correct array size to your function.

int sum(int *num, size_t length)
{
   int total = 0;
   size_t i;   /* same type as length, so the comparison is not mixed-sign */
   for (i = 0; i < length; i++)
   {
      total += num[i];
   }
   return total;
}

One of the problems with this approach is that the array parameter is only assumed to point to an array. It could point to any int, whether or not that int is an element of an array. If the caller passes a pointer to a single int together with a length greater than 1, the function reads past the end of that object, which is a classic buffer overflow.

C is a procedural language (and closer to assembler than most procedural languages), not an object-oriented language. In other words, C comes from the Algol tradition, which predates object-oriented languages such as Smalltalk and C#, and Smalltalk taught us some important lessons.

Sometimes you can use the following in C:

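#define num_elements(array) (sizeof(array) / sizeof((array)[0]))

...but once an array has been passed to a function, that no longer works: inside the function the parameter is a pointer, so sizeof(array) yields the size of the pointer rather than the size of the array.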
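The difference can be sketched with two hypothetical functions, one that still sees the array and one that only receives a pointer to it:

```c
#include <stddef.h>

#define num_elements(array) (sizeof(array) / sizeof((array)[0]))

/* The array is declared in this scope, so sizeof sees all ten elements. */
size_t count_in_scope(void)
{
    int local[10];
    return num_elements(local);
}

/* Here the argument has already been converted to a pointer, so sizeof
   measures the pointer, whatever the caller's array looked like. */
size_t size_seen_by_callee(int *arr)
{
    return sizeof(arr);   /* same as sizeof(int *) */
}
```

The macro is therefore only reliable in the scope where the array itself was declared.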

Another good way, that works in almost any situation in C, is to:

#define MY_ARRAY_ELEMENTS 1000
int a[MY_ARRAY_ELEMENTS];
foo(a, MY_ARRAY_ELEMENTS);

In other words, define a symbolic constant for the length of a particular array, and use that instead of hardcoding constants.

OO languages have metadata associated with objects anyway, so why not store a length in the metadata? C doesn't do that sort of thing though - it was created in a time when bytes were precious, and metadata was seen as too much overhead.

And why do you have to partially define the size of an n dimensional array? Because behind the scenes C is doing some math to multiply out where in memory a[x][y] exists, and again, it's not storing metadata to help you keep track of those dimensions.
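Since the question targets C99, one option is to pass the dimensions as ordinary parameters and let the array parameter refer to them (a variably modified parameter). This is a sketch that assumes a compiler with C99 variable-length array support, and sum_grid is a hypothetical name:

```c
#include <stddef.h>

/* C99 sketch: `rows` and `cols` come first so that the array parameter
   can use them. The compiler then has the row length it needs to locate
   arr[r][c], but the caller is still the one supplying it. */
long sum_grid(size_t rows, size_t cols, int arr[rows][cols])
{
    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        for (size_t c = 0; c < cols; c++) {
            total += arr[r][c];
        }
    }
    return total;
}
```

Nothing is stored alongside the array here either: the dimensions travel as explicit arguments, and passing wrong values makes the function index the wrong memory.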

Consider that Pascal, another procedural language, made the array dimensions part of an array's type . That was the opposite extreme: the size and shape were tracked in the type system, but this was pretty draconian to use in practice. Writing a single function to sum the floats in two arrays of different lengths was impractical.
