Using character vs. pointer for an "array of words"

Let's say I have a paragraph and I want to split it into words and put them in an array. What would be the better way to do it? (For this example, let's assume 100 words, all shorter than 20 characters.)

// character array
char our_array[100][20];
strcpy(our_array[0], "Hello");
strcpy(our_array[1], "Something");

Or:

// string (pointer) array
char *newer_string[100];
newer_string[0] = "Hello";
newer_string[1] = "Something";

Why would one be preferable over the other? And is one more common in practice than the other?

Version 2 does not occupy all the memory at once (as Version 1 does). The length of your strings can also be arbitrary; you are not bound to a specific length (e.g. 20). Version 2 carries a small memory overhead for the pointers it has to store, but it only ends up using more memory overall if nearly all 100 slots are in use and nearly all of the words are close to the maximum length.
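To make the memory comparison concrete, here is a minimal sketch. The helper names fixed_array_bytes and pointer_array_bytes and the two macros are hypothetical, introduced only for illustration. It assumes each word in Version 2 gets its own exact-size buffer, and it ignores whatever bookkeeping overhead the allocator adds per block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define WORD_COUNT 100
#define WORD_SIZE  20   /* 19 characters plus the terminating '\0' */

/* Bytes used by the fixed 2D array, regardless of what it holds. */
size_t fixed_array_bytes(void)
{
    return sizeof(char[WORD_COUNT][WORD_SIZE]);   /* 100 * 20 */
}

/* Bytes used by the pointer array plus one exact-size buffer per word. */
size_t pointer_array_bytes(const char *const words[], size_t count)
{
    assert(count <= WORD_COUNT);

    size_t total = sizeof(char *[WORD_COUNT]);    /* the pointers themselves */
    for (size_t i = 0; i < count; i++)
        total += strlen(words[i]) + 1;            /* text plus '\0' */
    return total;
}
```

The first function always reports 2000 bytes. The second depends on the platform's pointer size and on the words actually stored, which is where the break-even point described above comes from.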

In general, I would always recommend Version 2 (an array of strings/char pointers). It is also easier to replace strings in this 1D array than in the 2D version.

It depends on what you want to do with that variable. This is definitely not written in stone, and there are few hard guidelines.

The first option...

char our_array[100][20];
strcpy(our_array[0], "Hello");
strcpy(our_array[1], "Something");

...has the advantage that each element of our_array is actually an array of char. So you can modify that data. Those strings are not read-only.

On the other hand, you are limited to strings of 19 characters (the 20th byte holds the terminating '\0'), and it's quite easy to get that wrong. Because you are using strcpy() to fill that array of strings, the compiler will not detect a word that is too long; the copy simply writes past the end of the row.

The other option...

char *newer_string[100];
newer_string[0] = "Hello";
newer_string[1] = "Something";

...has the advantage that each element of the array is a pointer. The string literals are kept in static storage and must be treated as read-only. They occupy less space, and you can easily change newer_string[i] to point to something else. However, you cannot modify that data.
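The difference between modifying the characters and repointing a slot can be sketched as follows. The function names capitalize and repoint are hypothetical. The pointer array is declared with const char * here, which is an assumption beyond the question's code, so that the compiler rejects accidental writes to string literals.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Upper-cases the first letter in place. Only valid for writable storage,
   such as a row of char our_array[100][20]. */
void capitalize(char *word)
{
    assert(word != NULL);
    if (word[0] != '\0')
        word[0] = (char)toupper((unsigned char)word[0]);
}

/* Makes one slot of a pointer array refer to different text.
   The text previously pointed to is left untouched. */
void repoint(const char *slots[], size_t index, const char *replacement)
{
    assert(slots != NULL);
    slots[index] = replacement;
}
```

capitalize(our_array[0]) is fine because the row is an ordinary char array. Passing a string literal to it would be undefined behavior, which is why the second option is limited to repoint-style changes.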

Option 1 allocates a fixed 2D array of 100*20 bytes. In this case, the strings are stored in the 2D array itself. This has the following features.

  • Fixed storage per string. If, e.g., one name is 50 characters long, the whole array needs to grow to at least 100*51 (50 characters plus the terminating '\0').
  • The array is mutable, i.e. you can change the elements easily.
  • No heap memory allocation (malloc/calloc) is required.
  • If the names need to be sorted, this method has to copy whole strings around, which is inefficient.
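The sorting cost mentioned in the last point can be seen in a minimal sketch using the standard qsort. The names compare_rows and sort_fixed_words are hypothetical. Because the element size handed to qsort is a full row, every exchange moves 20 bytes.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define WORD_SIZE 20   /* matches char our_array[100][20] */

/* qsort passes pointers to whole 20-byte rows, which are the strings. */
static int compare_rows(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* Sorts the first count rows alphabetically by moving the rows themselves. */
void sort_fixed_words(char words[][WORD_SIZE], size_t count)
{
    assert(words != NULL);
    qsort(words, count, sizeof words[0], compare_rows);
}
```

A call such as sort_fixed_words(our_array, 100) assumes all 100 rows hold valid, terminated strings.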

Option 2, as you have shown it, only works for string constants known at compile time. If you want to read the strings from a file or from the user, you need to allocate the memory dynamically, something like this:

char *newer_string[100];
char stringtemp[101];   // size this to the maximum string length you need to support
size_t len;
int i;
for (i = 0; i < 100; i++)
{
    if (scanf("%100s", stringtemp) != 1) { break; /* end of input or read error */ }
    len = strlen(stringtemp);
    newer_string[i] = malloc(len + 1);   // +1 for the terminating '\0'
    if (newer_string[i] == NULL) { /* handle memory error */ }
    strcpy(newer_string[i], stringtemp);
}

The features here are:

  • More efficient memory storage: e.g. if one string is long, only that array element needs more memory.
  • Needs dynamic memory allocation, so you also need to take care of calling free.
  • Easier to sort: a sorting algorithm only needs to swap pointers such as newer_string[i] and newer_string[i+1].
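The last two points can be sketched with the standard qsort and free. The names compare_pointers, sort_word_pointers and free_words are hypothetical, and the sketch assumes every slot up to count holds a pointer returned by malloc.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* qsort passes pointers to the array elements, which are themselves
   char pointers, so one extra dereference is needed before comparing. */
static int compare_pointers(const void *a, const void *b)
{
    const char *const *left = a;
    const char *const *right = b;
    return strcmp(*left, *right);
}

/* Sorts alphabetically by exchanging pointers; no string data is copied. */
void sort_word_pointers(char *words[], size_t count)
{
    assert(words != NULL);
    qsort(words, count, sizeof words[0], compare_pointers);
}

/* Releases every word and clears the slots to avoid dangling pointers. */
void free_words(char *words[], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        free(words[i]);
        words[i] = NULL;
    }
}
```

Here each exchange moves only a pointer rather than a whole string, and free_words is the cleanup that the fixed 2D array never needs.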
