
Copy a struct with a string member in C

I have a simple struct containing a string defined as a char array. I thought that copying an instance of the struct to another instance using the assignment operator would simply copy the memory address stored in the char pointer. Instead, it seems that the string content is copied. I put together a very simple example:

#include <stdio.h>
#include <string.h>

struct Test{
  char str[20];
};

int main(){

  struct Test t1, t2;
  strcpy(t1.str, "Hello");
  strcpy(t2.str, "world");
  printf("t1: %s %p\n", t1.str, (void*)(t1.str));
  printf("t2: %s %p\n", t2.str, (void*)(t2.str));
  t2 = t1;
  printf("t2: %s %p\n", t2.str, (void*)(t2.str));
  return 0;
}

Compiling this code with gcc 4.9.2, I get:

t1: Hello 0x7fffb8fc9df0
t2: world 0x7fffb8fc9dd0
t2: Hello 0x7fffb8fc9dd0

As I understand it, after t2 = t1, t2.str points to the same memory address as before the assignment, but that address now holds the same string as t1.str. So it seems to me that the string content has been automatically copied from one memory location to another, something that I thought C would not do. I think this behaviour is triggered by the fact that I declared str as a char[], not as a char*. Indeed, trying to assign one string directly to the other with t2.str = t1.str gives this error:

Test.c: In function ‘main’:
Test.c:17:10: error: assignment to expression with array type
   t2.str = t1.str;
      ^

which makes me think that arrays are effectively treated differently from pointers in some cases. Still, I can't figure out what the rules for array assignment are, or in other words, why arrays inside a struct are copied when I copy one struct into another, but I can't directly copy one array into another.

The structure contains no pointer, but 20 chars. After t2 = t1, the 20 chars of t1 are copied into t2.

In C, a struct is a way for the compiler to know how to structure an area of memory. A struct is a kind of template or stencil which the C compiler uses to calculate the offsets of the various members of the struct.

The first C compilers did not allow struct assignment, so people had to use the memcpy() function to assign structs; later compilers added support for it. A C compiler will typically do a struct assignment by copying the number of bytes in the struct's area of memory from one address to another, including any padding bytes that may have been added for address alignment. Whatever happens to be in the source memory area is copied to the destination area. Nothing smart is done about the copy. It just copies that many bytes of data from one memory location to another.

If you have a string array in the struct, or any kind of array, then the entire array will be copied, since it is part of the struct.

If the struct contains pointer variables, then those pointer variables will also be copied from one area to another. The result is that you will have two structs with the same data. Because the two areas are copies of each other, a particular pointer in one struct will hold the same address as the corresponding pointer in the other struct, and both will point to the same location.

Remember that a struct assignment just copies bytes of data from one area of memory to another. For instance, if we have a simple struct with a char array, with the C source looking like:

typedef struct {
    char tt[50];
} tt_struct;

void test (tt_struct *p)
{
    tt_struct jj = *p;

    tt_struct kk;

    kk = jj;
}

The assembler listing output by the Visual Studio 2005 C++ compiler in debug mode for the assignment kk = jj; looks like:

; 10   :    tt_struct kk;
; 11   : 
; 12   :    kk = jj;

  00037 b9 0c 00 00 00   mov     ecx, 12            ; 0000000cH
  0003c 8d 75 c4     lea     esi, DWORD PTR _jj$[ebp]
  0003f 8d 7d 88     lea     edi, DWORD PTR _kk$[ebp]
  00042 f3 a5        rep movsd
  00044 66 a5        movsw

This bit of code copies the data from one location in memory to another four bytes at a time: rep movsd moves 12 four-byte words (48 bytes) and movsw moves the remaining 2 bytes, for 50 bytes in total. With a smaller char array size, the compiler may opt to use a different series of instructions to copy the memory if that is more efficient.

In C, arrays are not really handled in a smart way. An array is not seen as a data structure in the same way that Java sees an array. In Java an array is a type of object that holds its elements. In C an array is just an area of memory, and in most expressions the array name is converted to a pointer to its first element, a pointer that cannot be reassigned. The result is that in C you can have an array, say int myInts[5];, which Java would see as an array of five ints, whereas C mostly treats myInts like a constant pointer to the first of those ints.

In Java, if you try to access an array element out of range, say myInts[i] where i has the value 8, you will get a runtime error. In C the same access is undefined behaviour, and you will not get a runtime error unless you are working with a debug build from a C compiler that does runtime checks. Experienced C programmers tend to treat arrays and pointers as similar constructs, but arrays are not exactly pointers: they behave like a form of constant pointer and share some characteristics with pointers, with some restrictions.

This kind of buffer overflow error is very easy to make in C by accessing an array past its number of elements. The classic example is a string copy from one char array into another where the source char array has no zero termination character in it, resulting in a string copy of a few hundred bytes when you expected ten or fifteen.

There really are 20 characters in your case; it is the same as if you declared the struct as struct Test { char c1; char c2; ... };

If you want to copy only the pointer to the string, you can change the struct declaration as below and manually manage the memory for the string via the functions Test_init and Test_delete.

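#include <stdlib.h>  /* malloc, free */

struct Test{
  char* str;
};

/* Allocates len bytes for the string; str is NULL if malloc fails. */
void Test_init(struct Test* test, size_t len) {
  test->str = malloc(len);
}

/* Releases the memory allocated by Test_init. */
void Test_delete(struct Test* test) {
  free(test->str);
}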

If you run the following simple program

#include <stdio.h>

int main( void )
{
    {
        struct Test
        {
            char str[20];
        };
        printf( "%zu\n", sizeof( struct Test ) );
    }

    {
        struct Test
        {
            char *str;
        };
        printf( "%zu\n", sizeof( struct Test ) );
    }
    return 0;
}

you will get a result similar to the following (here on a platform with 4-byte pointers; a typical 64-bit platform prints 8 on the second line)

20
4

So the first structure contains a character array of 20 elements, while the second structure contains only a pointer of type char *.

When one structure is assigned to another structure, its data members are copied. So for the first structure, the whole content of the array is copied into the other structure. For the second structure, only the value of the pointer (the address it contains) is copied. The memory pointed to by the pointer is not copied, because it is not contained in the structure itself.

And arrays are not pointers, although names of arrays in expressions are usually (with rare exceptions) converted to pointers to their first elements.
