
Implementing a new strcpy function redefines the library function strcpy?

It is said that we can write multiple declarations but only one definition. Now if I implement my own strcpy function with the same prototype:

char * strcpy ( char * destination, const char * source );

Then am I not redefining the existing library function? Shouldn't this display an error? Or is it somehow related to the fact that the library functions are provided in object code form?

EDIT: Running the following code on my machine says "Segmentation fault (core dumped)". I am working on Linux and have compiled without using any flags.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *strcpy(char *destination, const char *source);

int main(){
    char *s = strcpy("a", "b");
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return "a";
}

Please note that I am not trying to implement the function. I am just trying to redefine a function and asking for the consequences.

EDIT 2: After applying the changes suggested by Mats, the program no longer gives a segmentation fault, although I am still redefining the function.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

char *strcpy(char *destination, const char *source);

int main(){
    char a[2];                  /* writable destination, as Mats suggested */
    char *s = strcpy(a, "b");
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return "a";
}

C11 (ISO/IEC 9899:201x) §7.1.3 Reserved identifiers

— Each macro name in any of the following subclauses (including the future library directions) is reserved for use as specified if any of its associated headers is included; unless explicitly stated otherwise.

— All identifiers with external linkage in any of the following subclauses (including the future library directions) are always reserved for use as identifiers with external linkage.

— Each identifier with file scope listed in any of the following subclauses (including the future library directions) is reserved for use as a macro name and as an identifier with file scope in the same name space if any of its associated headers is included.

If the program declares or defines an identifier in a context in which it is reserved, or defines a reserved identifier as a macro name, the behavior is undefined.

Note that this doesn't mean you can't do it: as this post shows, it can be done with gcc and glibc.

glibc §1.3.3 Reserved Names provides a clearer reason:

The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:

Other people reading your code could get very confused if you were using a function named exit to do something completely different from what the standard exit function does, for example. Preventing this situation helps to make your programs easier to understand and contributes to modularity and maintainability.

It avoids the possibility of a user accidentally redefining a library function that is called by other library functions. If redefinition were allowed, those other functions would not work properly.

It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user. Some library facilities, such as those for dealing with variadic arguments (see Variadic Functions) and non-local exits (see Non-Local Exits), actually require a considerable amount of cooperation on the part of the C compiler, and with respect to the implementation, it might be easier for the compiler to treat these as built-in parts of the language.

That's almost certainly because you are passing in a destination that is a string literal:

char *s = strcpy("a", "b");

On top of that, the compiler knows it can expand strcpy inline, so your function never gets called.

You are trying to copy "b" over the string literal "a", and that won't work.

Make a char a[2]; and call strcpy(a, "b"); and it will run. It probably won't call your strcpy function, though, because the compiler inlines small strcpy calls even if you don't have optimisation enabled.
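As a minimal sketch of the writable-destination fix (the helper name first_char_after_copy is made up for illustration, and this uses the ordinary library strcpy rather than a redefined one):

```c
#include <string.h>

/* Hypothetical helper: copies "b" into a writable two-byte array and
   returns the first character that ends up in it. */
char first_char_after_copy(void)
{
    char a[2];          /* writable storage: one character plus '\0' */

    strcpy(a, "b");     /* fine: the destination is an array, not a literal */
    return a[0];
}
```

The array lives in writable memory (here, on the stack), whereas a string literal such as "a" is typically placed in a read-only section, which is why writing to it crashes.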

Putting the matter of trying to modify non-modifiable memory aside, keep in mind that you are formally not allowed to redefine standard library functions.

However, in some implementations you might notice that providing another definition for a standard library function does not trigger the usual "multiple definition" error. This happens because in such implementations standard library functions are defined as so-called "weak symbols". For example, the GCC standard library is known for that.

The direct consequence is that when you define your own "version" of a standard library function with external linkage, your definition overrides the "weak" standard definition for the entire program. You will notice that not only does your code now call your version of the function, but all calls from all pre-compiled [third-party] libraries are also dispatched to your definition. It is intended as a feature, but you have to be aware of it to avoid "using" this feature inadvertently.

You can read about it here, for one example: How to replace C standard library function?
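To illustrate the weak-symbol mechanism itself without touching a reserved name, here is a small sketch that relies on the GCC/Clang `__attribute__((weak))` extension. The names log_prefix and prefix_is_default are hypothetical, and this only shows the general idea, not how any particular C library is built:

```c
#include <string.h>

/* GCC/Clang extension: a weak definition acts as a default. If another
   object file in the link provides an ordinary (strong) definition of
   log_prefix, the linker uses that one instead, without reporting a
   "multiple definition" error. */
__attribute__((weak)) const char *log_prefix(void)
{
    return "default";
}

/* Reports whether the default (weak) definition is the one in use. */
int prefix_is_default(void)
{
    return strcmp(log_prefix(), "default") == 0;
}
```

Built on its own, prefix_is_default() returns 1. If you link in a second file that defines a non-weak log_prefix returning some other string, the same call should return 0.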

This feature of the implementation doesn't violate the language specification, since it operates within the uncharted area of undefined behavior, which is not governed by any standard requirements.

Of course, calls that use an intrinsic/inline implementation of some standard library function will not be affected by the redefinition.
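If you want to check which definition actually gets called, you have to stop the compiler from expanding the call inline. With GCC, the `-fno-builtin` option (or `-fno-builtin-strcpy`) is one way. Another, sketched below under the assumption that the compiler treats a volatile function pointer as opaque, is to call through such a pointer. The helper name copy_via_real_call is made up for illustration:

```c
#include <string.h>

/* Hypothetical helper: copies src into dst through a volatile function
   pointer. Because the pointer's value has to be loaded at run time,
   the compiler should emit a genuine indirect call instead of
   expanding strcpy inline. */
char *copy_via_real_call(char *dst, const char *src)
{
    char *(*volatile copy_fn)(char *, const char *) = strcpy;

    return copy_fn(dst, src);
}
```

Whatever definition of strcpy the linker resolved, the library's or an overriding one, is the one that runs here.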

Your question is misleading.

The problem that you see has nothing to do with the re-implementation of a library function.

You are just trying to write to non-writable memory, that is, the memory where the string literal "a" lives.

To put it simply, the following program gives a segmentation fault on my machine (compiled with gcc 4.7.3, no flags):

#include <string.h>

int main(int argc, const char *argv[])
{
    strcpy("a", "b");
    return 0;
}

But then, why the segmentation fault if you are calling a version of strcpy (yours) that doesn't write to the non-writable memory? Simply because your function is not being called.

If you compile your code with the -S flag and have a look at the assembly code that the compiler generates for it, there will be no call to strcpy: the compiler has "inlined" that call, and the only relevant call you can see from main is a call to puts.

.file   "test.c"
    .section    .rodata
.LC0:
    .string "a"
    .align 8
.LC1:
    .string "\nThe function ran successfully"
    .text
    .globl  main
    .type   main, @function
main:
.LFB2:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movw    $98, .LC0(%rip)
    movq    $.LC0, -8(%rbp)
    movl    $.LC1, %edi
    call    puts
    movl    $0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE2:
    .size   main, .-main
    .section    .rodata
.LC2:
    .string "in duplicate function strcpy"
    .text
    .globl  strcpy
    .type   strcpy, @function
strcpy:
.LFB3:
    .cfi_startproc
    pushq   %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset 6, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register 6
    subq    $16, %rsp
    movq    %rdi, -8(%rbp)
    movq    %rsi, -16(%rbp)
    movl    $.LC2, %edi
    movl    $0, %eax
    call    printf
    movl    $.LC0, %eax
    leave
    .cfi_def_cfa 7, 8
    ret
    .cfi_endproc
.LFE3:
    .size   strcpy, .-strcpy
    .ident  "GCC: (Ubuntu/Linaro 4.7.3-1ubuntu1) 4.7.3"

I think Yu Hao's answer has a great explanation for this, in the quote from the glibc manual:

The names of all library types, macros, variables and functions that come from the ISO C standard are reserved unconditionally; your program may not redefine these names. All other library names are reserved if your program explicitly includes the header file that defines or declares them. There are several reasons for these restrictions:

[...]

It allows the compiler to do whatever special optimizations it pleases on calls to these functions, without the possibility that they may have been redefined by the user.

Your example can work this way (with strdup):

#include <stdio.h>
#include <string.h>   /* declares strcpy and, on POSIX systems, strdup */

char *strcpy(char *destination, const char *source);

int main(){
    /* strdup returns writable heap copies, so no string literal is modified */
    char *s = strcpy(strdup("a"), strdup("b"));
    printf("\nThe function ran successfully\n");
    return 0;
}

char *strcpy(char *destination, const char *source){
    printf("in duplicate function strcpy");
    return strdup("a");
}

Output:

  in duplicate function strcpy
  The function ran successfully

The way to interpret this rule is that you cannot have multiple definitions of a function end up in the final linked object (the executable). So, if all the objects included in the link contain only one definition of a function, then you are good. Keeping this in mind, consider the following scenarios.

  1. Let's say you redefine a function somefunction() that is defined in some library. Your function is in main.c (main.o), and in the library the function is in an object named someobject.o. Remember that in the final link, the linker only looks in the libraries for symbols that are still unresolved. Because somefunction() is already resolved from main.o, the linker does not even look for it in the libraries and does not pull in someobject.o. The final link has only one definition of the function, and things are fine.
  2. Now imagine that there is another symbol, anotherfunction(), defined in someobject.o, which you also happen to call. The linker will resolve anotherfunction() from someobject.o and pull that object in from the library, so it becomes part of the final link. Now you have two definitions of somefunction() in the final link, one from main.o and another from someobject.o, and the linker will throw an error.

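I use this one frequently:

/* Copies src, including its terminating '\0', into dest.
   dest must be large enough to hold the whole string. */
void my_strcpy(char *dest, const char *src)
{
    int i;

    i = 0;
    while (src[i])
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
}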

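You can also get a strncpy-style function by modifying just one line:

/* Copies at most n characters and always terminates dest, so dest must
   have room for n + 1 bytes. Unlike the standard strncpy, it does not
   pad the rest of dest with '\0'. */
void my_strncpy(char *dest, const char *src, int n)
{
    int i;

    i = 0;
    while (i < n && src[i])   /* check the bound before reading src[i] */
    {
        dest[i] = src[i];
        i++;
    }
    dest[i] = '\0';
}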
