
Converting Upper Case to Lower Case using pointers in C

I've been trying to change upper-case letters to lower-case letters using pointers, but I keep getting segmentation faults. Here is my source code:

#include <stdlib.h>
#include <string.h>
char *changeL(char *s);
char *changeL(char *s)
{
    char *upper = s;

    for (int i = 0; upper[i] != '\0'; i++)
    {
       if (upper[i] >= 'A' && upper[i] <= 'Z')
        {
           upper[i] += 32;
        }
     }
   printf("%s\n", upper);
   return upper;
}



int main()
{
    char *first;
    char *second;
    first = "HELLO My Name is LoL";
    printf("%s\n", first);
    second = changeL(first);
    printf("There is no error here\n\n");
    printf("%s\n", second);



    return 0;
 }

Using gdb I found the seg fault to be in "upper[i] += 32;". I don't understand why the seg fault is there.

"HELLO My Name is LoL" is a string literal stored in constant memory. You can't change it. However, you pass a pointer to this memory (first) to a function which tries to change it, so you get a segmentation fault. You should copy the string into a memory buffer, like this:

char buffer[] = "HELLO My Name is LoL";

and then pass buffer to changeL.

A couple of notes in addition to what @Alex correctly points out in his answer. First:

char *changeL(char *s);
char *changeL(char *s)
{
   ....
}

There is no need for a prototype before the function if the function is defined one line below. A prototype is used to inform the code below it that the function described by the prototype exists and is defined elsewhere. If you define the function immediately below the prototype, the prototype becomes redundant.

Second, as noted in Alex's answer, on an overwhelming majority of systems a string literal, e.g. the "Something Here" in char *s = "Something Here";, is immutable and resides in read-only memory, and any attempt to modify the string literal generally results in a SegFault.

Instead, you need to create an array of characters which can be modified, e.g.

char first[] = "HELLO My Name is LoL";

or, with C99+, you can use a compound literal to initialize first as a pointer to an array of char, e.g.

char *first = (char[]){ "HELLO My Name is LoL" };

In both cases above, the characters pointed to by first will be modifiable.

Addition Per Comment

"can you also explain to him why is he getting segfault at upper[i] += 32;"

Yes. As mentioned above, on virtually every current system, initializing a pointer to a string literal leaves the pointer aimed at memory that cannot be modified (ancient systems had no distinction or protection for read-only memory; all memory was writable). In the current day, creating a string literal (e.g. "foo") creates the string in memory which cannot be modified. (For ELF executables, that is generally the .rodata section of the executable; read ".rodata" as "read-only data".)

When any attempt is made to change data that cannot be modified, a Segmentation Fault generally results because you have attempted to write to an address within a segment that is read-only (hence the name Segmentation Fault, or SegFault).

In the code above, as originally written with

first = "HELLO My Name is LoL";

if you compile to assembly (on Linux, e.g. gcc -S -masm=intel -o mysaved.asm myfile.c), you will see that the string "HELLO My Name is LoL" is in fact created in the .rodata section. You do not have any ability to change that data, and you now know what happens when you try :)

The code as written in the question also shows confusion about what the pointers first and second actually point to. Assigning the return of changeL to second creates no new memory for second. It is no different from simply assigning second = first; in main(). second is just a separate pointer that points to the same memory referenced by first. A more concise version of the code would be:

#include <stdio.h>

void changeL (char *s)
{
    for (int i = 0; s[i]; i++)
        if (s[i] >= 'A' && s[i] <= 'Z')
            s[i] += 32;
}

int main (void)
{
    char first[] = "HELLO My Name is LoL";
    char *second = first;

    printf("%s\n", first);
    changeL(first);
    printf("%s\n", second);

    return 0;
}

(Note: both header files in the original code are unnecessary; <stdio.h> is the only required header.)

To illustrate that second simply points to first:

Example Use/Output

$./bin/chars
HELLO My Name is LoL
hello my name is lol

This code reads and outputs only the leading lower-case letters of the input string:

#include <stdio.h>

int main(void)
{
    char b[50];

    printf("String=");

    /* %49[a-z] stores at most 49 lower-case letters plus the terminating '\0' */
    if (scanf("%49[a-z]", b) == 1)
        printf("%s", b);

    return 0;
}
