How do I make a function return a pointer to a new string in C?

I'm reading K&R and I'm almost through the chapter on pointers. I'm not entirely sure I'm using them the right way, so I decided to try implementing itoa(n) using pointers. Is there something glaringly wrong with the way I went about it? I don't particularly like that I needed to set aside a large array as a string buffer in order to do anything, but then again, I'm not sure whether that's actually the correct way to go about it in C.

Are there any general guidelines you like to follow when deciding to use pointers in your code? Is there anything I can improve in the code below? Is there a way I can work with strings without a static string buffer?

/*Source file: String Functions*/
#include <stdio.h>

static char stringBuffer[500];
static char *strPtr = stringBuffer;

/* Algorithm: n % 10^(n+1) / 10^(n) */
char *intToString(int n){
    int p = 1;
    int i = 0;

    while(n/p != 0)
        p*=10, i++;

    for(;p != 1; p/=10)
       *(strPtr++) = ((n % p)/(p/10)) + '0';  
    *strPtr++ = '\0';

    return strPtr - i - 1;
}

int main(){
    char *s[3] = {intToString(123), intToString(456), intToString(78910)};
    printf("%s\n",s[2]);
    int x = stringToInteger(s[2]);

    printf("%d\n", x);

    return 0;
}

Lastly, can someone clarify for me what the difference between an array and a pointer is? There's a section in K&R that has me very confused about it: "5.5 - Character Pointers and Functions." I'll quote it here:

"There is an important difference between the definitions:

char amessage[] = "now is the time"; /* an array */
char *pmessage = "now is the time"; /* a pointer */

amessage is an array, just big enough to hold the sequence of characters and '\0' that initializes it. Individual characters within the array may be changed but amessage will always refer to the same storage. On the other hand, pmessage is a pointer, initialized to point to a string constant; the pointer may subsequently be modified to point elsewhere, but the result is undefined if you try to modify the string contents."

What does that even mean?

Regarding your last question:

char amessage[] = "now is the time"; - is an array. An array cannot be reassigned to refer to something else (unlike a pointer); it always names the same fixed block of memory. If the array is defined inside a block, its storage is released at the end of that block (meaning you cannot return such an array from a function). You can, however, fiddle with the data inside the array as much as you like, so long as you don't exceed the size of the array.

E.g. this is legal: amessage[0] = 'N';

char *pmessage = "now is the time"; - is a pointer. A pointer holds the address of a block in memory, nothing more. "now is the time" is a string literal, which is typically stored inside the executable in a read-only location. You must not modify the data it points to: doing so is undefined behavior. You can, however, reassign the pointer to point to something else.

This is NOT legal: *pmessage = 'N'; It will most likely segfault (note that you can use the array syntax with pointers; *pmessage is equivalent to pmessage[0]).
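As a rough illustration of the distinction described above, the sketch below puts the legal operations into three small functions. The function names are made up for this example, and declaring pmessage as const char * is an addition of mine so that the compiler rejects the illegal write instead of leaving it to fail at run time.

```c
#include <stddef.h>

/* Changes a character of an array initialized from a literal. */
char modifyArray(void)
{
    char amessage[] = "now is the time"; /* the array holds its own writable copy */
    amessage[0] = 'N';                   /* legal: this storage belongs to the array */
    return amessage[0];
}

/* sizeof sees the whole array: 15 characters plus the terminating '\0'. */
size_t arraySize(void)
{
    char amessage[] = "now is the time";
    return sizeof amessage;
}

/* Points a pointer at a different string literal. */
char repoint(void)
{
    const char *pmessage = "now is the time";
    pmessage = "later"; /* legal: only the pointer changes, not the literal */
    /* pmessage[0] = 'L'; is rejected by the compiler because of const;
       with a plain char * it would compile but be undefined behavior. */
    return pmessage[0];
}
```

Here modifyArray() returns 'N', arraySize() returns 16, and repoint() returns 'l'.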

If you compile it with gcc using the -S flag, you can actually see "now is the time" stored in the read-only data section of the generated assembly.

One other thing to point out is that arrays decay to pointers when passed as arguments to a function. The following two declarations are equivalent:

void foo(char arr[]);

and

void foo(char* arr);
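One way to convince yourself that the two forms are the same is to declare a function with one and define it with the other; the compiler accepts this only because both parameters have type char *. The function fill below is a hypothetical example, not something from the question.

```c
#include <stddef.h>

/* Declared with array syntax... */
void fill(char arr[], size_t n, char c);

/* ...and defined with pointer syntax. The two agree because the array
   parameter is adjusted to a pointer, so the length must be passed separately. */
void fill(char *arr, size_t n, char c)
{
    for (size_t i = 0; i < n; i++)
        arr[i] = c; /* writes through the pointer into the caller's array */
}
```

Calling fill(buf, 3, 'x') on a char buf[4] overwrites the first three characters of the caller's array, since only the address of its first element was passed.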

Regarding your code:

You are using a single static buffer for every call to intToString: this is bad because the string produced by the first call to it will be overwritten by the next.

Generally, functions that handle strings in C should either return a new buffer from malloc, or write into a buffer provided by the caller. Allocating a new buffer is less prone to problems caused by running out of buffer space.

You are also using a static pointer for the location to write into the buffer, and it never rewinds, so that's definitely a problem: after enough calls to this function, you will run off the end of the buffer and crash.

You already have an initial loop that calculates the number of digits. So you should then just make a new buffer that big using malloc, making sure to leave space for the \0, write into that, and return it.

Also, since i is not just a loop index, rename it to something more obvious, like length.

That is to say: get rid of the global variables, and instead, after computing length:

char *s, *result;

// compute length
s = result = malloc(length+1);
if (!s) return NULL; // out of memory

for(;p != 1; p/=10)
   *(s++) = ((n % p)/(p/10)) + '0';  
*s++ = '\0';
return result;

The caller is responsible for releasing the buffer with free when they're done with it.
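Putting the pieces together, a complete version might look like the sketch below. It is not the question's exact algorithm: as assumptions of mine, it counts digits by repeatedly dividing the value (so no power of ten can overflow), fills the buffer from the right, and also handles zero and negative numbers, which the original loop does not.

```c
#include <stdlib.h>

/* Returns a newly allocated decimal string for n, or NULL if malloc fails.
   The caller must free() the result. */
char *intToString(int n)
{
    /* Work on the magnitude as unsigned so that INT_MIN does not overflow. */
    unsigned int magnitude = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    size_t length = (n < 0) ? 2 : 1; /* one digit, plus '-' if negative */
    for (unsigned int m = magnitude; m >= 10; m /= 10)
        length++;

    char *result = malloc(length + 1); /* +1 for the terminating '\0' */
    if (!result)
        return NULL; /* out of memory */

    char *s = result + length;
    *s = '\0';
    do { /* write digits from least to most significant */
        *--s = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);
    if (n < 0)
        *--s = '-';

    return result;
}
```

Each call returns its own buffer, so results from earlier calls stay valid until the caller frees them.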

Two other things I'd really recommend while learning about pointers:

  • Compile with all warnings turned on (-Wall etc.), and if you get a warning or an error, try to understand what caused it; the diagnostics will have things to teach you about how you're using the language.

  • Run your program under Valgrind or some similar checker, which will make pointer bugs more obvious, rather than letting them cause silent corruption.

For itoa, the resulting string can't be longer than the decimal representation of INT_MAX plus a minus sign, so you'd be safe with a buffer of that length. The number of digits in a positive number is floor(log10(number)) + 1, so you'd need a buffer of size floor(log10(INT_MAX)) + 3, which leaves space for the minus sign and the terminating \0.
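Since log10 cannot be evaluated in a constant expression, a common substitute is a compile-time upper bound derived from the number of bits in an int. The sketch below assumes that approach: INT_STR_SIZE and formatInt are names invented here, and the bound (bits / 3 + 3) is a deliberate overestimate rather than the exact figure from the formula above. It uses the standard snprintf for the conversion.

```c
#include <limits.h>
#include <stdio.h>

/* Each decimal digit carries more than 3 bits, so bits / 3 covers the digits
   of any int once the extra 3 bytes (sign, '\0', rounding) are added. */
enum { INT_STR_SIZE = sizeof(int) * CHAR_BIT / 3 + 3 };

/* Writes n in decimal into out, which must have room for INT_STR_SIZE bytes.
   Returns the number of characters written, not counting the '\0'. */
int formatInt(int n, char *out)
{
    return snprintf(out, INT_STR_SIZE, "%d", n);
}
```

A caller can then declare char buf[INT_STR_SIZE]; on the stack and pass it in without worrying about which int value is being formatted.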

Also, it's generally not good practice to return pointers to 'black box' buffers from functions. Your best bet here would be to pass a buffer as a pointer argument to intToString, so that you can easily use any type of memory you like (dynamically allocated, allocated on the stack, etc.). Here's an example:

char *intToString(int n, char *buffer) {
    // ...        
    char *bufferStart = buffer;
    for(;p != 1; p/=10)
      *(buffer++) = ((n % p)/(p/10)) + '0';  
    *buffer++ = '\0';
    return bufferStart;
}

Then you can use it as follows:

  char *buffer1 = malloc(30);
  char buffer2[15];

  intToString(10, buffer1); // providing pointer to heap allocated memory as a buffer
  intToString(20, &buffer2[0]); // providing pointer to a fixed-size array on the stack
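The version above trusts the caller to pass a buffer that is large enough. A possible refinement, sketched here under my own assumptions, is to pass the buffer size as well and refuse to write when the result would not fit. The name intToStringBuf, the size parameter, and the NULL-on-failure convention are all hypothetical and not part of the original answer.

```c
#include <stddef.h>

/* Writes n in decimal into buffer, which holds size bytes.
   Returns buffer on success, or NULL if the result would not fit. */
char *intToStringBuf(int n, char *buffer, size_t size)
{
    /* Work on the magnitude as unsigned so that INT_MIN does not overflow. */
    unsigned int magnitude = (n < 0) ? 0u - (unsigned int)n : (unsigned int)n;

    size_t length = (n < 0) ? 2 : 1; /* one digit, plus '-' if negative */
    for (unsigned int m = magnitude; m >= 10; m /= 10)
        length++;

    if (size < length + 1) /* +1 for the terminating '\0' */
        return NULL;

    char *s = buffer + length;
    *s = '\0';
    do { /* write digits from least to most significant */
        *--s = (char)('0' + magnitude % 10);
        magnitude /= 10;
    } while (magnitude != 0);
    if (n < 0)
        *--s = '-';

    return buffer;
}
```

With an array, the caller can pass sizeof on it directly, for example intToStringBuf(20, buffer2, sizeof buffer2); for a malloc'ed buffer the caller passes the size it requested.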

what the difference between an array and a pointer is?

The answer is in your quote: a pointer can be modified to point to another memory address. Compare:

int a[] = {1, 2, 3};
int b[] = {4, 5, 6};
int *ptrA = &a[0]; // the ptrA now contains pointer to a's first element
ptrA = &b[0];      // now it's b's first element
a = b;             // it won't compile

Also, an array's storage is fixed at the point where it is defined, while a pointer can refer to memory obtained through any allocation mechanism.

For how to use pointers and the difference between arrays and pointers, I recommend reading "Expert C Programming" ( http://www.amazon.com/Expert-Programming-Peter-van-Linden/dp/0131774298/ref=sr_1_1?ie=UTF8&qid=1371439251&sr=8-1&keywords=expert+c+programming ).

A better way to return strings from functions is to allocate dynamic memory (using malloc), fill it with the required string, and return the pointer to the calling function, which frees it when it is done.

Sample code:

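#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_NAME_SIZE 20

char *func1(void)
{
    /* Allocate MAX_NAME_SIZE bytes; sizeof(MAX_NAME_SIZE) would only be sizeof(int). */
    char *c1 = malloc(MAX_NAME_SIZE);
    if (c1 == NULL)
        return NULL; /* out of memory */
    strcpy(c1, "John");
    return c1;
}

int main(void)
{
    char *c2 = func1();
    if (c2 == NULL)
        return 1;
    printf("%s \n", c2);
    free(c2); /* the caller owns the buffer and must release it */
    return 0;
}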

And this works without a static string buffer.
