Tokenizing a string to pass as char * to execve()

Question

My knowledge of C is very limited. I'm trying to tokenize a string passed to a server from a client, because I want to use the passed arguments with execve. The arguments passed via buffer need to be copied to *argv and tokenized so that buffer's tokens can be accessed as argv[0], argv[1], etc. Obviously I'm doing something incorrectly.

n = read(sockfd, buffer, sizeof(buffer));
strcpy(*argv, buffer);
printf("buffer:%s\n", buffer);
printf("argv:%s\n", *argv);
printf("argv[0]:%s\n", argv[0]);
printf("argv[1]:%s\n", argv[1]);
*argv = strtok_r(*argv, " ", argv);
printf("argv:%s\n", *argv);

i = fork();
if (i < 0) {
    //Close socket on fork error.
    perror("fork");
    exit(-1);
} else if (i == 0) {
    //execve on input args
    execve(argv[0], &argv[0], 0);
    exit(0);
} else {
    wait(&status);
    //close(sockfd);
}

Passing the arguments "/bin/date -u" with the above code gives this output:

buffer:/bin/date -u

argv:/bin/date -u

argv[0]:/bin/date -u

argv[1]:(null)

What I want is this output:

buffer:/bin/date -u

argv:/bin/date -u

argv[0]:/bin/date

argv[1]:-u

I tried using strtok_r(), but it didn't work as I intended. The snippet I inserted was:

*argv = strtok_r(*argv, " ", argv);
printf("argv:%s\n", *argv);

which gives an output of argv:/bin/date.

Thanks in advance, SO.

Edit: I don't have to explicitly tokenize buffer like I have above. Any way of getting the arguments from the client to the server works fine.

Answer 1

Well, there are several issues you are dealing with. The first is the choice of argv as the variable you are writing buffer to. While it is just an array of pointers, argv is generally regarded as the array holding the arguments passed to the current process, not as a variable to modify. However, that is really a matter of convention; there is no prohibition against doing it that I know of. What you cannot do is tokenize *argv while at the same time assigning the tokens to *argv, because strtok_r modifies *argv during the process.

Beyond that, the real issue appears to be the use of strtok_r. Take a look at man strtok_r. To tokenize a string, you need to make repeated calls to strtok_r in order to extract all tokens. The first call, which passes the string as the first argument (*argv here), merely extracts the first token. To complete the extraction, you must pass NULL as the first argument on each later call until strtok_r returns NULL, meaning all tokens have been extracted. Additionally, the string you are extracting tokens from is modified by the calls to strtok_r and should not be used as the original string afterwards. Generally a copy of the string is made to preserve the original if it will be needed later.
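To illustrate the point about preserving the original, here is a minimal sketch that tokenizes a private copy and leaves the input untouched. The helper name count_tokens is hypothetical and not part of the question's code; strdup and strtok_r are the standard POSIX functions.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r and strdup */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the space-separated tokens in `input`
 * without modifying it. Returns -1 if the copy cannot be allocated. */
int count_tokens(const char *input)
{
    assert(input != NULL);

    /* strtok_r overwrites separators with '\0', so work on a copy. */
    char *copy = strdup(input);
    if (copy == NULL)
        return -1;

    char *saveptr = NULL;  /* strtok_r keeps its position here between calls */
    int count = 0;

    /* First call passes the string; every later call passes NULL. */
    for (char *tok = strtok_r(copy, " ", &saveptr);
         tok != NULL;
         tok = strtok_r(NULL, " ", &saveptr)) {
        count++;
    }

    free(copy);
    return count;
}
```

Calling count_tokens on the question's input "/bin/date -u" should yield two tokens, and the caller's string is still intact afterwards because only the copy was split.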

In your code you call strtok_r only once, e.g.:

*argv = strtok_r(*argv, " ", argv);  // extracts the first token and modifies *argv

If your intent is to extract all strings, then you will need to make repeated calls to strtok_r, something like:

char *saveptr = NULL;  // strtok_r's own state; keep it separate from argv
char *token;           // points into the string being split, so no malloc is needed

token = strtok_r(*argv, " ", &saveptr);
if (token)
    printf (" token: %s\n", token);

while ((token = strtok_r (NULL, " ", &saveptr)) != NULL)
{
    printf (" token: %s\n", token);
}

You can capture the individual tokens however you like in order to pass them to execve. However, you are not going to be able to strip tokens out of argv while at the same time writing back to argv. As indicated above, argv is modified by strtok_r during extraction, so you will need a separate array to hold the tokens. Hope this helps.
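The separate array mentioned above could look something like the following sketch. The helper name split_args and the choice of separators (space, tab, carriage return, newline) are assumptions made for illustration; the newline is included because text read from a socket often ends with one.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `buf` in place and stores pointers to the
 * tokens in out[], followed by a terminating NULL as execve expects.
 * At most max_args - 1 tokens are stored. Returns the number stored. */
size_t split_args(char *buf, char *out[], size_t max_args)
{
    assert(buf != NULL && out != NULL && max_args > 0);

    char *saveptr = NULL;  /* strtok_r state, independent of out[] */
    size_t n = 0;

    for (char *tok = strtok_r(buf, " \t\r\n", &saveptr);
         tok != NULL && n + 1 < max_args;
         tok = strtok_r(NULL, " \t\r\n", &saveptr)) {
        out[n++] = tok;  /* tok points inside buf; nothing is copied */
    }

    out[n] = NULL;
    return n;
}
```

Because the stored pointers refer to memory inside buf, the buffer has to stay alive and unmodified for as long as the array is in use. The sketch also assumes buf is NUL-terminated, which read() does not do on its own.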

Answer 2

The strtok() and strtok_r() functions return one token at a time. They maintain state between calls, and you need to call them in a loop to split a string into tokens. They also modify the buffer passed as the first argument in place, so you need to copy it.

Let me show you an example:

#include <stdio.h>
#include <string.h>

#define MAX_CMD_SIZE 1024
#define MAX_ARG_COUNT 10

int main(void)
{
    const char *command = "/bin/test arg1 arg2 arg3 arg4 arg5";

    /* Allocate a buffer for tokenization.
     * The strtok_r() function modifies this buffer in place and returns pointers
     * to strings located inside this buffer. */
    char cmd_buf[MAX_CMD_SIZE] = { 0 };
    /* Copy at most size - 1 bytes so the zero-filled buffer stays NUL-terminated. */
    strncpy(cmd_buf, command, sizeof(cmd_buf) - 1);

    /* This strtok_r() call puts '\0' after the first token in the buffer,
     * It saves the state to the strtok_state and subsequent calls resume from that point. */
    char *strtok_state = NULL;
    char *filename = strtok_r(cmd_buf, " ", &strtok_state);
    printf("filename = %s\n", filename);

    /* Allocate an array of pointers.
     * We will make them point to certain locations inside the cmd_buf. */
    char *args[MAX_ARG_COUNT] = { NULL };

    /* loop the strtok_r() call while there are tokens and free space in the array */
    size_t current_arg_idx;
    for (current_arg_idx = 0; current_arg_idx < MAX_ARG_COUNT; ++current_arg_idx) {
        /* Note that the first argument to strtok_r() is NULL.
         * That means resume from a point saved in the strtok_state. */
        char *current_arg = strtok_r(NULL, " ", &strtok_state);
        if (current_arg == NULL) {
            break;
        }

        args[current_arg_idx] = current_arg;
        printf("args[%zu] = %s\n", current_arg_idx, args[current_arg_idx]);
    }
}

The output of the example above is:

filename = /bin/test
args[0] = arg1
args[1] = arg2
args[2] = arg3
args[3] = arg4
args[4] = arg5

Note that I put filename and args into separate variables to illustrate the difference between the first call and the subsequent calls. For execve() you normally want to put them into a single array and call it like execve(argv[0], argv, NULL); because the filename is supposed to be the first element in argv.
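Putting everything into a single array and handing it to execve() might look roughly like this. It is only a sketch under a few assumptions: the helper name run_command and the MAX_ARGS limit are made up for illustration, the command line is already NUL-terminated, and the current environment is passed through environ instead of NULL.

```c
#define _POSIX_C_SOURCE 200809L  /* exposes strtok_r */
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

#define MAX_ARGS 16  /* assumed limit for this sketch */

/* Hypothetical helper: tokenizes `cmdline` in place, runs it with execve
 * in a child process and returns the child's exit status, or -1 on error. */
int run_command(char *cmdline)
{
    assert(cmdline != NULL);

    char *argv[MAX_ARGS];
    char *saveptr = NULL;
    size_t argc = 0;

    /* argv[0] is the program path; the remaining tokens are its arguments. */
    for (char *tok = strtok_r(cmdline, " \t\r\n", &saveptr);
         tok != NULL && argc + 1 < MAX_ARGS;
         tok = strtok_r(NULL, " \t\r\n", &saveptr)) {
        argv[argc++] = tok;
    }
    argv[argc] = NULL;  /* execve requires a NULL-terminated array */
    if (argc == 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        perror("fork");
        return -1;
    }
    if (pid == 0) {
        execve(argv[0], argv, environ);
        perror("execve");  /* reached only if execve failed */
        _exit(127);
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        perror("waitpid");
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

In the server from the question, the buffer filled by read() would first need a '\0' written at index n (with room reserved for it) before being passed to a function like this. Note that execve() does not search PATH, so the first token must be a full path such as /bin/date.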

Answer 1: score 3, accepted, 2014-07-17 07:46:12

Answer 2: score 0, 2014-07-17 08:03:55
