Parsing an HTTP request line in C

Question

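This is the problem that will never end. The task is to parse a request line of indeterminate length in a web server, in C. I pulled the following off the web as an example to work with.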

GET /path/script.cgi?field1=value1&field2=value2 HTTP/1.1

I must extract the absolute path, /path/script.cgi, and the query, ?field1=value1&field2=value2. I'm told the following functions hold the key: strchr, strcpy, strncmp, strncpy, and/or strstr.

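Here's what has happened so far: I've learned that functions like strchr and strstr will let me skip ahead to certain points in the request line, but they never let me get rid of the trailing portions I don't want, no matter how I layer them.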

For example, here's some code that gets me close to isolating the query, but I can't eliminate the HTTP version.

#include <stdbool.h>
#include <stdio.h>
#include <string.h>

bool parse(const char* line)
{
    // request line w/o method
    const char* lineptr = strchr(line, '/');
    if (lineptr == NULL)
        return false;

    // request line w/ query and HTTP version
    const char* lineptr_1 = strchr(lineptr, '?');
    if (lineptr_1 == NULL)
        return false;

    // request line w/o query (only " HTTP/1.1" is left)
    const char* lineptr_2 = strchr(lineptr_1, ' ');
    if (lineptr_2 == NULL)
        return false;

    printf("%s\n", lineptr_2);
    return true;
}

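Needless to say, I have a similar problem trying to isolate the absolute path (I can ditch the method, but not the ? or anything after it). I also don't see how I can use the functions that require me to know a priori how many chars to copy from one location (usually an array) to another, because when this runs in real time I will have no idea in advance what the request line will look like. If someone sees what I'm missing and can point me in the right direction, I would be very grateful!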

Answer 1

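A more elegant solution: find the delimiters with strchr, then use the differences between the resulting pointers as the lengths to copy.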

#include <stdio.h>
#include <string.h>

int parse(const char* line)
{
    /* Find out where everything is; bail out if a delimiter is missing */
    const char *first_space = strchr(line, ' ');
    if (first_space == NULL)
        return -1;
    const char *start_of_path = first_space + 1;
    const char *start_of_query = strchr(start_of_path, '?');
    if (start_of_query == NULL)
        return -1;
    const char *end_of_query = strchr(start_of_query, ' ');
    if (end_of_query == NULL)
        return -1;

    /* Get the right amount of memory (+1 for the null terminator) */
    size_t path_len = (size_t)(start_of_query - start_of_path);
    size_t query_len = (size_t)(end_of_query - start_of_query);
    char path[path_len + 1];
    char query[query_len + 1];

    /* Copy the strings into our memory */
    strncpy(path, start_of_path, path_len);
    strncpy(query, start_of_query, query_len);

    /* Null terminators (because strncpy does not provide them here) */
    path[path_len] = '\0';
    query[query_len] = '\0';

    /* Print */
    printf("%s\n", query);
    printf("%s\n", path);
    return 0;
}

int main(void)
{
    parse("GET /path/script.cgi?field1=value1&field2=value2 HTTP/1.1");
    return 0;
}
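
The same pointer-difference idea can be sketched without variable-length arrays and with an optional query. This is only an illustration under stated assumptions: the helper name split_target, its buffer-capacity parameters, and its 0 / -1 return convention are hypothetical and not part of the answer above. It assumes the request line has the form "METHOD target VERSION" with single spaces.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): copies the path and the
   query of a request line into caller-supplied buffers.
   Returns 0 on success, -1 on malformed input or if a buffer is too small. */
int split_target(const char *line,
                 char *path, size_t path_cap,
                 char *query, size_t query_cap)
{
    const char *sp1 = strchr(line, ' ');           /* end of the method */
    if (sp1 == NULL)
        return -1;
    const char *target = sp1 + 1;
    const char *sp2 = strchr(target, ' ');         /* end of the request target */
    if (sp2 == NULL)
        return -1;

    /* Search for '?' only inside the target, never in the HTTP version */
    const char *q = memchr(target, '?', (size_t)(sp2 - target));
    const char *path_end = (q != NULL) ? q : sp2;

    size_t path_len = (size_t)(path_end - target);
    size_t query_len = (size_t)(sp2 - path_end);   /* 0 when there is no query */
    if (path_len >= path_cap || query_len >= query_cap)
        return -1;

    memcpy(path, target, path_len);
    path[path_len] = '\0';
    memcpy(query, path_end, query_len);
    query[query_len] = '\0';
    return 0;
}
```

Called with the example request line and two 256-byte buffers, this should leave /path/script.cgi in path and ?field1=value1&field2=value2 in query. A request without a '?' yields an empty query rather than a failure.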

Answer 2

A while back I wrote some functions in C that manually parse a c-string up to a delimiter, similar to getline in C++.

#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

// Trims all leading whitespace and collapses consecutive whitespace from the provided cstring into destination char*. WARNING: ensure size <= sizeof(destination)
void Trim(char* destination, char* source, int size)
{
    bool trim = true;
    int index = 0;
    int i;
    for (i = 0; i < size; ++i)
    {
        if (source[i] == '\n' || source[i] == '\0')
        {
            destination[index++] = '\0';
            break;
        }
        else if (source[i] != ' ' && source[i] != '\t')
        {
            destination[index++] = source[i];
            trim = false;
        }
        else if (trim)
            continue;
        else
        {
            if (index > 0 && destination[index - 1] != ' ')
                destination[index++] = ' ';
        }
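    }
    // Terminate the result if the loop ran out of input before reaching '\n' or '\0'
    if (i == size && size > 0)
        destination[index < size ? index : size - 1] = '\0';
}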

// Parses text up to the provided delimiter (or newline) into the destination char*. WARNING: ensure size <= sizeof(destination)
void ParseUpToSymbol(char* destination, char* source, int size, char delimiter)
{
    int index = 0;
    int i;
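    for (i = 0; i < size; ++i)
    {
        if (source[i] != delimiter && source[i] != '\n' && source[i] != '\0' && source[i] != ' ')
        {
            destination[index++] = source[i];
        }
        else
        {
            destination[index] = '\0';
            break;
        }
    }
    // Guarantee termination if no delimiter was found within size characters
    if (i == size && size > 0)
        destination[size - 1] = '\0';

    Trim(destination, destination, size);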
}

You can then parse your c-string with lines like these:

char* buffer = (char*)malloc(64);
char* temp = (char*)malloc(256);
strcpy(temp, "GET /path/script.cgi?field1=value1&field2=value2 HTTP/1.1");
Trim(temp, temp, 256);
ParseUpToSymbol(buffer, temp, 64, '?');
temp = temp + strlen(buffer) + 1;  // keep a copy of the original pointer if you need to free() it later
Trim(temp, temp, 256);

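The code above trims any leading and trailing whitespace from the target string, in this case "GET /path/script.cgi?field1=value1&field2=value2 HTTP/1.1", and then stores the parsed value in the variable buffer. The first run should put the word "GET" in buffer, because ParseUpToSymbol also stops at a space. When you do "temp = temp + strlen(buffer) + 1", you advance the temp char pointer past the token and its delimiter so that you can call ParseUpToSymbol again on the remainder of the string. If you call it again, you should get the absolute path leading up to the first question mark. You can repeat this to get each individual query field, or change the delimiter to a space and get the entire query string portion of the URL. I think you get the idea. Of course, this is just one of many solutions.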
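
To illustrate the "repeat this to get each individual query field" step, here is a minimal sketch that walks a query string with strchr and strncmp. The helper name query_field, its parameters, and its return convention are hypothetical and chosen only for this example. It assumes fields are separated by '&' and does no percent-decoding.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the answer above): looks up `name` in a query
   such as "?a=1&b=2" and copies its value into `value`.
   Returns 0 if the field was found and fits, -1 otherwise. */
int query_field(const char *query, const char *name, char *value, size_t cap)
{
    size_t name_len = strlen(name);
    const char *p = query;

    if (*p == '?')
        ++p;                                    /* skip the leading '?' */

    while (*p != '\0') {
        const char *amp = strchr(p, '&');       /* end of this pair, if any */
        const char *end = (amp != NULL) ? amp : p + strlen(p);

        /* A match is `name` immediately followed by '=' */
        if ((size_t)(end - p) > name_len &&
            strncmp(p, name, name_len) == 0 && p[name_len] == '=') {
            const char *val = p + name_len + 1;
            size_t val_len = (size_t)(end - val);
            if (val_len >= cap)
                return -1;                      /* value does not fit */
            memcpy(value, val, val_len);
            value[val_len] = '\0';
            return 0;
        }
        if (amp == NULL)
            break;                              /* that was the last pair */
        p = amp + 1;                            /* move on to the next pair */
    }
    return -1;
}
```

With the example query ?field1=value1&field2=value2, looking up field2 should copy value2 into the output buffer. Checking for the '=' right after the name keeps a lookup of field from matching field1.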

Answer 1: score 7, accepted, 2016-12-22 15:44:52
Answer 2: score 2, 2016-12-22 15:40:06
