
Why am I getting seg faults from using the istream iterator?

void parse_and_run_command(const std::string &command) {
    std::istringstream iss(command);
    std::istream_iterator<char*> begin(iss), end;
    std::vector<char*> tokens(begin, end); //place the arguments in a vector
    tokens.push_back(NULL); 

According to GDB, the segfault occurs after executing the second line, the one with the istream_iterator. It did not segfault earlier, when I was using string vectors.

std::istream_iterator<char*> extracts into a char* that points to no storage, so each read writes through an invalid pointer; that is the likely source of the segfault. You first need to create a std::vector of std::string, which will own the string data. You can then transform that vector into a std::vector of pointers. Note that the pointers are only valid for the lifetime of the std::vector of std::string:

#include <string>
#include <iostream>
#include <sstream>
#include <iterator>
#include <vector>
#include <algorithm>

void parse_and_run_command(const std::string &command) {
    std::istringstream iss(command);
    std::istream_iterator<std::string> begin(iss), end;
    std::vector<std::string> tokens(begin, end);
    std::vector<char*> ctokens;
    std::transform(tokens.begin(), tokens.end(), std::back_inserter(ctokens), [](std::string& s) { return s.data(); });
    ctokens.push_back(nullptr);
    for (char* s : ctokens) {
        if (s) {
            std::cout << s << "\n";
        }
        else {
            std::cout << "nullptr\n";
        }
    }
}

int main() {
    parse_and_run_command("test test2 test3");
}
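The lifetime caveat above can be made harder to violate by keeping the owning strings and the pointer array in one object. The following is only a sketch under that assumption; `ArgList` and its members are hypothetical names, not part of the original answer.

```cpp
#include <cassert>
#include <cstddef>
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: owns the tokens and exposes a NULL-terminated
// pointer array that stays valid for as long as the ArgList itself lives.
struct ArgList {
    std::vector<std::string> storage;  // owns the character data
    std::vector<char*> pointers;       // non-owning views into storage

    explicit ArgList(const std::string& command) {
        std::istringstream iss(command);
        storage.assign(std::istream_iterator<std::string>(iss),
                       std::istream_iterator<std::string>());
        pointers.reserve(storage.size() + 1);
        for (std::string& s : storage) {
            pointers.push_back(&s[0]);  // writable pointer into the string
        }
        pointers.push_back(nullptr);    // terminator expected by execv
        assert(pointers.size() == storage.size() + 1);
    }

    // Copying or moving could leave the pointers referring to another
    // object's buffers (small-string optimisation), so both are disabled.
    ArgList(const ArgList&) = delete;
    ArgList& operator=(const ArgList&) = delete;

    char* const* argv() { return pointers.data(); }
    std::size_t argc() const { return storage.size(); }
};
```

With this, `ArgList args(command);` followed by passing `args.argv()` to the exec call keeps the pointers and their backing strings in the same scope.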

First, you need to split the std::string command into a list of tokens of type std::vector<std::string>. Then, you may want to use std::transform to fill a second list of tokens of type std::vector<char const*>.

Here is some sample code:

void parse_and_run_command(std::string const& command) {
    std::istringstream iss(command);
    std::vector<std::string> results(std::istream_iterator<std::string>{iss},
                                     std::istream_iterator<std::string>());

    // debugging
    for (auto const& token : results) {
        std::cout << token << " ";
    }

    std::cout << std::endl;

    std::vector<const char*> pointer_results;
    pointer_results.resize(results.size(), nullptr);
    std::transform(
        std::begin(results), std::end(results),
        std::begin(pointer_results),
        [](std::string const& str) {
            return str.c_str();
        }
    );

    // debugging
    for (auto const& token : pointer_results) {
        std::cout << token << " ";
    }

    std::cout << std::endl;

    // execv expects NULL as last element
    pointer_results.push_back(nullptr);

    char **cmd = const_cast<char**>(pointer_results.data());
    execv(cmd[0], &cmd[0]);
}

Note the last part of the function: execv expects the last element of the argument array to be nullptr.
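Because a successful execv replaces the calling process, a shell-like program usually calls it in a child process. Here is a minimal POSIX-only sketch of that pattern. The helper name `run_tokens` and the exit code 127 for a failed exec are assumptions chosen for illustration, not something the answer above prescribes.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: runs tokens[0] with the given arguments in a child
// process and returns its exit status, or -1 on error.
int run_tokens(std::vector<std::string> tokens) {
    if (tokens.empty()) {
        return -1;  // nothing to execute
    }

    // Build the NULL-terminated pointer array from the owned strings.
    std::vector<char*> argv;
    argv.reserve(tokens.size() + 1);
    for (std::string& t : tokens) {
        argv.push_back(&t[0]);
    }
    argv.push_back(nullptr);
    assert(argv.back() == nullptr);

    const pid_t pid = fork();
    if (pid < 0) {
        return -1;  // fork failed
    }
    if (pid == 0) {
        execv(argv[0], argv.data());
        _exit(127);  // only reached if execv itself failed
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Since the tokens are taken by value and are not const, no const_cast is needed to obtain the `char*` pointers.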

Hm, very interesting. Sounds like an easy task, but there are several caveats.

First of all, we need to consider that there are at least 2 different implementations of execv.

One is for POSIX / Linux (see here) and the other is the Windows version (see here and here).

Please note the different function signatures:

Linux / POSIX:   int execv(const char *path, char *const argv[]);
Windows:         intptr_t _execv(const char *cmdname, const char *const *argv);

In this case I find the Windows version a little cleaner, because its argv parameter is of type const char *const *. Anyway, the major problem is that we have to call legacy code.
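To illustrate the difference between the two parameter types, here is a small sketch. `posix_style` and `windows_style` are hypothetical stand-ins that only mirror the argv parameter types quoted above and count the entries. The point is that a `char**` converts implicitly to both types, whereas a `const char**` would need a const_cast for the POSIX one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in with the POSIX-style parameter type.
std::size_t posix_style(const char* /*path*/, char* const argv[]) {
    std::size_t n = 0;
    while (argv[n] != nullptr) ++n;
    return n;
}

// Hypothetical stand-in with the Windows-style parameter type.
std::size_t windows_style(const char* /*path*/, const char* const* argv) {
    std::size_t n = 0;
    while (argv[n] != nullptr) ++n;
    return n;
}

// Builds a NULL-terminated array of writable pointers into `tokens`.
// The result is only valid while `tokens` is alive and unmodified.
std::vector<char*> to_argv(std::vector<std::string>& tokens) {
    std::vector<char*> argv;
    argv.reserve(tokens.size() + 1);
    for (std::string& t : tokens) {
        argv.push_back(&t[0]);
    }
    argv.push_back(nullptr);
    assert(argv.size() == tokens.size() + 1);
    return argv;
}
```

Both `posix_style(argv[0], argv.data())` and `windows_style(argv[0], argv.data())` compile without a cast when `argv` is the vector returned by `to_argv`.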

OK, let's see.

The execv function requires a NULL-terminated array of char pointers holding the arguments for the function call. This is what we need to create.

We start with a std::string containing the command. It needs to be split into parts. There are several ways to do that, and I have added different examples.

The simplest way is probably to put the std::string into a std::istringstream and then use std::istream_iterator to split it into parts. This is the typical short sequence:

// Put this into istringstream 
std::istringstream iss(command);
// Split
std::vector parts(std::istream_iterator<std::string>(iss), {});

We use the range constructor of std::vector. We can also define the std::vector without a template argument, because the compiler can deduce it from the given constructor arguments. This feature, available since C++17, is called CTAD ("class template argument deduction").

Additionally, you can see that I do not use the "end()" iterator explicitly.

This iterator is constructed from the empty brace-enclosed initializer with the correct type: the std::vector range constructor requires both arguments to have the same type, so the second is deduced to match the first.
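As a quick check of the deduction described above, the following sketch wraps the split in a function and verifies the deduced type at compile time. The function name `split_whitespace` is hypothetical, and C++17 or later is assumed.

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical helper: splits a command on whitespace.
std::vector<std::string> split_whitespace(const std::string& command) {
    std::istringstream iss(command);

    // CTAD deduces std::vector<std::string> from the first iterator;
    // the empty braces become a default-constructed (end) iterator.
    std::vector parts(std::istream_iterator<std::string>(iss), {});

    static_assert(std::is_same_v<decltype(parts), std::vector<std::string>>,
                  "CTAD should deduce a vector of strings");
    return parts;
}
```

Because operator>> skips whitespace, leading, trailing, and repeated blanks do not produce empty tokens with this approach.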

We can avoid std::istringstream and convert the string into tokens directly with std::sregex_token_iterator. It is very simple to use, and the result is a one-liner for splitting the original command string:

// Split
std::vector<std::string> parts(std::sregex_token_iterator(command.begin(), command.end(), re, -1), {});
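One caveat with a single-space pattern is that repeated or leading spaces yield empty tokens. The variant below is a sketch that assumes you want whitespace runs treated as one separator and empty tokens dropped. The name `split_regex` and the `\s+` pattern are illustrative choices, not part of the original answer.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits on runs of whitespace and skips empty tokens.
std::vector<std::string> split_regex(const std::string& command) {
    static const std::regex separators{R"(\s+)"};

    std::vector<std::string> parts;
    // Submatch index -1 selects the text between separator matches.
    std::sregex_token_iterator it(command.begin(), command.end(), separators, -1);
    const std::sregex_token_iterator end;
    for (; it != end; ++it) {
        if (it->length() > 0) {  // drop the empty token before a leading separator
            parts.push_back(it->str());
        }
    }
    assert(parts.size() <= command.size());
    return parts;
}
```

The explicit loop is longer than the one-liner, but it makes the filtering of empty tokens visible.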

All of this boils down to 6 lines of code, including the definition of the variable and the invocation of the execv function. Please see:

#include <iostream>
#include <string>
#include <sstream>
#include <vector>
#include <iterator>
#include <memory>
#include <algorithm>
#include <regex>

const std::regex re{ " " };

// Define dummy function for _execv (Windows style, everything const)
// Note: Type of argv decays to " const char* const* "
int _execv(const char* path, const char* const argv[]) {
    std::cout << "\n\nPath: " << path << "\n\nArguments:\n\n";
    while (*argv != 0) std::cout << *argv++ << "\n";
    return 0;
}

// Define dummy function for execv (POSIX style)
// Note: Type of argv decays to " char* const* "
int execv(const char* path, char* const argv[]) {
    std::cout << "\n\nPath: " << path << "\n\nArguments:\n\n";
    while (*argv != 0) std::cout << *argv++ << "\n";
    return 0;
}


int main() {
    {
        // ----------------------------------------------------------------------
        // Solution 1
        // Initial example
        char path[] = "path";
        const char* const argv[] = { "arg1", "arg2", "arg3", 0 };
        _execv(path, argv);
    }


    {
        // ----------------------------------------------------------------------
        // Solution 2
        // Now convert a command string to a handmade argv array
        std::string command{ "path arg1 arg2 arg3" };

        // Put this into istringstream 
        std::istringstream iss(command);

        // Split into substrings
        std::vector parts(std::istream_iterator<std::string>(iss), {});

        // create "argv" List. argv is of type " const char* "
        std::unique_ptr<const char*[]> argv = std::make_unique<const char*[]>(parts.size());

        // Fill argv array
        size_t i = 1U;
        for (; i < parts.size(); ++i) {
            argv[i - 1] = parts[i].c_str();
        }
        argv[i - 1] = static_cast<char*>(0);

        // Call execv
        // Windows
        _execv(parts[0].c_str(), argv.get());

        // Linux / Posix
        execv(parts[0].c_str(), const_cast<char* const*>(argv.get()));
    }

    {
        // ----------------------------------------------------------------------
        // Solution 3
        // Transform string vector to vector of char*
        std::string command{ "path arg1 arg2 arg3" };

        // Put this into istringstream 
        std::istringstream iss(command);
        // Split
        std::vector parts(std::istream_iterator<std::string>(iss), {});

        // Fill argv
        std::vector<const char*> argv{};
        std::transform(parts.begin(), parts.end(), std::back_inserter(argv), [](const std::string& s) { return s.c_str(); });
        argv.push_back(static_cast<const char*>(0));

        // Call execv
        // Windows
        _execv(argv[0], &argv[1]);

        // Linux / Posix
        execv(argv[0], const_cast<char* const*>(&argv[1]));
    }
    {
        // ----------------------------------------------------------------------
        // Solution 4
        // Transform string vector to vector of char*. Get rid of istringstream
        std::string command{ "path arg1 arg2 arg3" };

        // Split
        std::vector<std::string> parts(std::sregex_token_iterator(command.begin(), command.end(), re, -1), {});

        // Fill argv
        std::vector<const char*> argv{};
        std::transform(parts.begin(), parts.end(), std::back_inserter(argv), [](const std::string& s) { return s.c_str(); });
        argv.push_back(static_cast<const char*>(0));

        // Call execv
        // Windows
        _execv(argv[0], &argv[1]);

        // Linux / Posix
        execv(argv[0], const_cast<char* const*>(&argv[1]));
    }

    return 0;
}


 