
C++ Tokenize String

I'm looking for a simple way to tokenize string input without using non-default libraries such as Boost.

For example, if the user enters forty_five, I would like to separate forty and five using _ as the delimiter.

To convert a string to a vector of tokens (thread-safe):

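#include <cstddef>
#include <string>
#include <vector>

// Splits source at any character contained in delimiter.
// Empty tokens are dropped unless keepEmpty is true.
inline std::vector<std::string> StringSplit(const std::string &source, const char *delimiter = " ", bool keepEmpty = false)
{
    std::vector<std::string> results;

    std::size_t prev = 0;  // start of the current token
    std::size_t next = 0;  // position of the next delimiter

    while ((next = source.find_first_of(delimiter, prev)) != std::string::npos)
    {
        if (keepEmpty || (next - prev != 0))
        {
            results.push_back(source.substr(prev, next - prev));
        }
        prev = next + 1;
    }

    // Remaining text after the last delimiter; with keepEmpty this also
    // preserves an empty trailing token.
    if (keepEmpty || prev < source.size())
    {
        results.push_back(source.substr(prev));
    }

    return results;
}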

You can use the strtok_r function, but read the man pages carefully so that you understand how it maintains state.
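As a rough illustration of the strtok_r approach, here is a minimal sketch. It assumes a POSIX system (strtok_r is not part of standard C++), and the helper name SplitWithStrtokR and its default delimiter are made up for this example. The state that the man page talks about is the char* passed as the third argument: the caller owns it, which is what makes the function reentrant.

```cpp
#include <cassert>
#include <string.h>  // strtok_r (POSIX, not standard C++)
#include <string>
#include <vector>

// Hypothetical helper for illustration: splits source at any character
// contained in delimiters, using POSIX strtok_r.
std::vector<std::string> SplitWithStrtokR(const std::string &source, const char *delimiters = "_")
{
    assert(delimiters != nullptr);
    std::vector<std::string> tokens;

    // strtok_r overwrites delimiters with '\0', so work on a mutable copy.
    std::vector<char> buffer(source.begin(), source.end());
    buffer.push_back('\0');

    char *state = nullptr;  // caller-owned parsing state
    for (char *token = strtok_r(buffer.data(), delimiters, &state);
         token != nullptr;
         token = strtok_r(nullptr, delimiters, &state))  // nullptr means "continue"
    {
        tokens.emplace_back(token);
    }
    return tokens;
}
```

Calling SplitWithStrtokR("forty_five") should yield the two tokens forty and five. Note that strtok_r treats a run of consecutive delimiters as a single separator, so empty tokens are never returned.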

Look at this tutorial, which is the best tutorial on tokenization that I have found so far. It covers best practices for implementing several methods, including getline() and find_first_of() from the C++ standard library and strtok() from C.
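The getline() method mentioned above can be sketched roughly as follows. This is not taken from the tutorial; the helper name SplitWithGetline and its default delimiter are assumptions chosen to match the forty_five example. It only supports a single delimiter character.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper for illustration: splits source at each occurrence
// of one delimiter character, using std::getline on a string stream.
std::vector<std::string> SplitWithGetline(const std::string &source, char delimiter = '_')
{
    std::vector<std::string> tokens;
    std::istringstream stream(source);
    std::string token;

    // std::getline reads up to (and discards) the next delimiter.
    while (std::getline(stream, token, delimiter))
    {
        tokens.push_back(token);
    }
    return tokens;
}
```

Unlike strtok(), this keeps empty tokens between consecutive delimiters, although an empty token after a trailing delimiter is not produced.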
