How to parse a string in C++

Question

I want to parse strings to check whether they follow a specified syntax.

Example:

Str = Z344-R565l t

My requirement is that Z is followed by a number, then a -, then R followed by a number, then l, then a space, and finally t.

Anything else should be an error.

I have to parse many different kinds of syntax like this, and it would be awkward to write a separate function for each one. I heard that yacc or lex can solve this problem.

Can anyone shed some light on my problem?

Answer 1

You can do this with a regex.

Z344-R565l t

Your regex should look something like the one below. I'm not sure which regex library to use for C++, but this is the general pattern to check that your string matches.

Z[0-9]+-R[0-9]+l t
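As a minimal sketch of how that pattern could be applied in C++, the following uses std::regex from the C++11 standard library (an assumption; the answer does not name a library). The helper name matchesSyntax is hypothetical.

```cpp
#include <regex>
#include <string>

// Returns true only if the whole string has the form
// Z<digits>-R<digits>l<space>t, e.g. "Z344-R565l t".
// matchesSyntax is a hypothetical helper name, not from the original answer.
bool matchesSyntax(const std::string& input) {
    // static: compile the pattern once rather than on every call.
    static const std::regex pattern("Z[0-9]+-R[0-9]+l t");
    // regex_match requires the entire input to match, so leading or
    // trailing junk is rejected (regex_search would accept it).
    return std::regex_match(input, pattern);
}
```

Calling matchesSyntax("Z344-R565l t") should accept the example string, while a string with no digits after Z, or with extra characters at either end, should be rejected.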

Answer 2

Use boost::regex

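#include <string>
#include <boost/regex.hpp>

// Returns true if the entire input has the form Z<digits>-R<digits>l t.
bool isMatch(const std::string& input){
    // + (not *) requires at least one digit after Z and after R.
    const boost::regex r("Z[0-9]+-R[0-9]+l t");
    // regex_match checks the whole string; regex_search would also accept substrings.
    return boost::regex_match(input, r);
}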

Another option is to supply the list of regular expressions in a file, one expression per line. Create a vector of boost::regex objects from the file and, for each string you need to validate, iterate through the vector of patterns. It's not very efficient, but it will work.
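The pattern-file idea could be sketched roughly as follows. This is only an illustration: it uses std::regex rather than boost::regex so that it builds without Boost, it reads from any std::istream (a std::ifstream opened on the pattern file would do), and the names loadPatterns and matchesAny are hypothetical.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Reads one regular expression per line; blank lines are skipped.
// An invalid expression makes the std::regex constructor throw std::regex_error.
std::vector<std::regex> loadPatterns(std::istream& in) {
    std::vector<std::regex> patterns;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            patterns.emplace_back(line);
        }
    }
    return patterns;
}

// Returns true if the whole input matches at least one pattern.
bool matchesAny(const std::vector<std::regex>& patterns, const std::string& input) {
    for (const std::regex& p : patterns) {
        if (std::regex_match(input, p)) {
            return true;
        }
    }
    return false;
}
```

Loading the patterns once and reusing the vector avoids recompiling each expression for every string that is validated.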

Answer 3

Boost::Regex is fine if you just want to check the syntax. If you want to actually do something when you read such an expression, I suggest you use Boost::Spirit with something like:

rule<> testFormula = 
    (ch_p('Z') >> int_p) 
    >> (ch_p('-')>>ch_p('R')>>int_p>>ch_p('l')) 
    >> space_p >> ch_p('t');

I have grouped the parts of the expression that you might want to attach an action to (using the [] operator).

See the documentation for more information.
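If Spirit is more machinery than the task needs, a lighter way to act on the parts of the string is to capture them with a regex. The sketch below is not Spirit: it swaps in std::regex capture groups (C++11) to pull out the two numbers, and the Fields struct and parseFields function are hypothetical names introduced for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical result type: the two numbers found in the string.
struct Fields {
    int z;  // number after Z
    int r;  // number after R
};

// Fills `out` and returns true if input has the form Z<digits>-R<digits>l t.
// On failure, `out` is left unchanged.
bool parseFields(const std::string& input, Fields& out) {
    // At most 9 digits per number, so std::stoi cannot overflow a 32-bit int.
    static const std::regex pattern("Z([0-9]{1,9})-R([0-9]{1,9})l t");
    std::smatch m;
    if (!std::regex_match(input, m, pattern)) {
        return false;
    }
    out.z = std::stoi(m[1].str());  // first capture group
    out.r = std::stoi(m[2].str());  // second capture group
    return true;
}
```

For "Z344-R565l t" this should yield z = 344 and r = 565. A real grammar with nesting or recursion is where a parser library such as Spirit becomes the better fit.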

Answer 4

If you have a recent compiler (VC 2008 SP1, etc.), there's no need to use Boost: regular expressions are part of TR1, and you can use them with this header: #include <regex>

Example for a date (inside a string literal, use a double backslash, \\, as the escape character):

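string dateorder = "12/07/2009";

// In a C++ string literal, \\d yields the regex token \d (a digit).
// Day 1-31 and month 1-12 (leading zero optional), year 2000-2999.
tr1::regex expr("^([1-2][0-9]|0?[1-9]|30|31)/(0?[1-9]|10|11|12)/(2\\d{3})$");

if (!regex_match(dateorder.begin(),dateorder.end(),expr))
{
    ...
    break;  
}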
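With a C++11 compiler the same check could be written against std::regex, and a raw string literal removes the need for doubled backslashes. This is a sketch under those assumptions; the function name looksLikeDate is hypothetical, and the pattern accepts an optional leading zero for the month so that "12/07/2009" matches.

```cpp
#include <regex>
#include <string>

// Checks the dd/mm/yyyy shape only (years 2000-2999); it does not know
// that, say, February has fewer than 31 days.
bool looksLikeDate(const std::string& text) {
    // Raw string literal: \d is written once, with no doubled backslash.
    static const std::regex expr(
        R"((0?[1-9]|[12][0-9]|30|31)/(0?[1-9]|1[0-2])/(2\d{3}))");
    // regex_match already anchors at both ends, so ^ and $ are not needed.
    return std::regex_match(text, expr);
}
```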

Answer 5

You might google "runtime parser generation" or something similar...

lex and yacc (or their GNU equivalents flex and bison) do their work at compile time and may not be flexible enough for your needs (or they may be; you're not very specific).

5 answers

Answer 1: 6
Answer 2: 4 (accepted)
Answer 3: 4 (2009-02-18 06:00:54)
Answer 4: 1
Answer 5: 1
