How to parse a string in C++

I want to parse strings to check whether they follow a specified syntax.

Example:

Str = Z344-R565l t

The requirement is that Z is followed by a number, then a -, then R followed by a number, then l, then a space, and finally t.

Anything else should be an error.

I have to parse many different kinds of syntax like this, and it would be awkward to write a separate function for each one. I have heard that yacc or lex can solve this problem.

Can anyone shed some light on this?

You can do this with a regex.

Z344-R565l t

Your regex should look something like the one below. I am not sure which regex library to use for C++, but this is the general regex to check that your string matches.

Z[0-9]+-R[0-9]+l t

Use boost::regex

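#include <string>
#include <boost/regex.hpp>

// Returns true only if the whole input has the form Z<digits>-R<digits>l t.
bool isMatch(const std::string& input){
    // '+' requires at least one digit after Z and after R.
    static const boost::regex r("Z[0-9]+-R[0-9]+l t");
    // regex_match must consume the entire string; regex_search would also
    // accept inputs that merely contain a matching substring.
    return boost::regex_match(input, r);
}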

Another option is to supply a list of regular expressions in a file, one expression per line. Create a vector of boost::regex objects from the file and, for each string you need to validate, iterate through the vector of patterns. It is not very efficient, but it will work.
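The pattern-list idea could be sketched roughly as follows. This sketch swaps in std::regex (C++11) for boost::regex so that it has no external dependency, and reads from any std::istream rather than a specific file; the function names loadPatterns and matchesAny are made up for illustration.

```cpp
#include <istream>
#include <regex>
#include <string>
#include <vector>

// Read one regular expression per line; blank lines are skipped.
// Constructing a std::regex throws std::regex_error if a line is not a valid pattern.
std::vector<std::regex> loadPatterns(std::istream& in) {
    std::vector<std::regex> patterns;
    std::string line;
    while (std::getline(in, line)) {
        if (!line.empty()) {
            patterns.emplace_back(line);
        }
    }
    return patterns;
}

// True if the whole input matches at least one of the patterns.
bool matchesAny(const std::vector<std::regex>& patterns, const std::string& input) {
    for (const std::regex& re : patterns) {
        if (std::regex_match(input, re)) {
            return true;
        }
    }
    return false;
}
```

With a std::ifstream opened on the pattern file, loadPatterns is called once at startup and matchesAny once per string to validate. Compiling the patterns once, rather than on every check, avoids most of the avoidable cost.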

Boost::Regex is fine if you just want to check the syntax. If you want to actually do something when you read such an expression, I suggest you use Boost::Spirit with something like:

rule<> testFormula = 
    (ch_p('Z') >> int_p) 
    >> (ch_p('-')>>ch_p('R')>>int_p>>ch_p('l')) 
    >> space_p >> ch_p('t');

I have grouped the parts of the expression that you might want to attach an action to (using the [] operator).

See the documentation for more information.
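If pulling in Spirit is more than you need, a lighter way to "do something" with the parsed values is to use regex capture groups. The sketch below is not Spirit; it uses std::regex (C++11) instead, and the Parsed struct and parseFormula function are hypothetical names. It also assumes each number has at most 9 digits so that it fits in an int.

```cpp
#include <regex>
#include <string>

// Hypothetical result type holding the two numbers from Z<number>-R<number>l t.
struct Parsed {
    int z;
    int r;
};

// Returns true and fills `out` when the whole input has the expected form.
// Assumption: each number has 1 to 9 digits, so std::stoi cannot overflow.
bool parseFormula(const std::string& input, Parsed& out) {
    static const std::regex re("Z([0-9]{1,9})-R([0-9]{1,9})l t");
    std::smatch m;
    if (!std::regex_match(input, m, re)) {
        return false;
    }
    out.z = std::stoi(m[1].str());  // first capture group: digits after Z
    out.r = std::stoi(m[2].str());  // second capture group: digits after R
    return true;
}
```

For "Z344-R565l t" this fills out.z with 344 and out.r with 565; any deviation from the format, such as a doubled space, makes the function return false.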

If you have a recent compiler (VC 2008 SP1, etc.), there is no need to use Boost: regex support is part of TR1, and you can use it with this header: #include <regex>

Example for a date (note that each backslash must be doubled inside a C++ string literal):

string dateorder = "12/07/2009";

tr1::regex expr("^([1-2][0-9]|0?[1-9]|30|31)/(0?[1-9]|10|11|12)/(2\\d{3})$");

if (!regex_match(dateorder.begin(),dateorder.end(),expr))
{
    ...
    break;  
}
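On a C++11 or later compiler the same kind of check can be written with std::regex and a raw string literal, which removes the need to double the backslashes. This is only a sketch: the function name looksLikeDate is made up, and the pattern is a simplified day/month/year shape check, not a full calendar validation.

```cpp
#include <regex>
#include <string>

// Checks the shape D/M/YYYY: day 1-31, month 1-12, year 2000-2999,
// with an optional leading zero on day and month.
// It does not check that the day actually exists in that month.
bool looksLikeDate(const std::string& s) {
    // Raw string literal: \d is written with a single backslash.
    static const std::regex re(R"((0?[1-9]|[12][0-9]|3[01])/(0?[1-9]|1[0-2])/(2\d{3}))");
    // regex_match anchors at both ends, so ^ and $ are not needed.
    return std::regex_match(s, re);
}
```

Here looksLikeDate("12/07/2009") is true, while "32/01/2009" and "12/13/2009" are rejected.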

You might search for "runtime parser generation" or something similar.

lex and yacc (or their GNU equivalents, flex and bison) do their work at compile time and may not be flexible enough for your needs (or they may be; the question is not very specific).
