
C++ regex, parsing

I'm pretty new to regular expressions and I can't get my function to do what I want.
I have a long string, and I want to extract three values from it.

My string looks like this:

 Infoname/info :   
 Input_Device_Name GTape  Buffer_Size 16384 Acquisition_Event_Rate 163691.000000  
 Acquisition_Buffer_Rate 14873.333008 Acquisition_Succes_Rate 100.000000

My goal is to store 163691.000000, 14873.333008 and 100.000000 in three different variables.

What is the fastest and cleanest way to do this, please?

Thank you,
eo

You could use the following regex to look for it:

"Input_Device_Name\s+GTape\s+Buffer_Size\s+[0-9.]+\s+Acquisition_Event_Rate\s+([0-9.]+)\s+Acquisition_Buffer_Rate\s+([0-9.]+)\s+Acquisition_Succes_Rate\s+([0-9.]+)"

This should capture the three values, assuming that your text stays the same and that your numbers always take this form (i.e. are positive and not in exponential form). Note that only the last three numbers are captured, because only they are wrapped in parentheses.
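If the numbers could ever be signed or written in exponential form, the [0-9.]+ group would need to be widened. The following is only a sketch of one possible stricter pattern. It assumes C++11 std::regex rather than Boost, and the helper name isFloatingLiteral is made up for illustration.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: returns true if `text` is a decimal number with an
// optional sign, an optional fractional part and an optional exponent,
// for example "100.000000", "-1.5e+3" or "16384".
bool isFloatingLiteral(const std::string& text)
{
    static const std::regex number(
        R"([-+]?(?:[0-9]+\.?[0-9]*|\.[0-9]+)(?:[eE][-+]?[0-9]+)?)");
    // regex_match only succeeds if the whole of `text` fits the pattern.
    return std::regex_match(text, number);
}
```

Unlike [0-9.]+, this pattern rejects inputs such as "1.2.3", at the cost of being harder to read. The part between the quotes could be dropped into each capture group of the longer pattern if that strictness is wanted.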

If you use Boost.Regex, you could do something like this:

#include <boost/regex.hpp>
#include <cstdlib>   // strtod
#include <string>

...

boost::smatch what;
static const boost::regex pp("Input_Device_Name\\s+GTape\\s+Buffer_Size\\s+[0-9.]+\\s+Acquisition_Event_Rate\\s+([0-9.]+)\\s+Acquisition_Buffer_Rate\\s+([0-9.]+)\\s+Acquisition_Succes_Rate\\s+([0-9.]+)");
// regex_match only succeeds if the whole of inputTextString matches; use
// boost::regex_search instead if the input carries extra text around it.
if ( boost::regex_match(inputTextString, what, pp) )
{
    // what[0] is the whole match, what[1]..what[3] are the captured groups.
    if ( what.size() == 4 )
    {
         double d1 = strtod(what[1].str().c_str(), NULL);
         double d2 = strtod(what[2].str().c_str(), NULL);
         double d3 = strtod(what[3].str().c_str(), NULL);

         // These are your doubles, do some stuff with them.
    }
}

Here inputTextString contains the line of text you want to parse, so if the text comes from a file, say, you would place this code in a loop. The what variable behaves like a vector of all the matched text; what[0] contains the whole matched line, so it can be ignored unless you need it. Last but not least, remember to double the backslash in the \s whitespace character class (write \\s in the C++ string literal), otherwise the compiler will process the escape itself (or emit an error or warning) before the pattern ever reaches the regex engine. Also, please note that I've not had time to compile this, though it is based on working code.
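For readers without Boost, roughly the same idea can be sketched with the C++11 standard library. This is an illustration rather than the answer's own code: the Rates struct and the parseRates function are hypothetical names, the pattern is shortened to the three rate fields, and std::regex_search is used so that leading text such as "Infoname/info :" does not have to match.

```cpp
#include <regex>
#include <string>

// Hypothetical container for the three captured values.
struct Rates
{
    double eventRate;
    double bufferRate;
    double successRate;
};

// Returns true and fills `out` if all three rates are found in `text`.
// Note: std::stod throws std::invalid_argument if a capture holds no digits
// (for example a lone "."), which [0-9.]+ does not rule out.
bool parseRates(const std::string& text, Rates& out)
{
    // Raw string literals avoid having to double every backslash.
    static const std::regex pattern(
        R"(Acquisition_Event_Rate\s+([0-9.]+)\s+)"
        R"(Acquisition_Buffer_Rate\s+([0-9.]+)\s+)"
        R"(Acquisition_Succes_Rate\s+([0-9.]+))");

    std::smatch what;
    if (!std::regex_search(text, what, pattern))
        return false;

    // what[0] is the whole match; what[1]..what[3] are the three groups.
    out.eventRate   = std::stod(what[1].str());
    out.bufferRate  = std::stod(what[2].str());
    out.successRate = std::stod(what[3].str());
    return true;
}
```

A caller would declare a Rates object, pass it together with the input string, and read the three members only when the function returns true. Since \s also matches newlines, the sample text can be passed as a single multi-line string.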

Watch out for leading and trailing whitespace in your input file, and use ^ and $ to mark the beginning and end of the line respectively if it helps.

Just search repeatedly for [0-9\\.]+ for as long as it returns results. If, for example, you want to reject 16384 because it is a value you don't need, test each search result for a dot.
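One way to read this suggestion is as a loop over every match of the pattern, keeping only the tokens that contain a dot. The sketch below assumes C++11 std::regex and std::sregex_iterator; the function name extractDecimals is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Collects, in order of appearance, every run of digits and dots in `text`
// that contains at least one dot and at least one digit.
std::vector<double> extractDecimals(const std::string& text)
{
    static const std::regex number("[0-9.]+");
    std::vector<double> values;

    for (std::sregex_iterator it(text.begin(), text.end(), number), end;
         it != end; ++it)
    {
        const std::string token = it->str();
        const bool hasDot   = token.find('.') != std::string::npos;
        const bool hasDigit = token.find_first_of("0123456789") != std::string::npos;
        // Skip integers such as 16384, and stray dots that std::stod cannot parse.
        if (hasDot && hasDigit)
            values.push_back(std::stod(token));
    }
    return values;
}
```

This version does not depend on the field names at all, so it keeps working if the labels change, but it will also pick up any other decimal number that happens to appear in the text.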


 