
QRegExp in C++ to capture part of string

I am attempting to use Qt to execute a regex in my C++ application. I have done similar regular expressions with Qt in C++ before, but this one is proving difficult.

Given a string with an optional _# suffix at the end (an underscore followed by a number, as in the examples below), I want to extract the part of the string before that suffix.

Examples:

"blue_dog" should result in "blue_dog"
"blue_dog_1" should result in "blue_dog"
"blue_dog_23" should result in "blue_dog"

This is the code I have so far, but it does not work yet:

QString name = "blue_dog_23";
QRegExp rx("(.*?)(_\\d+)?");    
rx.indexIn(name);
QString result = rx.cap(1);  

I have even tried the following additional options in many variations without luck. My code above always results in "":

rx.setMinimal(TRUE);   
rx.setPatternSyntax(QRegExp::RegExp2);

Sometimes it's easier not to pack everything into a single regexp. In your case, you can apply the regexp only when the string actually ends with a _# suffix. Otherwise the result is simply name:

QString name = "blue_dog_23";
QRegExp rx("^(.*)(_\\d+)$");
QString result = name;
if (rx.indexIn(name) == 0)
    result = rx.cap(1);
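For readers without Qt at hand, the same idea can be sketched with the standard library. This is only an illustration under stated assumptions: it swaps QRegExp for std::regex, and the helper name stripNumericSuffix is hypothetical, not part of the answer above.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the answer): returns `name` without a
// trailing "_<digits>" suffix, or `name` unchanged if there is no such suffix.
std::string stripNumericSuffix(const std::string& name) {
    // Same shape as the QRegExp pattern above: greedy prefix, then "_<digits>"
    // anchored to the end of the string.
    static const std::regex rx(R"(^(.*)_\d+$)");
    std::smatch m;
    if (std::regex_match(name, m, rx))
        return m[1].str();   // the part before the suffix
    return name;             // no suffix: keep the input as-is
}
```

As in the Qt version, the regexp is only consulted for the suffix case; any input that does not match falls through to the original string.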

Alternatively, you can split off the last part and check whether it is a number. A compact (but maybe not the most readable) solution:

QString name = "blue_dog_23";
int i = name.lastIndexOf('_');
bool isInt = false;
QString result = (i >= 0 && (name.mid(i+1).toInt(&isInt) || isInt)) ? name.left(i) : name;
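A more spelled-out variant of the same split-and-check idea, written against std::string so it can be tried without Qt. It is a sketch rather than a drop-in equivalent: the helper name stripNumericSuffixNoRegex is hypothetical, and it tests for plain decimal digits instead of calling toInt, so it does not treat a signed suffix such as "_-5" as numeric.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not from the answer): strips a trailing "_<digits>"
// suffix without using a regular expression.
std::string stripNumericSuffixNoRegex(const std::string& name) {
    const std::size_t i = name.rfind('_');
    // No underscore, or nothing after the last underscore: nothing to strip.
    if (i == std::string::npos || i + 1 == name.size())
        return name;
    // Assumption: "number" means one or more decimal digits.
    const bool allDigits = std::all_of(
        name.begin() + static_cast<std::ptrdiff_t>(i) + 1, name.end(),
        [](unsigned char c) { return std::isdigit(c) != 0; });
    return allDigits ? name.substr(0, i) : name;
}
```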

The following solution should work as you want it to!

^[^\\s](?:(?!_\\d*\\n).)*/gm

Basically, that says: match everything up to, but not including, _\\d*\\n . Here, _\\d*\\n means match the _ character, then any number of digits ( \\d* ), until a newline marker ( \\n ) is reached. ?! is a negative lookahead, and ?: is a non-capturing group. Together, they mean that the sequence inside the (?!...) lookahead marks the non-inclusive end point of what should be captured.

The ^[^\\s] tells the expression to start matching at the beginning of a line, as long as the first character isn't whitespace.

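The /gm sets the global flag (allowing more than one match to be returned) and the multi-line flag (which lets ^ and $ match at the start and end of each line rather than only at the start and end of the whole input). Note that /gm is regex-literal flag syntax, not part of the pattern itself.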
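The lookahead idea can be tried on a single string with std::regex, whose default ECMAScript grammar supports (?!...). This is an adaptation, not the exact pattern above: it assumes the input is one name with no trailing newline, so it uses \d+ followed by $ in place of \d*\n, and it drops the [^\s] first-character check and the /gm flags. The helper name stripSuffixWithLookahead is hypothetical.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the answer): keeps everything before a
// trailing "_<digits>" suffix by consuming characters until the negative
// lookahead rejects the current position.
std::string stripSuffixWithLookahead(const std::string& name) {
    // (?!_\d+$) fails at a position where "_<digits>" runs to the end of the
    // input; at every other position '.' consumes one character.
    static const std::regex rx(R"(^(?:(?!_\d+$).)*)");
    std::smatch m;
    if (std::regex_search(name, m, rx))
        return m[0].str();   // the whole match is the wanted prefix
    return name;
}
```

Because the group is non-capturing, the result is read from the overall match (index 0) rather than from a capture group.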
