Regex replace

Question

I have a string:

std::string String = "<!\\[LOG\\[somestringhere\\]LOG\\]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponenet\" context=\"\" type=\"1\" thread=\"0\" file=\"mxyfile.cpp\"><!\\[LOG\\[somestringhere\\]LOG\\]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponenet\" context=\"\" type=\"1\" thread=\"0\" file=\"mxyfile.cpp\">";

I want to insert a \n character after the > in ><![LOG[.

My code so far:

#include <iostream>
#include <regex>
#include <string>

const std::tr1::regex pattern( "(>|\")<!\\[LOG\\[" );
std::string replace = ">\n<![LOG[";
std::string newtext = std::tr1::regex_replace( String, pattern, replace );
std::cout << newtext << std::endl;

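This works well, but unfortunately there is one small problem. Not every line ends with >. In some cases the text is "<![LOG[ instead of ><![LOG[.

If the last > is missing, the result will be "\n<![LOG[ instead of >\n<![LOG[.

So my question is: what is the simplest/best way to solve this? Should I somehow check which pattern is present and then set the replacement string accordingly?

I hope it is clear what I want.

Thanks.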

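Update:
Sorry, but as I see it I made a mistake in how the string looks, which caused some misunderstanding. Below is the string from the log file (I read the log file into a std::string and process it). It is actually two lines, but the line break is missing, and that is what I want to insert.

Case 1:
The string looks like this: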
<![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp"><![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp">

From this, I would like to get this result:
<![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp">**LineBreakHere** <![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp">

Note where the line break should be.

Case 2: The string is almost the same:
<![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp"<![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp"

Note that the > is missing after file="myfile.cpp".

In this case, I would like to get the same result as before:
<![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp">**LineBreakHere and the missing > was also inserted** <![LOG[somestring]LOG]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponent" context="" type="1" thread="0" file="myfile.cpp"> **also inserted missing >**

So basically I want to insert a line break, and if the > is missing I want to insert that too, if possible.

Answer 1

Your regex should look like

"(>|\")<!\\\\\\[LOG\\\\\\["

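In an ordinary string literal, 4 backslashes produce a regex that matches one literal \, and 2 more backslashes escape the square bracket. A better way to write the regex is with the R"(...)" notation (a "raw string literal"):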

const std::regex pattern( R"((>|\")<!\\\[LOG\\\[)" );

The code will be:

const std::regex pattern( R"((>|\")<!\\\[LOG\\\[)" );
std::string String = "<!\\[LOG\\[somestringhere\\]LOG\\]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponenet\" context=\"\" type=\"1\" thread=\"0\" file=\"mxyfile.cpp\"><!\\[LOG\\[somestringhere\\]LOG\\]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponenet\" context=\"\" type=\"1\" thread=\"0\" file=\"mxyfile.cpp\">";
std::string replace = "$1\n<![LOG[";
std::string newtext = std::regex_replace( String, pattern, replace );
std::cout << newtext << std::endl;

The resulting newtext is

<!\[LOG\[somestringhere\]LOG\]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponenet" context="" type="1" thread="0" file="mxyfile.cpp">
<![LOG[somestringhere\]LOG\]!><time="12:34:30.0+120" date="9-14-2015" component="mycomponenet" context="" type="1" thread="0" file="mxyfile.cpp">

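Note that the replacement string now contains a backreference, $1, to the first capture group (the text matched by the parenthesized subpattern (>|\")), so whichever of the two characters was matched is safely restored in the replacement. The backslashes, by contrast, are not captured, so the replacement writes <![LOG[ without them.

IDEONE demo

Regex demo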

Update:

You can use the R"((<!\[LOG\[[\s\S]*?\]!><[^<]*)(\">?))" regex:

const std::regex pattern( R"((<!\[LOG\[[\s\S]*?\]!><[^<]*)(\">?))" );
std::string String = "<![LOG[somestring]LOG]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponent\" context=\"\" type=\"1\" thread=\"0\" file=\"myfile.cpp\"<![LOG[somestring]LOG]!><time=\"12:34:30.0+120\" date=\"9-14-2015\" component=\"mycomponent\" context=\"\" type=\"1\" thread=\"0\" file=\"myfile.cpp\"";
std::string replace = "$1\">\n";
std::string newtext = std::regex_replace( String, pattern, replace );
std::cout << newtext << std::endl;

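Ideone demo

Regex explanation:

The pattern has 2 capture groups: one captures from the start of <![LOG[ to the end of the next node ((<!\[LOG\[[\s\S]*?\]!><[^<]*)), and the other captures the closing quote, with or without the right angle bracket (">|").

(<!\[LOG\[ - matches <![LOG[ literally (start of the first capture group)
[\s\S]*? - matches 0 or more of any character (even newlines), as few as possible
\]!>< - matches ]!>< literally
[^<]*) - matches 0 or more characters other than < (end of the first capture group)
(\">|\") - matches and captures "> or ". You can write it more compactly as (\">?).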

Solution 1
1 Accepted 2015-09-18 14:34:25
