regex match character but not within regex

STOPATDESK YES;
:: TXT "LCLLMT:29.4700";
:: TXT "LCLCURR;NON-USD";
:: TXT "CALLBK:3";
:: TXT "FFTRL:EUR-LIM;-TAP-5";

Could you please provide a regex that matches semicolons, but not those inside TXT "..."?

There are several useful questions on Stack Overflow, but I could not put together a working solution for my case:
Regex for matching a character, but not when it's enclosed in square bracket
Regex for matching a character, but not when it's enclosed in quotes

You need a regex that matches any semicolon that is not followed by an odd number of quotes.

;(?![^"]*(([^"]*"[^"]*"){2})*[^"]*"[^"]*$)

The tricky part is the negative lookahead (?![^"]*(([^"]*"[^"]*"){2})*[^"]*"[^"]*$) :

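  • [^"]* matches any text between the ; and the first " after it
  • (([^"]*"[^"]*"){2})* matches zero or more groups of four quotes together with the text around them (the inner ([^"]*"[^"]*") covers two quotes and {2} repeats it)
  • [^"]*"[^"]*$ matches the last quote and everything after it up to the end of the input

If all of the above match, an odd number of " follows the ; . That implies the ; sits between an opening and a closing " and is therefore not a semicolon you want to match. Note that, because of the {2}, the middle group consumes quotes four at a time, so the lookahead as written only detects 1, 5, 9, ... remaining quotes. That is enough for the sample above, where the quoted semicolons are followed by 5 and 1 quotes, but an input in which a quoted semicolon is followed by 3 quotes needs the {2} removed.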
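
As a rough illustration of the lookahead approach, here is a minimal Python sketch (Python is chosen only for the example; the linked demo is in Java). ANSWER_PATTERN is the pattern from the answer, unchanged. SAMPLE, ODD_QUOTES_PATTERN and the helper split_outside_quotes are hypothetical names introduced for this sketch. ODD_QUOTES_PATTERN is a variant, not part of the original answer, that drops the {2} so the lookahead accepts any odd number of remaining quotes.

```python
import re

# The sample from the question, joined on one line.
SAMPLE = ('STOPATDESK YES; :: TXT "LCLLMT:29.4700"; :: TXT "LCLCURR;NON-USD"; '
          ':: TXT "CALLBK:3"; :: TXT "FFTRL:EUR-LIM;-TAP-5";')

# Pattern exactly as given in the answer.
ANSWER_PATTERN = re.compile(r';(?![^"]*(([^"]*"[^"]*"){2})*[^"]*"[^"]*$)')

# Assumed variant (not from the answer): zero or more quoted pairs,
# then exactly one more quote, i.e. any odd number of quotes until the end.
ODD_QUOTES_PATTERN = re.compile(r';(?!(?:[^"]*"[^"]*")*[^"]*"[^"]*$)')


def split_outside_quotes(pattern, text):
    """Split text at every match of pattern, trimming and dropping empty pieces."""
    pieces, start = [], 0
    for match in pattern.finditer(text):
        pieces.append(text[start:match.start()].strip())
        start = match.end()
    pieces.append(text[start:].strip())
    return [piece for piece in pieces if piece]


if __name__ == "__main__":
    for piece in split_outside_quotes(ANSWER_PATTERN, SAMPLE):
        print(piece)
```

Both patterns give the same five pieces on the sample. They differ on a shorter input such as TXT "a;b"; TXT "c"; where the quoted semicolon is followed by three quotes: there only the variant without {2} leaves the quoted semicolon alone.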

Regex: https://regex101.com/r/dG6cC1/1

Java demo: https://ideone.com/OuAaA5

You can also try:

"[^"]*"|(;)

This matches either a whole quoted string or a standalone semicolon; the standalone semicolons are then available through group(1). However, unbalanced quote signs would cause a problem. Alternatively, if the whole file is formatted like your example (semicolons inside quotes are preceded and followed by another character, not whitespace), you can try:

;(?=\s|$)

It works with the example above.
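
The two alternatives above can be sketched in Python as follows. This is only an illustration under assumptions: the constant and function names are hypothetical, and SAMPLE is the question's data joined on one line. The first helper keeps only the matches where group 1 took part, which is how the standalone semicolons are separated from the quoted strings; the second relies on the formatting assumption that a separator semicolon is followed by whitespace or the end of the input.

```python
import re

# The sample from the question, joined on one line.
SAMPLE = ('STOPATDESK YES; :: TXT "LCLLMT:29.4700"; :: TXT "LCLCURR;NON-USD"; '
          ':: TXT "CALLBK:3"; :: TXT "FFTRL:EUR-LIM;-TAP-5";')

# Either a whole quoted string (consumed, not captured) or a semicolon in group 1.
QUOTED_OR_SEMICOLON = re.compile(r'"[^"]*"|(;)')

# A semicolon followed by whitespace or the end of the input.
SEMICOLON_BEFORE_SPACE = re.compile(r';(?=\s|$)')


def semicolons_via_group(text):
    """Positions of semicolons captured by group 1, i.e. outside quoted strings."""
    return [m.start(1) for m in QUOTED_OR_SEMICOLON.finditer(text)
            if m.group(1) is not None]


def semicolons_via_whitespace(text):
    """Positions of semicolons that are followed by whitespace or end of input."""
    return [m.start() for m in SEMICOLON_BEFORE_SPACE.finditer(text)]


if __name__ == "__main__":
    print(semicolons_via_group(SAMPLE))
    print(semicolons_via_whitespace(SAMPLE))
```

On the sample both helpers report the same five positions. With an unbalanced quote, for instance TXT "a;b with no closing quote, the first pattern cannot consume a quoted string, so the semicolon after the opening quote is captured as if it were a separator.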
