Java regex involving positive lookbehind and lookahead

I'm trying to figure out a regex to find <ANYTHING_BUT_WHITESPACE>? OR ?<ANYTHING_BUT_WHITESPACE> and replace the ? with a blank (empty) string.

So, '?test test?' should become 'test test'.

Below is the regex I came up with, but it doesn't seem to work. Any suggestions?

s.replace("(?<=/S)?|?(?=/S)", "");
  • (?<=/S)? look for ? with a positive lookbehind of anything but whitespace (\S)
  • | or
  • ?(?=/S) look for ? with a positive lookahead of anything but whitespace (\S)

First of all, your regex has some mistakes. You used / instead of \ . The second issue is escaping the characters.

The regex you are looking for is (?<=\S)\?|\?(?=\S) and the replacement is an empty string.

Note: For Java, use double escapes, i.e. \\S and \\? .

Regex101 Demo
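As a minimal sketch of how the corrected pattern might be used from Java: the class, constant, and method names below are made up for illustration, and the second input is an extra case added here to show that a ? surrounded by whitespace is left alone.

```java
public class QuestionMarkCleaner {
    // The regex (?<=\S)\?|\?(?=\S), with every backslash doubled
    // because it lives inside a Java string literal.
    static final String ATTACHED_QUESTION_MARK = "(?<=\\S)\\?|\\?(?=\\S)";

    // Hypothetical helper: removes every ? that touches a non-whitespace character.
    static String stripAttachedQuestionMarks(String input) {
        return input.replaceAll(ATTACHED_QUESTION_MARK, "");
    }

    public static void main(String[] args) {
        System.out.println(stripAttachedQuestionMarks("?test test?")); // test test
        System.out.println(stripAttachedQuestionMarks("a ? b"));       // a ? b
    }
}
```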

  • First, the ? as a literal needs to be escaped, since it is a special regex character ( \\? instead of ? ).

  • You should use replaceAll instead of replace to replace using a regex.

  • Also make sure to re-assign the return value of replaceAll to a string, because strings are immutable in Java.

  • The non-whitespace predefined character class is \S , not /S .
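The points above can be sketched roughly as follows. The class and method names are hypothetical; the sketch only relies on the fact that String.replace treats its first argument as literal text, while String.replaceAll treats it as a regex.

```java
public class ReplaceVsReplaceAll {
    static final String REGEX = "(?<=\\S)\\?|\\?(?=\\S)";

    // replace() searches for the pattern text literally, so nothing is found here.
    static String viaReplace(String s) {
        return s.replace(REGEX, "");
    }

    // replaceAll() compiles the first argument as a regular expression.
    static String viaReplaceAll(String s) {
        return s.replaceAll(REGEX, "");
    }

    public static void main(String[] args) {
        String s = "?test test?";

        s.replaceAll(REGEX, "");              // result discarded: s is unchanged
        System.out.println(s);                // ?test test?

        System.out.println(viaReplace(s));    // ?test test?

        s = s.replaceAll(REGEX, "");          // re-assign the returned string
        System.out.println(s);                // test test
    }
}
```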

I think you need to escape the question mark if it is used as a literal in a regex,
because it is a metacharacter.

So, it's probably this you need:

 (?:
      (?<! \s )
      \?
   |  
      \?
      (?! \s )
 )

It may not look intuitive, but using an expression with a leading
assertion greatly slows down the engine, since the assertion has to be tried at each position before the literal is even tested.

To get better results, match the literal first, then check with an assertion.

 \?
 (?:
      (?! \s )
   |  (?<! \s \? )
 )
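To check that the two forms agree, here is a small sketch in Java using the compact versions of both expressions (backslashes doubled for the string literals). The class and helper names are assumptions made for this example. Note that both use negative lookarounds, so unlike the (?<=\S) / (?=\S) version they also match a ? at the very start or end of the input.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LiteralFirstDemo {
    // Assertion-first form: (?:(?<!\s)\?|\?(?!\s))
    static final Pattern ASSERTION_FIRST =
            Pattern.compile("(?:(?<!\\s)\\?|\\?(?!\\s))");

    // Literal-first form: \?(?:(?!\s)|(?<!\s\?))
    static final Pattern LITERAL_FIRST =
            Pattern.compile("\\?(?:(?!\\s)|(?<!\\s\\?))");

    // Hypothetical helper: counts non-overlapping matches of p in input.
    static int countMatches(Pattern p, String input) {
        Matcher m = p.matcher(input);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String[] samples = {"?test test?", " ? ", "a? b ?c"};
        for (String sample : samples) {
            System.out.println("[" + sample + "] assertion-first="
                    + countMatches(ASSERTION_FIRST, sample)
                    + " literal-first=" + countMatches(LITERAL_FIRST, sample));
        }
    }
}
```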

Let's compare the relative performance using benchmark software.

Input

wrgasgsagasf?afbafbadfbadfbadfbafdb
dddd? asfbasbfasfb
?asvgasgasgasgasg

Benchmark

Regex1:   \?(?:(?!\s)|(?<!\s\?))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   3
Elapsed Time:    0.15 s,   154.76 ms,   154756 µs


Regex2:   (?:(?<!\s)\?|\?(?!\s))
Options:  < none >
Completed iterations:   50  /  50     ( x 1000 )
Matches found per iteration:   3
Elapsed Time:    0.89 s,   894.83 ms,   894834 µs
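The figures above come from a separate benchmark tool. For anyone who wants a rough comparison on the JVM, a naive timing loop might look like the sketch below. The class and method names are invented for this example, the iteration count simply mirrors the 50 x 1000 shown above, and the timings are not expected to match: java.util.regex is a different engine, and a loop like this is sensitive to JIT warm-up, so treat the output as indicative only.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class LookaroundTiming {
    // Same three lines as the benchmark input.
    static final String INPUT =
            "wrgasgsagasf?afbafbadfbadfbadfbafdb\n"
            + "dddd? asfbasbfasfb\n"
            + "?asvgasgasgasgasg";

    // Scans INPUT `iterations` times and returns the total number of matches.
    static long run(Pattern p, int iterations) {
        long matches = 0;
        for (int i = 0; i < iterations; i++) {
            Matcher m = p.matcher(INPUT);
            while (m.find()) {
                matches++;
            }
        }
        return matches;
    }

    // One untimed warm-up pass, then one timed pass.
    static long timeNanos(Pattern p, int iterations) {
        run(p, iterations);
        long start = System.nanoTime();
        run(p, iterations);
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        Pattern literalFirst = Pattern.compile("\\?(?:(?!\\s)|(?<!\\s\\?))");
        Pattern assertionFirst = Pattern.compile("(?:(?<!\\s)\\?|\\?(?!\\s))");
        int iterations = 50_000;

        System.out.printf("literal first:   %d ms%n",
                timeNanos(literalFirst, iterations) / 1_000_000);
        System.out.printf("assertion first: %d ms%n",
                timeNanos(assertionFirst, iterations) / 1_000_000);
    }
}
```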
