
Match contents surrounded by optional group in Java regex

I'm having trouble wrapping my head around how a particular Java regex should be written. The regex will be used in a sequence, and will match sections ending with / .

The problem is that a simple split won't work, because the text before the / can optionally be surrounded by ~ . If it is, then the text inside can match anything, including / and ~ . The key here is the ending ~/ , which is the only way to escape this 'anything goes' sequence if it begins with ~ .

Because the regex pattern will be used in a sequence (i.e. (xxx)+ ), I can't use ^ or $ for non-greedy matching.

Example matches:

  • foo/
  • ~foo~/
  • ~foo/~/
  • ~foo~~/
  • ~foo/bar~/

And some that wouldn't match:

  • foo~//
  • ~foo~/bar~/
  • ~foo/
  • foo~/ (see edit 2)

Is there any way to do this without being redundant with my regexes? What would be the best way to think about matching this? Java doesn't have a conditional modifier ( ? ), which complicated things in my head a bit more.

EDIT: After working on this in the meantime, the regex ((?:\\~())?)(((?!((?!\\2)/|\\~/)).)+)\\1/ gets close, but #6 doesn't match.

EDIT 2: After Steve pointed out that there is ambiguity, it became clear that #6 shouldn't match.
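With the clarification from edit 2, one possible reading of the rules can be sketched as a plain alternation rather than a conditional. This is only a sketch under assumptions that the question does not state outright: a plain section contains neither ~ nor / before its closing / , a ~ section must contain at least one character, and the first ~/ after the opening ~ always ends that section. The class and method names (SegmentMatcher, isSegment, isSequence) are made up for illustration.

```java
import java.util.regex.Pattern;

class SegmentMatcher {
    // Assumed grammar for one section:
    //   ~ section : "~", then one or more characters that never start a "~/", then "~/"
    //   plain     : one or more characters other than "~" and "/", then "/"
    static final String SEGMENT = "(?:~(?:(?!~/).)+~/|[^~/]+/)";

    static final Pattern ONE = Pattern.compile(SEGMENT);
    // The same section pattern repeated, as in the (xxx)+ usage from the question.
    static final Pattern SEQ = Pattern.compile("(?:" + SEGMENT + ")+");

    // True when the whole input is exactly one section.
    static boolean isSegment(String s) {
        return ONE.matcher(s).matches();
    }

    // True when the whole input is one or more sections back to back.
    static boolean isSequence(String s) {
        return SEQ.matcher(s).matches();
    }

    public static void main(String[] args) {
        String[] samples = {
            "foo/", "~foo~/", "~foo/~/", "~foo~~/", "~foo/bar~/",
            "foo~//", "~foo~/bar~/", "~foo/", "foo~/"
        };
        for (String s : samples) {
            System.out.println(s + " -> " + isSegment(s));
        }
    }
}
```

The negative lookahead (?!~/) in front of each consumed character is what keeps the ~ section from running past its first ~/ , so neither ^ nor $ is needed and the pattern can be repeated inside a larger expression.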

I don't think that this is a solvable problem. From your givens, these are all acceptable:

~foo/~/
~foo/
foo~/

So now, let's consider this combination:

~foo/foo~/

What happens here? We have combined the second example and the third example to create an instance of the first example. How do you suggest splitting it correctly? As far as I can tell, there's no way to tell whether we should take the entire expression as one valid expression or as two. Hence, I don't think it's possible to break it up accurately based on your listed restrictions.
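The ambiguity can be made concrete by enumerating every way a string splits into acceptable pieces. The sketch below assumes a deliberately loose rule that models the three givens above: a piece is either ~ , any text, then ~/ , or any text without / followed by / (so a plain piece may contain ~ ). The pattern LOOSE and the class AmbiguityDemo are hypothetical names used only for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class AmbiguityDemo {
    // Loose rule (an assumption): "~...~/" with any content, or slash-free text plus "/".
    static final Pattern LOOSE = Pattern.compile("~.+~/|[^/]+/");

    // Returns every way to cut s into consecutive pieces that each satisfy LOOSE.
    static List<List<String>> splits(String s) {
        List<List<String>> out = new ArrayList<>();
        collect(s, 0, new ArrayList<>(), out);
        return out;
    }

    // Backtracking search: try every end position for the next piece.
    private static void collect(String s, int start, List<String> acc,
                                List<List<String>> out) {
        if (start == s.length()) {
            out.add(new ArrayList<>(acc));
            return;
        }
        for (int end = start + 1; end <= s.length(); end++) {
            String piece = s.substring(start, end);
            if (LOOSE.matcher(piece).matches()) {
                acc.add(piece);
                collect(s, end, acc, out);
                acc.remove(acc.size() - 1);
            }
        }
    }

    public static void main(String[] args) {
        for (List<String> split : splits("~foo/foo~/")) {
            System.out.println(split);
        }
    }
}
```

Under this loose rule the input ~foo/foo~/ has two valid readings, as two pieces or as a single ~ section, which is the situation described above. Disallowing ~ in plain pieces, as edit 2 of the question does, removes the two-piece reading.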

