
Using a backreference to refer to a pattern rather than the actual match

I am trying to write a regex that matches a sequence of text blocks (not necessarily repeating), e.g.:

foo,bar,foo,bar

My initial thought was to use backreferences, something like:

(foo|bar)(,\\1)*

But it turns out that this regex only matches foo,foo or bar,bar, and not foo,bar or bar,foo (and so on).

Is there any other way to refer to a part of a pattern?

In the real world, foo and bar are regexes more than 50 characters long, and I simply want to avoid copy-pasting them to define a sequence.

With a regex flavor that supports subpattern calls, you could use (foo|bar)(?:,(?-1))* or the like. But Java does not support subpattern calls.

So you are left with a choice: build the pattern with String replace/format, as in ajx's answer, or make the comma conditional if you know when it should be present and when not. For example:

(?:(?:foo|bar)(?:,(?!$|\s)|))+
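As a rough illustration of how the conditional-comma pattern above behaves in Java, here is a minimal sketch. The class name `ConditionalCommaDemo` and the helper `matchesWhole` are made up for this example, and `foo|bar` stands in for the real sub-patterns.

```java
import java.util.regex.Pattern;

class ConditionalCommaDemo {
    // The pattern from the answer, with the backslash doubled for a Java string literal.
    // A comma is only accepted when it is not followed by end of input or whitespace.
    static final Pattern LIST = Pattern.compile("(?:(?:foo|bar)(?:,(?!$|\\s)|))+");

    // Hypothetical helper: true only if the whole input matches the pattern.
    static boolean matchesWhole(String input) {
        return LIST.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("foo,bar,foo,bar")); // true
        System.out.println(matchesWhole("foo,bar,"));        // false: trailing comma
        System.out.println(matchesWhole("foobar"));          // true: the comma is optional
    }
}
```

Note that because the comma group has an empty alternative, this pattern also accepts items that run together, such as foobar, so it may need tightening if the separator is mandatory.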

Perhaps you could build your regex piece by piece in Java, as in:

String subRegex = "foo|bar";
String fullRegex = String.format("(%1$s)(,(%1$s))*", subRegex);

The second line could be factored out into a function. The function would take a subexpression and return a full regex that matches a comma-separated list of that subexpression.
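One possible shape for such a function is sketched below. The names `ListRegex` and `commaSeparatedListOf` are invented for illustration, and the sketch uses non-capturing groups (a small departure from the two lines above) so that the helper does not shift the group numbers of the subexpression.

```java
import java.util.regex.Pattern;

class ListRegex {
    // Hypothetical helper: wraps subRegex so that it matches one item
    // followed by zero or more ",item" repetitions.
    static String commaSeparatedListOf(String subRegex) {
        return String.format("(?:%1$s)(?:,(?:%1$s))*", subRegex);
    }

    public static void main(String[] args) {
        String fullRegex = commaSeparatedListOf("foo|bar");
        System.out.println(fullRegex);
        System.out.println(Pattern.matches(fullRegex, "foo,bar,foo,bar")); // true
        System.out.println(Pattern.matches(fullRegex, "foo,baz"));         // false
    }
}
```

The long sub-pattern is then written once and passed in, which is the point of the approach: the duplication happens in the generated string rather than in the source.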

The point of a backreference is to match the actual text that the group matched, not the group's pattern, so I'm not sure you could use that.
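To make that distinction concrete, here is a small sketch using the pattern from the question. The class name `BackrefDemo` and the helper `matchesWhole` are made up for this example.

```java
import java.util.regex.Pattern;

class BackrefDemo {
    // \1 repeats the text captured by group 1, not the pattern (foo|bar) itself.
    static final Pattern WITH_BACKREF = Pattern.compile("(foo|bar)(,\\1)*");

    // Hypothetical helper: true only if the whole input matches.
    static boolean matchesWhole(String input) {
        return WITH_BACKREF.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(matchesWhole("foo,foo"));     // true
        System.out.println(matchesWhole("bar,bar,bar")); // true
        System.out.println(matchesWhole("foo,bar"));     // false: \1 must repeat "foo"
    }
}
```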

Could you use quantifiers, like this:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    String s = "foo,bar,foo,bar";
    String externalPattern = "(foo|bar)"; // comes from somewhere else
    // One item, then zero or more ",item" repetitions
    Pattern p = Pattern.compile(externalPattern + "(?:," + externalPattern + ")*");
    Matcher m = p.matcher(s);
    boolean b = m.find(); // true if such a list occurs anywhere in s

which matches one or more instances of foo or bar, separated by commas.

