Unknown number of matches in regex and font-lock

I am trying to use font-lock and Emacs Lisp regexps to highlight something like this:

class Foo implements A, B, C, D { }

The problem is the unknown length of the comma-separated list after implements. I already have a regexp that matches every word in the list (checked in re-builder, where A, B, C and D are all highlighted):

"implements\\s-+\\(?:\\(\\sw+\\)\\s-*,\\s-*\\)*\\(\\sw+\\)"

but I'm unable to combine this with font-lock.

Obviously

'("implements\\s-+\\(?:\\(\\sw+\\)\\s-*,\\s-*\\)*\\(\\sw+\\)"
  (1 font-lock-type-face) (2 font-lock-type-face))

doesn't work, because it highlights only the last occurrences (C and D): the star (*) after the first capture group is effectively ignored, since the group keeps only its final iteration.

Is there a way to capture a list of all matched words, or maybe an entirely different way to solve this problem?

If you don't want the commas to be highlighted too, your approach cannot work. When you use a subexp-highlighter of the form

(subexp facespec)

the subexp refers to a sub-group of your regexp, and that sub-group is highlighted with the given facespec. Now, a sub-group of a regexp match is a contiguous span of text with a beginning and an end. In fact, whenever you do a regexp search, you can query those positions with the functions (match-beginning subexp) and (match-end subexp).
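To make the "one span per sub-group" point concrete, here is a minimal sketch that runs the question's regexp in a scratch buffer and inspects the match data. The function name my-implements-groups is hypothetical and exists only for this illustration; the regexp and the sample line are taken from the question.

```elisp
;; Hypothetical helper, for illustration only.
(defun my-implements-groups ()
  "Return what sub-groups 1 and 2 hold after matching the sample line."
  (with-temp-buffer
    (insert "class Foo implements A, B, C, D { }")
    (goto-char (point-min))
    (when (re-search-forward
           "implements\\s-+\\(?:\\(\\sw+\\)\\s-*,\\s-*\\)*\\(\\sw+\\)"
           nil t)
      (list (match-string 1)    ; text of group 1 (one iteration only)
            (match-string 2)    ; text of group 2
            ;; Group 1 is a single span: one beginning, one end.
            (- (match-end 1) (match-beginning 1))))))

(my-implements-groups)
;; => ("C" "D" 1), matching the question's observation that only
;;    C and D get highlighted
```

Although the starred group matched three times (A, B and C), the match data keeps just one beginning and one end for it, which is why font-lock has nothing to highlight for A and B.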

But that means you cannot match a variable number of class names, excluding the commas, with a single sub-expression, because that sub-expression would have to be a contiguous span. And a contiguous span that covers a variable number of class names must always contain the commas too; there's no way around that.

Here's another reason why your approach is not such a good idea: your regexp explicitly matches whitespace. It doesn't matter whether the whitespace is excluded from highlighting; even using it in the regexp is problematic, because wherever whitespace is allowed, you could always encounter comments as well.

Consider the following line of code:

class Foo implements A, /*B, C,*/ D { }

In that case, you would want the characters in the span /*B, C,*/ to be highlighted with font-lock-comment-face, and the surrounding classes with font-lock-type-face. You can still achieve this effect if you highlight comments only after everything else has been highlighted, and allow comments to override other font-lock matches. But this leads to rather inefficient matching, because every comment would first be highlighted as if it were code, and then be highlighted again as a comment in a second pass.

A solution to both problems would probably be to split the matching of the keyword ("implements") and of the classes into two separate matching rules. As a starting point, you could use something along the lines of:

'(("\\bimplements\\b" . font-lock-keyword-face)
  ("\\b[A-Z]\\w*\\b" . font-lock-type-face))

Something like this seems to work here:

'("\\(implements\\)\\s-+\\(\\(\\sw+\\s-*,\\s-*\\)*\\sw+\\)"
   (1 font-lock-warning-face)
   (2 font-lock-keyword-face))

(and obviously you probably want different faces...)
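A third option, not used in either answer above, is font-lock's anchored highlighter form, (MATCHER PRE-FORM POST-FORM SUBEXP-HIGHLIGHTERS...), which is applied repeatedly after the main match. The following is only a sketch under some assumptions: the variable name my-implements-keywords is hypothetical, the list is assumed to sit on the same line as implements (the default search limit of an anchored matcher), and comments inside the list are not handled.

```elisp
;; Hypothetical variable name; the rule syntax is standard font-lock.
(defvar my-implements-keywords
  '(("\\_<implements\\_>"
     (0 font-lock-keyword-face)
     ;; Anchored highlighter: searched again and again from where the
     ;; previous match ended.  `\\=' pins each match to point, so the
     ;; loop stops at the first thing that is not ", NAME".
     ("\\=\\s-*,?\\s-*\\(\\sw+\\)" nil nil
      (1 font-lock-type-face))))
  "Sketch: highlight each name after `implements', but not the commas.")

;; Usage sketch: fontify a scratch buffer and look up the face of B.
(with-temp-buffer
  (insert "class Foo implements A, B, C, D { }")
  (setq-local font-lock-defaults '(my-implements-keywords t))
  (font-lock-ensure)
  (let ((case-fold-search nil))
    (goto-char (point-min))
    (search-forward "B")
    (get-text-property (1- (point)) 'face)))
```

Because each name is matched by its own iteration of the anchored search, sub-group 1 covers a single name at a time, so the commas and spaces in between keep their default face.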
