
Need information on Grok patterns that use a non-capturing group (?: )

I understand the concept of writing regular expressions using capturing and non-capturing groups.

Ex:

a(b|c) would match ab and ac, capturing b or c in group 1

a(?:b|c) would match ab and ac but capture nothing

But how is this useful when I write a new custom grok pattern, and what does it mean to use non-capturing groups there?

Consider an existing grok pattern such as the one below for HOUR:

HOUR (?:2[0123]|[01]?[0-9])

Here we can match the hour format using (2[0123]|[01]?[0-9]) as well. What makes the grok pattern use the non-capturing expression here? Based on what criteria should I decide to use (?:subex)?

In Grok, the difference between a pattern with a capturing group and one without is whether a field gets created.

The (?:2[0123]|[01]?[0-9]) pattern contains a non-capturing group that is only used for grouping subpattern sequences. The (2[0123]|[01]?[0-9]) regex contains a numbered capturing group that matches and captures the value (that is, stores it in an additional buffer whose ID equals the position of the capture group in the pattern). Mind that there are also named capture groups, like (?<field>2[0123]|[01]?[0-9]), which assign the captured value to a named group.
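The three group types can be compared side by side in a short sketch. It uses Python's re module rather than Grok itself, so it only illustrates the regex behavior; note that Python spells named groups (?P<name>...) where Grok patterns use (?<name>...). The variable names are chosen for this illustration.

```python
import re

# The alternation from the HOUR pattern, without any surrounding group.
HOUR = r"2[0123]|[01]?[0-9]"

# Same match in all three cases; only what gets stored differs.
non_capturing = re.fullmatch(rf"(?:{HOUR})", "23")
numbered = re.fullmatch(rf"({HOUR})", "23")
named = re.fullmatch(rf"(?P<hour>{HOUR})", "23")  # Grok syntax: (?<hour>...)

print(non_capturing.groups())  # ()
print(numbered.groups())       # ('23',)
print(named.groupdict())       # {'hour': '23'}
```

All three expressions match the same text; the non-capturing version simply stores nothing.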

With the named_captures_only parameter set to false, the a(b|c) regex will match ab or ac and assign b or c to a separate field. When you use a non-capturing group, as in a(?:b|c), no field is ever created; the text is only matched.

Since the default value of named_captures_only is true, a numbered capturing group and a non-capturing group behave the same way in Grok patterns by default: neither creates a field. So, by default, only named captures (like a(?<myfield>b|c)) can be used to create fields.
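The effect of the setting can be imitated in plain Python. This is only a sketch of the behavior described above, not Logstash code: extract_fields is a hypothetical helper, and the group_N field names are invented for the illustration (they are not what Logstash would call such fields).

```python
import re

def extract_fields(pattern, text, named_captures_only=True):
    """Hypothetical helper imitating the described behavior; not Logstash code."""
    match = re.search(pattern, text)
    if match is None:
        return None
    # Named groups always become fields.
    fields = dict(match.groupdict())
    if not named_captures_only:
        # Numbered (unnamed) groups become fields too; names are made up here.
        named_indexes = set(match.re.groupindex.values())
        for index in range(1, match.re.groups + 1):
            if index not in named_indexes:
                fields[f"group_{index}"] = match.group(index)
    return fields

print(extract_fields(r"a(b|c)", "ab"))                               # {}
print(extract_fields(r"a(b|c)", "ab", named_captures_only=False))    # {'group_1': 'b'}
print(extract_fields(r"a(?:b|c)", "ab", named_captures_only=False))  # {}
print(extract_fields(r"a(?P<myfield>b|c)", "ac"))                    # {'myfield': 'c'}
```

Only the numbered group changes its behavior when the flag is flipped; the non-capturing and named versions give the same result either way.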

I think the common Grok patterns prefer non-capturing groups so that their behavior does not depend on the named_captures_only setting.
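One way to see why this matters for reusable patterns is to embed the HOUR definition inside a larger expression. The sketch below again uses Python's re module as a stand-in and assumes that wrapping a library pattern in a named group is roughly analogous to referencing it from a custom pattern; HOUR_CAPTURING is a hypothetical variant written only for comparison.

```python
import re

HOUR = r"(?:2[0123]|[01]?[0-9])"           # as defined in the Grok pattern library
HOUR_CAPTURING = r"(2[0123]|[01]?[0-9])"   # hypothetical capturing variant

# A custom pattern that reuses the hour definition under a field name.
with_non_capturing = re.compile(rf"(?P<hour>{HOUR}):(?P<minute>[0-5][0-9])")
with_capturing = re.compile(rf"(?P<hour>{HOUR_CAPTURING}):(?P<minute>[0-5][0-9])")

print(with_non_capturing.match("23:59").groups())  # ('23', '59')
print(with_capturing.match("23:59").groups())      # ('23', '23', '59')
```

The capturing variant adds an extra unnamed group to every pattern that reuses it, and whether that group turns into a field would depend on the setting. The non-capturing definition adds nothing, so only the names chosen by the pattern author produce output.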
