Elasticsearch: Aggregating documents based on patterns

Posted 2017-06-23 08:30:11 | Tags: elasticsearch, aggregation

I have four documents in my index. They are as follows.

Client(::ffff:10.0.0.6:27787) Connected

Client(::ffff:10.0.0.6:27805) Connected

Client(::ffff:10.0.0.6:27823) Connected

Client(::ffff:10.0.0.6:27875) Connected

=================

I would like to aggregate these as follows.

Client(::ffff:10.0.0.6:_____) Connected

The index contains many documents with different patterns, not only the one above, and I would like to get all of the patterns by aggregating all documents in the index.

How can I do this with ES? If ES does not directly support such aggregations, is there another way to do it?

Regards, Kangmo

1 Answer

You need to split the data you index into specific fields in order to aggregate on it reliably.

So you need a process that extracts ::ffff:10.0.0.6 from the string Client(::ffff:10.0.0.6:27787) Connected and puts it into its own field.

One possible way to do this is to use the ingest node. The ingest node has a so-called grok processor that extracts structured data from text.
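As a rough illustration of that splitting step, here is a minimal Python sketch. It assumes the log line is stored in a field called message (the question does not name the field), and the target field names client_ip, client_port and status are made up for this example. The pipeline dict shows the general shape of a grok processor definition that could be registered through the ingest pipeline API; it has not been run against a cluster here. The regular expression below it is only a local stand-in so the extraction can be tried without Elasticsearch, and it is not the grok engine itself.

```python
import json
import re

# Hypothetical ingest pipeline body. The field names ("message", "client_ip",
# "client_port", "status") are assumptions for illustration only.
# It could be registered with: PUT _ingest/pipeline/<some-name>
pipeline = {
    "description": "Split 'Client(<ip>:<port>) <status>' into separate fields",
    "processors": [
        {
            "grok": {
                "field": "message",
                "patterns": [
                    r"Client\(%{DATA:client_ip}:%{POSINT:client_port}\) %{WORD:status}"
                ],
            }
        }
    ],
}

# Local approximation of the same extraction using Python's re module.
LINE_RE = re.compile(
    r"Client\((?P<client_ip>.*?):(?P<client_port>[1-9][0-9]*)\) (?P<status>\w+)"
)


def extract(message):
    """Return the extracted fields as a dict, or None if the line does not match."""
    match = LINE_RE.search(message)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(json.dumps(pipeline, indent=2))
    print(extract("Client(::ffff:10.0.0.6:27787) Connected"))
```

Documents that do not fit this one pattern would need their own grok patterns; the processor accepts a list, so further patterns could be appended to it.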

With this you could extract the IP address and index it into its own field (you could also use the ip data type, so you can do CIDR-style range queries if needed), and then you can easily aggregate on that data.
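To make the aggregation step concrete, here is a small Python sketch under the same kind of assumptions: client_ip is a hypothetical field name, assumed to be mapped as ip or keyword. The search_body dict shows what a terms aggregation request on that field might look like; the count_by_ip function is only a local stand-in that mimics the bucket counts on the four sample lines from the question, without needing a cluster.

```python
import re
from collections import Counter

# Hypothetical search body: bucket documents by the extracted field.
# "client_ip" is an assumed field name; the body could be sent to <index>/_search.
search_body = {
    "size": 0,  # return only aggregation results, no hits
    "aggs": {"clients": {"terms": {"field": "client_ip"}}},
}

# Local regex that pulls the address out of 'Client(<ip>:<port>) <status>'.
LINE_RE = re.compile(r"Client\((?P<ip>.*?):(?P<port>[0-9]+)\) (?P<status>\w+)")


def count_by_ip(messages):
    """Local stand-in for the terms aggregation: count matching lines per IP."""
    counts = Counter()
    for message in messages:
        match = LINE_RE.search(message)
        if match:
            counts[match.group("ip")] += 1
    return counts


if __name__ == "__main__":
    docs = [
        "Client(::ffff:10.0.0.6:27787) Connected",
        "Client(::ffff:10.0.0.6:27805) Connected",
        "Client(::ffff:10.0.0.6:27823) Connected",
        "Client(::ffff:10.0.0.6:27875) Connected",
    ]
    print(count_by_ip(docs))
```

All four sample documents fall into a single bucket for ::ffff:10.0.0.6, which corresponds to the grouped pattern Client(::ffff:10.0.0.6:_____) Connected from the question.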

Hope this helps!

--Alex