
Filter jdbc data in Logstash

In my DB, I have data in the below format:

[screenshot of the source table data]

But in Elasticsearch I want to push the data grouped by item type, so each record in Elasticsearch lists all item names and their values for one item type.

Like this:

{
  "_index": "daily_needs",
  "_type": "id",
  "_id": "10",
  "_source": {
    "item_type": "10",
    "fruits": "20",
    "veggies": "32",
    "butter": "11"
  }
}

{
  "_index": "daily_needs",
  "_type": "id",
  "_id": "11",
  "_source": {
    "item_type": "11",
    "hair gel": "50",
    "shampoo": "35"
  }
}

{
  "_index": "daily_needs",
  "_type": "id",
  "_id": "12",
  "_source": {
    "item_type": "12",
    "tape": "9",
    "10mm screw": "7",
    "blinker fluid": "78"
  }
}

Can I achieve this in Logstash?

I'm new to Logstash, but as per my understanding this can be done in a filter. However, I'm not sure which filter to use, or whether I have to create a custom filter for this.

Current conf example:

input {
  jdbc {
    jdbc_driver_library => "ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "myjdbc-configs"
    jdbc_user => "dbuser"
    jdbc_password => "dbpasswd"
    schedule => "* * * * *"
    statement => "SELECT * from item_table"
  }
}
filter {
    ## WHAT TO WRITE HERE??
}
output {
    elasticsearch {
        hosts => [ "http://myeshost/" ]
        index => "myindex"
    }
}

Kindly suggest. Thank you.

You can achieve this using the aggregate filter plugin. I have not tested the below, but it should give you an idea.

filter {
  aggregate {
    # group rows that share the same item_type into one event
    # (the jdbc input lowercases column names by default, hence lowercase field names)
    task_id => "%{item_type}"
    code => "
      map['item_type'] = event.get('item_type')
      # use the item name as the field name and the item value as its value
      map[event.get('item_name')] = event.get('item_value')
    "
    # when a new item_type shows up, push the previous map as its own event
    push_previous_map_as_event => true
    timeout => 3600
    timeout_tags => ['aggregated']
  }
  # drop the original per-row events and keep only the aggregated ones
  if "aggregated" not in [tags] {
    drop {}
  }
}

Important caveats for using the aggregate filter:

  • The SQL query MUST order the results by item_type, so that all rows for the same item type arrive together and events are not out of order.
  • Column names in the SQL query should match the field names used in the filter's map[].
  • You should use ONLY ONE worker thread for aggregations, otherwise events may be processed out of sequence and unexpected results will occur (see the sketch below).
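Putting the caveats together, below is a minimal sketch of how the input and output sections could look. It assumes the table columns are item_type, item_name and item_value (matching the filter above), that the target index is daily_needs as in your expected documents, and that you want one Elasticsearch document per item type; adjust the names to your actual schema.

input {
  jdbc {
    jdbc_driver_library => "ojdbc6.jar"
    jdbc_driver_class => "Java::oracle.jdbc.driver.OracleDriver"
    jdbc_connection_string => "myjdbc-configs"
    jdbc_user => "dbuser"
    jdbc_password => "dbpasswd"
    schedule => "* * * * *"
    # ORDER BY the task_id column so all rows of one item type arrive together
    statement => "SELECT item_type, item_name, item_value FROM item_table ORDER BY item_type"
  }
}
output {
  elasticsearch {
    hosts => [ "http://myeshost/" ]
    index => "daily_needs"
    # assumed: use the item type as the document id so repeated schedule runs
    # update the same document instead of creating duplicates
    document_id => "%{item_type}"
  }
}

To keep events in order, run the pipeline with a single worker, e.g. bin/logstash -f your_pipeline.conf -w 1, or set pipeline.workers: 1 for this pipeline in pipelines.yml.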
