
Import data from MySQL with a JSON column to Elasticsearch

I have a MySQL table where one of the columns contains JSON, and I have to implement search on this column across multiple keys. I tried using Logstash to create an index from MySQL.

Here is my Logstash configuration. info is the column of type text that holds the JSON key/value pairs as a string.

input {
  jdbc {
    jdbc_connection_string => "jdbc:mysql://localhost:3306/dbname"
    # The user we wish to execute our statement as
    jdbc_user => "user"
    jdbc_password => "password"
    # The path to our downloaded jdbc driver
    jdbc_driver_library => "/usr/share/java/mysql-connector-java-5.1.38.jar"
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    # our query
    statement => "SELECT info FROM organization"
  }
}
output {
  stdout { codec => json_lines }
  elasticsearch {
    hosts => "localhost:9200"
    index => "new_index"
    document_type => "doc"
  }
}

I tried creating a mapping for the index and setting one of the fields as nested in the mapping, but nothing was uploaded to the index. A raw import from MySQL into the index treats my JSON as text, which makes it harder to search. Does anyone have a better solution for loading a JSON column into an index so that I can search by its keys?
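For reference, the mapping I tried was along these lines (a sketch only; the address and name sub-fields are placeholders for whatever keys the info JSON actually contains):

PUT new_index
{
  "mappings": {
    "doc": {
      "properties": {
        "info": {
          "type": "nested",
          "properties": {
            "address": { "type": "text" },
            "name": { "type": "text" }
          }
        }
      }
    }
  }
}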

The resulting index looks like this:

{
  "check_index" : {
    "aliases" : { },
    "mappings" : {
      "doc" : {
        "properties" : {
          "@timestamp" : {
            "type" : "date"
          },
          "@version" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          },
          "info" : {
            "type" : "text",
            "fields" : {
              "keyword" : {
                "type" : "keyword",
                "ignore_above" : 256
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1528870439037",
        "number_of_shards" : "5",
        "number_of_replicas" : "1",
        "uuid" : "MkNrBMD8S8GYfDtxRyOFfg",
        "version" : {
          "created" : "6020499"
        },
        "provided_name" : "check_index"
      }
    }
  }
}

Here, info is my JSON string. It contains many key/value pairs, e.g. address, names, etc., so instead of creating a separate column for each such field I put them all into a single JSON value in that column. But I can't search on that JSON.
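For example, a single row's info column might hold a string like the following (the field names and values here are made up purely for illustration):

{"name": "Acme Corp", "address": {"city": "Bangalore", "zip": "560001"}}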

I think what you are looking for is the JSON filter. Just reference the column that holds the JSON inside that filter. If the JSON column is info, your filter will look something like this:

filter {
  json {
    source => "info"
  }
}
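By default the parsed keys are added at the root of the event. If you would rather keep them grouped under info (for instance to match a nested mapping), the filter also accepts a target option; a sketch:

filter {
  json {
    source => "info"
    target => "info"
  }
}

Note that the original info string stays on the event as well; it can be dropped with a remove_field setting if it is not needed.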

If you have multiple columns with a JSON datatype, you can repeat the json block inside filter, one per column. So for the single JSON column info, your final Logstash config will look something like this:

input {
  jdbc {
      jdbc_connection_string => "jdbc:mysql://localhost:3306/dbname"
      # The user we wish to execute our statement as
      jdbc_user => "user"
      jdbc_password => "password"
      # The path to our downloaded jdbc driver
      jdbc_driver_library => "/usr/share/java/mysql-connector-java-5.1.38.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"
      # our query
      statement => "SELECT info FROM organization"
  }
} 
filter {
  json {
    source => "info"
  }
}
output {
  elasticsearch {
    hosts => "localhost:9200"
    index => "new_index"
    document_type => "doc"
  }
}
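Once Logstash runs with this config (for example bin/logstash -f mysql-to-es.conf, where the file name is just an example), the keys parsed out of info are indexed as ordinary fields, so you can query them directly. Assuming a document contained an address object like the hypothetical one shown in the question, a search could look like:

GET new_index/_search
{
  "query": {
    "match": { "address.city": "Bangalore" }
  }
}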
