Logstash Reading Same Data (Duplicates)

I'm using the Logstash JDBC input plugin to read from a database and send the data to Elasticsearch. My logstash.conf file looks like this:

input {
  jdbc {
    jdbc_driver_library => "${LOGSTASH_JDBC_DRIVER_JAR_LOCATION}"
    jdbc_driver_class => "${LOGSTASH_JDBC_DRIVER}"
    jdbc_connection_string => "${LOGSTASH_JDBC_URL}"
    jdbc_user => "${LOGSTASH_JDBC_USERNAME}"
    jdbc_password => "${LOGSTASH_JDBC_PASSWORD}"
    schedule => "* * * * *"
    statement => "select * from testtable"
    use_column_value => true
    tracking_column => "time"
  }
}

filter {
  mutate {
    add_field => { "message" => "%{time}" }
    convert => [ "time", "string" ]
  }
  date {
      timezone => "Etc/GMT+3"
      match => ["time" , "ISO8601", "yyyy-MM-dd HH:mm:ss.SSS"]
      target => "@timestamp"
      remove_field => [ "time", "timestamp" ]
  }
  fingerprint {
    source => ["testid", "programid", "unitid"]
    target => "[@metadata][fingerprint]"
    method => "MD5"
    key => "${LOGSTASH_JDBC_PASSWORD}"
  }
  ruby {
    code => "event.set('[@metadata][tsprefix]', event.get('@timestamp').to_i.to_s(16))"
  }
}

output {
    elasticsearch {
        hosts => ["${LOGSTASH_ELASTICSEARCH_HOST}"]
        user => "${ELASTIC_USER}"
        password => "${ELASTIC_PASSWORD}"
        index => "test"
        document_id => "%{[@metadata][tsprefix]}%{[@metadata][fingerprint]}"
    }
    stdout { codec => json_lines }
}

I tried using this config without these lines:

    use_column_value => true
    tracking_column => "time"

I also tried adding:

    clean_run => true

But Logstash keeps reading the same data over and over again. Can you help me understand why? Versions: Logstash 8.3.1, PostgreSQL 14.5, JDBC driver 42.4.1.

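The statement in your jdbc input, "select * from testtable", has no WHERE clause, so it reads the entire table on every scheduled run. Setting use_column_value and tracking_column only controls which value is saved as :sql_last_value; it does not filter anything unless the statement references that parameter. To avoid reading the same data repeatedly, configure the input like this: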

jdbc {
    jdbc_driver_library => "${LOGSTASH_JDBC_DRIVER_JAR_LOCATION}"
    jdbc_driver_class => "${LOGSTASH_JDBC_DRIVER}"
    jdbc_connection_string => "${LOGSTASH_JDBC_URL}"
    jdbc_user => "${LOGSTASH_JDBC_USERNAME}"
    jdbc_password => "${LOGSTASH_JDBC_PASSWORD}"
    schedule => "* * * * *"
    statement => "select * from testtable where time > :sql_last_value"
    use_column_value => true
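    tracking_column => "time"
    # tracking_column_type defaults to "numeric"; a timestamp column needs "timestamp"
    tracking_column_type => "timestamp"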
    record_last_run => true
    last_run_metadata_path => <valid file path>
}
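
To make the mechanism concrete, here is a minimal Python sketch of what the incremental query does. It is not Logstash code: the poll function, the in-memory SQLite table, and the sample rows are all made up for illustration, and the last_value variable stands in for the value Logstash persists at last_run_metadata_path. It also adds an ORDER BY so that the last row of each run carries the highest tracking value, which is an assumption about how you would want the saved value to advance.

```python
import sqlite3


def poll(conn, last_value):
    """One scheduled run: fetch only rows newer than last_value, then advance it."""
    rows = conn.execute(
        "SELECT testid, time FROM testtable WHERE time > ? ORDER BY time",
        (last_value,),
    ).fetchall()
    if rows:
        # Remember the tracking column of the last (newest) row for the next run.
        last_value = rows[-1][1]
    return rows, last_value


# Hypothetical sample data; timestamps share one format so string comparison is chronological.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE testtable (testid INTEGER, time TEXT)")
conn.executemany(
    "INSERT INTO testtable VALUES (?, ?)",
    [(1, "2022-09-01 10:00:00.000"), (2, "2022-09-01 10:01:00.000")],
)

last_value = "1970-01-01 00:00:00.000"  # starting point before any run has been recorded

first_run, last_value = poll(conn, last_value)   # picks up both existing rows
second_run, last_value = poll(conn, last_value)  # nothing new, so nothing is read

conn.execute("INSERT INTO testtable VALUES (?, ?)", (3, "2022-09-01 10:02:00.000"))
third_run, last_value = poll(conn, last_value)   # only the newly inserted row

print(len(first_run), len(second_run), len(third_run))
```

Without the WHERE clause every call would return the whole table, which is the behaviour described in the question. Note also that clean_run => true resets the saved value on startup, so it is expected to cause a full re-read rather than prevent one.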
