简体   繁体   中英

MYSQL to Elasticsearch via Logstash Problem: incompatible encodings: CP850 and UTF-8

I am using an elk stack via docker-compose with ES version 8.4.0

My goal is to use Logstash to copy an entire table from my MYSQL DB to ES. The connection works and Logstash copies about 30 entries with no problems. But then I get a long error message.

[2022-09-10T18:41:26,318][ERROR][logstash.outputs.elasticsearch][main][757e3825fce0788f949869472d03e028630de9d063200717b56bc9ceefe29d81] An unknown error occurred sending a bulk request to Elasticsearch (will retry indefinitely) {:message=>"incompatible encodings: CP850 and UTF-8", :exception=>Encoding::CompatibilityError, :backtrace=>["org/jruby/ext/stringio/StringIO.java:1162:in write'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/outputs/elasticsearch/http_client.rb:142:in block in bulk'", "org/jruby/RubyArray.java:1865:in each'", "org/jruby/RubyEnumerable.java:1143:in each_with_index'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/outputs/elasticsearch/http_client.rb:125:in bulk'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:296:in bulk'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:296:in safe_bulk'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:228:in submit'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/plugin_mixins/elasticsearch/common.rb:177:in retrying_submit'", "D:/logstash/vendor/bundle/jruby/2.6.0/gems/logstash-output-elasticsearch-11.6.0-java/lib/logstash/outputs/elasticsearch.rb:342:in multi_receive'", "org/logstash/config/ir/compiler/AbstractOutputDelegatorExt.java:121:in multi_receive'", "D:/logstash/logstash-core/lib/logstash/java_pipeline.rb:300:in `block in start_workers'"]}

I suspect this error is the reason: {:message=>"incompatible encodings: CP850 and UTF-8", :exception=>Encoding::CompatibilityError

My config file look like that:

  jdbc {
    clean_run => true
    jdbc_driver_library => "D:\logstash\mysql-connector-java-8.0.30.jar" 
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/posts" 
    jdbc_user => "sqluser"
    jdbc_password => "sqlpassword"
    schedule => "* * * * *" 
    statement => "SELECT id, id_post, url, id_subforum, author, text, spread, date, added 
    FROM telegram.channel_results where id >:sql_last_value;"
    use_column_value => true
    tracking_column => "id"
    
  }
}

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "posts"
    user => "username"
    password => "password"
  }

  stdout {
    codec => rubydebug
  }
}

I notice that if I removed the text column from the query, the process runs without any problems. In my database, the text column is of SQL type text. I suspect an encoding problem because there are also Russian texts and emotes included. I need a solution to also copy the texts in ES. Maybe it is a encoding problem with emotes and other characters in the text?!

Try this below charset encoding input filter.

 jdbc {
    clean_run => true
    jdbc_driver_library => "D:\logstash\mysql-connector-java-8.0.30.jar" 
    jdbc_driver_class => "com.mysql.jdbc.Driver"
    jdbc_connection_string => "jdbc:mysql://localhost:3306/posts" 
    jdbc_user => "sqluser"
    jdbc_password => "sqlpassword"
    schedule => "* * * * *" 
    statement => "SELECT id, id_post, url, id_subforum, author, text, spread, date, added 
    FROM telegram.channel_results where id >:sql_last_value;"
    use_column_value => true
    tracking_column => "id"
    columns_charset => {
            "text" => "ISO-8859-5"
    }
  }

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM