
JSON encoding issue attempting to load bulk data into ElasticSearch, only from Windows

I'm uploading a file to the ElasticSearch /_bulk API to insert/update records. From my local machine (OSX) I've had no problems, and I can still send the "problem" data from there without issue.

From our QA server, which is running Windows Server 2012, ES returns an error for a row that contains a name with a diacritic (accent).

The data is similar to this (changed the name but left the accent): María

The error returned is:

MapperParsingException[failed to parse [name.display]]; nested: JsonParseException[Invalid UTF-8 middle byte 0x61 at [Source: [B@466e94e8; line: 1, column: 194]];

Based on some other Stack Overflow answers, I currently think it's some sort of encoding issue.
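As a rough illustration of why the message points toward encoding, here is a small Python sketch (not part of the ColdFusion code, and only one reading that is consistent with the error). It assumes the accented character reached the parser as a single Windows-1252 byte; "María" is the stand-in name from above.

```python
# Illustration only: "María" is the stand-in name used in the question.
name = "María"

utf8_bytes = name.encode("utf-8")      # í becomes two bytes: 0xC3 0xAD
cp1252_bytes = name.encode("cp1252")   # í becomes one byte: 0xED

print(utf8_bytes.hex(" "))    # 4d 61 72 c3 ad 61
print(cp1252_bytes.hex(" "))  # 4d 61 72 ed 61

# 0xED has the bit pattern 1110xxxx, which a UTF-8 decoder reads as the start
# of a three-byte sequence, so it expects a continuation byte (10xxxxxx) next.
# The next byte is "a" (0x61), which is not a continuation byte, so decoding fails.
decoded_ok = True
bad_next = None
try:
    cp1252_bytes.decode("utf-8")
except UnicodeDecodeError as err:
    decoded_ok = False
    bad_next = cp1252_bytes[err.start + 1]  # the byte right after the bad lead byte

print(decoded_ok, hex(bad_next))
```

The byte that follows the bad lead byte is 0x61, the same value the JsonParseException complains about as an invalid "middle byte".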

I'm uploading the file using Adobe ColdFusion 11, with the following code:

cfhttp( method=arguments.method, url=arguments.uri, result="result" ) {
    cfhttpparam( type="body", value="#fileReadBinary( file )#" );
}

Since I suspect an encoding issue, I also added a header to try to force the encoding to UTF-8, like so:

cfhttp( method=arguments.method, url=arguments.uri, result="result" ) {
    cfhttpparam( type="header", name="Content-Type", value="application/javascript; charset=UTF-8" );
    cfhttpparam( type="body", value="#fileReadBinary( file )#" );
}

No matter what I try, I continue to get the same error message. I'm not sure where to go from here.

After enough noodling around I remembered that the function charsetEncode(), which converts binary data to a string using a specified encoding, might be of some use.

I tested this on both Windows and OSX to make sure that the Windows fix didn't break functionality on OSX, and so far it works perfectly in both locations:

cfhttp( method=arguments.method, url=arguments.uri, result="result" ) {
    cfhttpparam( type="header", name="Content-Type", value="application/javascript; charset=UTF-8" );
    cfhttpparam( type="body", value="#charsetEncode(fileReadBinary( file ), 'utf-8')#" );
}
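If it helps anyone checking a bulk file before sending it, the following is a minimal Python sketch that reports the first byte that is not valid UTF-8. The function name `first_invalid_utf8_line` and the sample payloads are made up for illustration, and the column it reports is a byte offset within the line, which will not necessarily match the column in the ElasticSearch error.

```python
def first_invalid_utf8_line(data: bytes):
    """Return (line_number, column, byte) for the first invalid UTF-8 byte, or None.

    line_number and column are 1-based; column is a byte offset within the line.
    """
    # Splitting on b"\n" is safe because 0x0A never occurs inside a
    # multi-byte UTF-8 sequence.
    for line_number, line in enumerate(data.split(b"\n"), start=1):
        try:
            line.decode("utf-8")
        except UnicodeDecodeError as err:
            return line_number, err.start + 1, line[err.start]
    return None


# Hypothetical two-line bulk payload, encoded two different ways.
payload = '{"index":{}}\n{"name":{"display":"María"}}\n'
good = payload.encode("utf-8")
bad = payload.encode("cp1252")

print(first_invalid_utf8_line(good))  # None
print(first_invalid_utf8_line(bad))   # (2, 24, 237)
```

Here 237 is 0xED, the single-byte Windows-1252 encoding of "í".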
