Node.js: wrong encoding when fetching an external site's content
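I use the get method of the request module to fetch the content of an external site. This works when the external site is encoded as UTF-8, but content in other encodings (such as Shift-JIS) comes out garbled.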
function getExternalUrl(request, response, url) {
    mod_request.get(url, function (err, res, body) {
    //mod_request.get({uri: url, encoding: 'binary'}, function (err, res, body) {
        if (err) {
            console.log("\terr=" + err);
        } else {
            var result = res.body;
            // Process res.body
            response.write(result);
        }
        response.end();
    });
}
How can I get the external site's content with the correct encoding?
I found a way:
Fetch the content using binary encoding:
var mod_request = require('request');
mod_request.get({uri: url, encoding: 'binary', headers: headers}, function (err, res, body) {});
Create a Buffer from the binary-encoded body:
// Buffer.from replaces the deprecated new Buffer(string, encoding) constructor
var contentBuffer = Buffer.from(res.body, 'binary');
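To see why the binary step matters, here is a minimal sketch that uses only Node's built-in Buffer and makes no network call. The bytes 0x82 0xA0 are the Shift-JIS encoding of the character あ; the variable names are illustrative and do not come from the code above.

```javascript
// Shift-JIS bytes for the single character "あ" (0x82 0xA0).
const sjisBytes = Buffer.from([0x82, 0xa0]);

// Decoding as UTF-8 (the effective default when no encoding is given)
// is lossy: both bytes are invalid UTF-8 and each becomes U+FFFD.
const asUtf8 = sjisBytes.toString('utf8');

// Decoding as 'binary' (an alias for latin1) maps each byte to exactly
// one character, so the original bytes can be recovered afterwards.
const asBinary = sjisBytes.toString('binary');
const roundTripped = Buffer.from(asBinary, 'binary');

console.log(asUtf8 === '\ufffd\ufffd');       // true: original bytes are lost
console.log(roundTripped.equals(sjisBytes));  // true: original bytes preserved
```

Once the body has been decoded as UTF-8, the original bytes cannot be restored, which is why the request has to ask for binary encoding up front.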
Detect the page's real encoding with the detect-character-encoding npm package:
var mod_detect_character_encoding = require('detect-character-encoding');
var charsetMatch = mod_detect_character_encoding(contentBuffer);
Convert the page to UTF-8 with the iconv npm package:
var mod_iconv = require('iconv').Iconv;
var iconv = new mod_iconv(charsetMatch.encoding,'utf-8');
var result = iconv.convert(contentBuffer).toString();
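For a quick check of the conversion step without installing iconv, the same idea can be sketched with Node's built-in TextDecoder. This is a substitute technique, not the iconv call shown above. It assumes a Node build with full ICU (the default in official releases), and the hard-coded 'shift_jis' label stands in for charsetMatch.encoding. The helper name toUtf8String is hypothetical, and not every name the detector reports is guaranteed to be a label TextDecoder accepts.

```javascript
const { TextDecoder } = require('util');

// Hypothetical helper: decode raw bytes in a known source encoding into
// a JavaScript string, which Node writes out as UTF-8 by default.
function toUtf8String(contentBuffer, sourceEncoding) {
  // fatal: true makes the decoder throw on malformed input instead of
  // silently inserting replacement characters.
  const decoder = new TextDecoder(sourceEncoding, { fatal: true });
  return decoder.decode(contentBuffer);
}

// Shift-JIS bytes for "あい" (0x82 0xA0, 0x82 0xA2).
const sample = Buffer.from([0x82, 0xa0, 0x82, 0xa2]);
const text = toUtf8String(sample, 'shift_jis');
console.log(text); // あい
```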
P.S. This approach only applies to text files (HTML, CSS, JS). Do not apply it to images or other non-text files.
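One way to honor that restriction is to check the response's Content-Type header before transcoding. The sketch below is only an assumption about how such a guard might look: isTextContentType is a hypothetical helper, and the list of text-like MIME types is illustrative rather than exhaustive.

```javascript
// Hypothetical guard: decide whether a response body should be transcoded.
function isTextContentType(contentType) {
  if (typeof contentType !== 'string') return false;
  // Drop parameters such as "; charset=Shift_JIS" and normalize case.
  const mime = contentType.split(';')[0].trim().toLowerCase();
  // Illustrative list of non-"text/" types that still carry text.
  const textLike = [
    'application/javascript',
    'application/json',
    'application/xml',
    'application/xhtml+xml'
  ];
  return mime.startsWith('text/') || textLike.includes(mime);
}

console.log(isTextContentType('text/html; charset=Shift_JIS')); // true
console.log(isTextContentType('image/png'));                    // false
```

In the request callback, res.headers['content-type'] would be the value to pass in; responses that fail the check can skip the detection and conversion steps.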