
nodejs encoding using request

I am trying to get the correct encoding with request.

request.get({
    "uri":'http://www.bold.dk/tv/',
    "encoding": "text/html;charset='charset=utf-8'"
  },
  function(err, resp, body){    
    console.log(body);
  }
);

No matter what I do, the Danish characters are not encoded correctly.

Any thoughts?

You can use iconv-lite to convert this. You also need to tell request not to decode the body with its default encoding of UTF-8, by setting the encoding option to null; the body is then returned as a raw Buffer. Therefore you should do:

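var request = require('request');
var iconv = require('iconv-lite');

request.get({
    uri: 'http://www.bold.dk/tv/',
    encoding: null // return the body as a raw Buffer, not a UTF-8 string
  },
  function(err, resp, body){
    if (err) { return console.error(err); }
    // Decode the raw bytes with the charset the page actually uses
    var bodyWithCorrectEncoding = iconv.decode(body, 'iso-8859-1');
    console.log(bodyWithCorrectEncoding);
  }
);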
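
To see why the default UTF-8 decoding garbles the Danish characters, here is a minimal sketch that uses only Node's built-in Buffer. The three sample bytes are an assumption for illustration (they are "æøå" in ISO-8859-1), and Node's 'latin1' encoding stands in for iconv-lite's 'iso-8859-1'.

```javascript
// Bytes for "æøå" as an ISO-8859-1 server would send them (one byte per character).
const rawBody = Buffer.from([0xe6, 0xf8, 0xe5]);

// Decoding as UTF-8 cannot interpret these bytes, so replacement characters appear.
const asUtf8 = rawBody.toString('utf8');

// 'latin1' (ISO-8859-1) maps each byte to the code point with the same value.
const asLatin1 = rawBody.toString('latin1');

console.log(JSON.stringify(asUtf8));
console.log(asLatin1);
```

This is why encoding: null matters: once request has already turned the bytes into a UTF-8 string, the original byte values are lost and cannot be re-decoded.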

Maybe your trouble is with the 'Accept-Encoding' header. Say you send a header like 'Accept-Encoding': 'gzip,deflate'; the server may then return a compressed body.

If so, you have two ways to fix this:

  1. Remove this header.
  2. Use the following code to unzip the data:

     const zlib = require('zlib')

     // `request` and `options` are assumed to be defined elsewhere; the callback
     // receives the response stream, as with Node's http.request.
     const req = request(options, res => {
       let buffers = []
       let bufferLength = 0
       let strings = []

       // Collect each chunk, whether it arrives as a Buffer or a string
       const getData = chunk => {
         if (!Buffer.isBuffer(chunk)) {
           strings.push(chunk)
         } else if (chunk.length) {
           bufferLength += chunk.length
           buffers.push(chunk)
         }
       }

       // Assemble the collected chunks into the final body
       const endData = () => {
         let response = {code: 200, body: ''}
         if (bufferLength) {
           response.body = Buffer.concat(buffers, bufferLength)
           if (options.encoding !== null) {
             response.body = response.body.toString(options.encoding)
           }
           buffers = []
           bufferLength = 0
         } else if (strings.length) {
           // Strip a leading byte order mark (BOM) if present
           if (options.encoding === 'utf8' && strings[0].length > 0 && strings[0][0] === '\uFEFF') {
             strings[0] = strings[0].substring(1)
           }
           response.body = strings.join('')
         }
         console.log('response', response)
       }

       switch (res.headers['content-encoding']) {
         // or, just use zlib.createUnzip() to handle both cases
         case 'gzip':
           res.pipe(zlib.createGunzip())
             .on('data', getData)
             .on('end', endData)
           break
         case 'deflate':
           res.pipe(zlib.createInflate())
             .on('data', getData)
             .on('end', endData)
           break
         default:
           // No content-encoding: the body is not compressed, so read it directly
           res.on('data', getData)
             .on('end', endData)
           break
       }
     })
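
The comment about handling both cases with a single unzip step can be sketched as follows, using only Node's built-in zlib module. The sample text and the helper name decompressBody are hypothetical, and the synchronous API is used only to keep the example short; a real response would normally be piped through zlib.createUnzip() as a stream.

```javascript
const zlib = require('zlib');

// Hypothetical sample body, compressed the two ways a server might send it.
const original = 'Danske tegn: \u00e6\u00f8\u00e5';
const gzipped = zlib.gzipSync(Buffer.from(original, 'utf8'));
const deflated = zlib.deflateSync(Buffer.from(original, 'utf8'));

// Hypothetical helper: decompress only when the Content-Encoding header says so.
// zlib.unzipSync auto-detects gzip versus deflate from the data's header.
function decompressBody(body, contentEncoding) {
  if (contentEncoding === 'gzip' || contentEncoding === 'deflate') {
    return zlib.unzipSync(body);
  }
  return body; // no compression declared: pass the bytes through untouched
}

const fromGzip = decompressBody(gzipped, 'gzip').toString('utf8');
const fromDeflate = decompressBody(deflated, 'deflate').toString('utf8');
const fromPlain = decompressBody(Buffer.from(original, 'utf8'), undefined).toString('utf8');

console.log(fromGzip === original, fromDeflate === original, fromPlain === original);
```

Decompression has to happen before the bytes are decoded into a string; decoding the compressed bytes directly yields garbage regardless of the charset chosen.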

I had the same problem with request v2.88.0.

Referring to woolfi makkinan's answer, I found a simple way to solve the problem.

request.get({
    "uri": 'http://www.bold.dk/tv/',
    "encoding": "text/html;charset='charset=utf-8'",
    "gzip": true // notice this config
  },
  function(err, resp, body){    
    console.log(body);
  }
);

Add gzip: true to the request options. request will then handle the gzip decompression itself, and the response body can be converted to a string correctly.

