Converting a larger byte array to a string

When N is set to 125K the following works:

let N = 125000
let x = [...Array(N)].map(( xx,i) => i)
let y = String.fromCodePoint(...x)
console.log(y.length)

When N is set to 128K that same code breaks:

Uncaught RangeError: Maximum call stack size exceeded

This is a common operation: what is the optimal way to achieve the conversion?

Note that I did look at this related Q&A: https://stackoverflow.com/a/3195961/1056563. We should not depend on node.js, and the approaches using fromCharCode.apply also fail. Finally, that answer is nearly ten years old.

So what is an up-to-date, performant way to handle this conversion?

The problem arises because implementations limit the number of arguments a single function call can accept. An exception is raised when too many arguments (over ~128k in this case) are passed to String.fromCodePoint via the spread operator.
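To see the limit in a given environment, a small probe along these lines may help. It is only a sketch: trySpreadConversion is a hypothetical helper name, the input is restricted to ASCII code points for simplicity, and the size at which the call starts failing (if it fails at all) depends on the engine and its stack settings.

```javascript
// Hypothetical helper: attempts the one-shot spread conversion and reports the outcome.
function trySpreadConversion(n) {
  // Assumption: ASCII-only code points, purely for illustration.
  const codePoints = Array.from({ length: n }, (_, i) => i % 0x80);
  try {
    return { ok: true, length: String.fromCodePoint(...codePoints).length };
  } catch (err) {
    // Engines typically raise a RangeError when the argument list is too large.
    return { ok: false, error: err.name };
  }
}

console.log(trySpreadConversion(1000));    // small input: expected to succeed
console.log(trySpreadConversion(1000000)); // large input: may fail, depending on the engine
```

Catching the error makes it possible to fall back to a batched conversion instead of crashing.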

One way to solve this problem relatively efficiently, albeit with slightly more code, is to batch the operation across multiple calls. Here is my proposed implementation, which fixes what I perceive as issues relating to scaling performance. (An earlier version of this sentence also mentioned the handling of surrogate pairs; that was incorrect: fromCodePoint doesn't care about surrogates, making it preferable to fromCharCode in such cases.)

let N = 500 * 1000;
let A = [...Array(N)].map((x,i) => i); // start with "an array".

function codePointsToString(cps) {
  let rs = [];
  let batch = 32767; // Supported 'all' browsers
  for (let i = 0; i < cps.length; ){
    let e = i + batch;
    // Build batch section, defer to Array.join.
    rs.push(String.fromCodePoint.apply(null, cps.slice(i, e)));
    i = e;
  }
  return rs.join('');
}

var result = codePointsToString(A);
console.log(result.length);

Also, I wanted a trophy. The code above should run in O(n) time and minimize the number of objects allocated. There are no guarantees that this is the 'best' approach. A benefit of batching, and the reason the cost of apply (or spread invocation) is amortized, is that there are significantly fewer calls to String.fromCodePoint and fewer intermediate strings. YMMV, especially across environments.
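The remark about surrogates can be illustrated with a single code point above U+FFFF. This is a minimal sketch; the constant name GRINNING_FACE is only chosen for the example.

```javascript
const GRINNING_FACE = 0x1F600; // a code point above U+FFFF

const viaCodePoint = String.fromCodePoint(GRINNING_FACE);
const viaCharCode = String.fromCharCode(GRINNING_FACE);

console.log(viaCodePoint.length);                           // 2 (a surrogate pair)
console.log(viaCodePoint.codePointAt(0) === GRINNING_FACE); // true
console.log(viaCharCode.length);                            // 1
console.log(viaCharCode === "\uF600");                      // true: the value was truncated to 16 bits
```

If the input really is a byte array (values 0 to 255), both functions give the same result; the difference only matters for values above 0xFFFF.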

Here is an online benchmark. All tests have access to, and use, the same generated "A" array of 500k elements.

The given answers perform poorly: I measured 19 seconds on one of them, and the others are similar (*). It is necessary to preallocate the output array. The following takes 20 to 40 milliseconds, roughly three orders of magnitude faster.

function wordArrayToByteArray(hash) {
    // preallocate the output array
    var result = [...Array(hash.sigBytes)].map(x => -1)
    // count down locally so the caller's hash.sigBytes is left unchanged
    let remaining = hash.sigBytes
    let words = hash.words
        //map each word to an array of bytes
        .map(function (v) {
            // create an array of 4 bytes (less if we have run out of significant bytes)
            var bytes = [0, 0, 0, 0].slice(0, Math.min(4, remaining))
                // grab that section of the 4 byte word
                .map(function (d, i) {
                    return (v >>> (8 * i)) % 256;
                })
                // flip that
                .reverse()
            ;
            // remove the bytes we've processed
            // from the bytes we need to process
            remaining -= bytes.length;
            return bytes;
        })
    // copy each word's bytes into its slot in the preallocated array
    words.forEach((w,i) => {
        result.splice(i * 4, 4, ...w)
    })
    // note: despite the function name, the return value is a string
    result = result.map(function (d) {
        return String.fromCharCode(d);
    }).join('')
    return result
}

(*) With the possible exception of @User2864740 - we are awaiting his numbers. But his solution also uses apply() inside the loop, which leads me to believe it will also be slow.
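Rather than guessing, the two styles can be timed locally. The following is only a rough sketch: batchedConvert, loopConvert and timeIt are hypothetical names, a single run with Date.now() is a crude measurement, and the numbers will vary across engines and machines.

```javascript
// Batched conversion: one apply() call per batch of code points.
function batchedConvert(cps, batch = 32767) {
  const parts = [];
  for (let i = 0; i < cps.length; i += batch) {
    parts.push(String.fromCodePoint.apply(null, cps.slice(i, i + batch)));
  }
  return parts.join('');
}

// Per-element conversion: one call per code point, concatenating as we go.
function loopConvert(cps) {
  let out = '';
  for (let i = 0; i < cps.length; i++) {
    out += String.fromCodePoint(cps[i]);
  }
  return out;
}

// Hypothetical helper: runs fn once, logs the elapsed time, and returns fn's result.
function timeIt(label, fn) {
  const start = Date.now();
  const result = fn();
  console.log(label, (Date.now() - start) + ' ms', 'length', result.length);
  return result;
}

const input = Array.from({ length: 500 * 1000 }, (_, i) => i);
const a = timeIt('batched', () => batchedConvert(input));
const b = timeIt('loop', () => loopConvert(input));
console.log(a === b); // both approaches should produce the same string
```

Running each variant several times and discarding the first run gives steadier numbers than a single pass.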

"Old-fashioned" JavaScript:

var N=125000;
var y="";
for(var i=0; i<N; i++)
  y+=String.fromCharCode(i);
console.log(y.length);

This worked with N=1000000.
