Most efficient way to store large arrays of integers in localStorage with Javascript

*"Efficient" here basically means smaller size (to reduce the IO waiting time) and fast retrieval/deserialization. Storing time is not as important.

I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage -- that is, as a string.

Obviously, the simplest method is to just JSON.stringify it. However, that adds a lot of unnecessary information, considering that the range of the data is well known. The average size of one of these arrays is then ~5500 bytes.

Here are some other methods I've tried (the resulting sizes, and the times to deserialize 1000 times, are listed at the end):

  • zero-padding the numbers so each is 2 characters long, e.g.:

     [5, 27, 7, 38] ==> "05270738" 
  • base 50 encoding it:

     [5, 11, 7, 38] ==> "5b7C" 
  • just using the value as a character code (adding 32 to avoid the weird control characters at the start):

     [5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...) 
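
For reference, the three encodings listed above could be sketched roughly as follows. The function names and the 62-character alphabet are assumptions chosen so that the examples above come out the same; the original implementations are not shown in the question.

```javascript
// Assumed alphabet for the "base 50" variant: digits, lowercase, uppercase.
// It has 62 symbols, which covers every value from 0 through 50.
const ALPHABET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

// Zero-padded: two decimal digits per value.
function encodePadded(values) {
  return values.map(v => String(v).padStart(2, "0")).join("");
}
function decodePadded(text) {
  const out = [];
  for (let i = 0; i < text.length; i += 2) {
    out.push(parseInt(text.slice(i, i + 2), 10));
  }
  return out;
}

// One alphabet character per value.
function encodeAlphabet(values) {
  return values.map(v => ALPHABET[v]).join("");
}
function decodeAlphabet(text) {
  const out = new Array(text.length);
  for (let i = 0; i < text.length; i++) {
    out[i] = ALPHABET.indexOf(text[i]);
  }
  return out;
}

// Character code: value + 32, so every character is printable ASCII.
function encodeCharCodes(values) {
  return values.map(v => String.fromCharCode(v + 32)).join("");
}
function decodeCharCodes(text) {
  const out = new Array(text.length);
  for (let i = 0; i < text.length; i++) {
    out[i] = text.charCodeAt(i) - 32;
  }
  return out;
}
```

The charCode variant decodes with a single subtraction per character, which may be part of why it deserializes so quickly in Chrome.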

Here are my results:

                  size     Chrome 18   Firefox 11
-------------------------------------------------
JSON.stringify    5286          60ms         99ms
zero-padded       3600         354ms        703ms
base 50           1800         315ms        400ms
charCodes         1800          21ms        178ms

My question is whether there is an even better method I haven't yet considered.

Update
MДΓΓБДLL suggested using compression on the data. I combined this LZW implementation with the base 50 and charCode data, and I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:

                  size     Chrome 18   Firefox 11
-------------------------------------------------
LZW base 50       1103         494ms        999ms
LZW charCodes     1103         194ms        882ms
bitpacking        1350        2395ms        331ms
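
The LZW implementation linked above is not reproduced here. As a rough illustration of the technique only, a minimal textbook LZW over strings might look like the sketch below. The function names are hypothetical, and it assumes input characters with codes below 256 and inputs short enough that dictionary codes stay below 0xD800, so each code fits in one ordinary UTF-16 character.

```javascript
// Minimal textbook LZW sketch (not the implementation benchmarked above).
// Assumes every input character has a char code below 256.
function lzwCompress(input) {
  const dict = new Map();
  for (let i = 0; i < 256; i++) dict.set(String.fromCharCode(i), i);
  let nextCode = 256;
  let phrase = "";
  let out = "";
  for (let i = 0; i < input.length; i++) {
    const candidate = phrase + input[i];
    if (dict.has(candidate)) {
      phrase = candidate;                       // keep extending the match
    } else {
      out += String.fromCharCode(dict.get(phrase));
      dict.set(candidate, nextCode++);          // learn the new phrase
      phrase = input[i];
    }
  }
  if (phrase !== "") out += String.fromCharCode(dict.get(phrase));
  return out;
}

function lzwDecompress(packed) {
  if (packed === "") return "";
  const dict = [];
  for (let i = 0; i < 256; i++) dict[i] = String.fromCharCode(i);
  let previous = dict[packed.charCodeAt(0)];
  let output = previous;
  for (let i = 1; i < packed.length; i++) {
    const code = packed.charCodeAt(i);
    // A code equal to dict.length refers to the phrase being defined right now.
    const entry = code < dict.length ? dict[code] : previous + previous[0];
    output += entry;
    dict.push(previous + entry[0]);
    previous = entry;
  }
  return output;
}
```

Decompression has to rebuild the dictionary on every call, which is consistent with the LZW rows deserializing more slowly than the plain charCode encoding.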

If your range is 0-50, then you can pack 4 numbers into 3 bytes (6 bits per number). This would allow you to store 1800 numbers using ~1350 bytes. This code should do it:

window._firstChar = 48;

window.decodeArray = function(encodedText) {
    var result = [];
    var temp = [];

    for (var index = 0; index < encodedText.length; index += 3) {
        //skipping bounds checking because the encoded text is assumed to be valid
        var firstChar = encodedText.charAt(index).charCodeAt() - _firstChar;
        var secondChar = encodedText.charAt(index + 1).charCodeAt() - _firstChar;
        var thirdChar = encodedText.charAt(index + 2).charCodeAt() - _firstChar;

        temp.push((firstChar >> 2) & 0x3F);    //6 bits, 'a'
        temp.push(((firstChar & 0x03) << 4) | ((secondChar >> 4) & 0xF));  //2 bits + 4 bits, 'b'
        temp.push(((secondChar & 0x0F) << 2) | ((thirdChar >> 6) & 0x3));  //4 bits + 2 bits, 'c'
        temp.push(thirdChar & 0x3F);  //6 bits, 'd'

    }

    //filter out 'padding' numbers, if present; this is an extremely inefficient way to do it
    for (var index = 0; index < temp.length; index++) {
        if(temp[index] != 63) {
            result.push(temp[index]);
        }            
    }

    return result;
};

window.encodeArray = function(array) {
    var encodedData = [];

    for (var index = 0; index < array.length; index += 4) {
        //pad an incomplete final group with 63, which decodeArray filters out
        var num1 = array[index];
        var num2 = index + 1 < array.length ? array[index + 1] : 63;
        var num3 = index + 2 < array.length ? array[index + 2] : 63;
        var num4 = index + 3 < array.length ? array[index + 3] : 63;

        encodeSet(num1, num2, num3, num4, encodedData);
    }

    //join into a single string so it can be stored and passed to decodeArray
    return encodedData.join("");
};

window.encodeSet = function(a, b, c, d, outArray) {
    //we can encode 4 numbers in 3 bytes
    var firstChar = ((a & 0x3F) << 2) | ((b >> 4) & 0x03);   //6 bits for 'a', 2 from 'b'
    var secondChar = ((b & 0x0F) << 4) | ((c >> 2) & 0x0F);  //remaining 4 bits from 'b', 4 from 'c'
    var thirdChar = ((c & 0x03) << 6) | (d & 0x3F);          //remaining 2 bits from 'c', 6 bits for 'd'

    //add _firstChar so that all values map to a printable character
    outArray.push(String.fromCharCode(firstChar + _firstChar));
    outArray.push(String.fromCharCode(secondChar + _firstChar));
    outArray.push(String.fromCharCode(thirdChar + _firstChar));
};

Here's a quick example: http://jsfiddle.net/NWyBx/1

Note that storage size can likely be further reduced by applying gzip compression to the resulting string.

Alternatively, if the ordering of your numbers is not significant, you can simply do a bucket sort using 51 buckets (assuming 0-50 includes both 0 and 50 as valid numbers) and store the count for each bucket instead of the numbers themselves. That would likely give you better compression and efficiency than any other approach.
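
A minimal sketch of the bucket idea, assuming order really does not matter and every value is an integer from 0 to 50 (the function names are hypothetical):

```javascript
// Count how often each value occurs; index = value, entry = count.
function countBuckets(values, maxValue = 50) {
  const counts = new Array(maxValue + 1).fill(0);
  for (const v of values) counts[v]++;
  return counts;
}

// Rebuild a sorted array from the counts. The original order is lost.
function expandBuckets(counts) {
  const out = [];
  counts.forEach((count, value) => {
    for (let i = 0; i < count; i++) out.push(value);
  });
  return out;
}
```

For example, `localStorage.setItem(key, JSON.stringify(countBuckets(values)))` would store 51 counts regardless of how many values the array holds.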

Assuming (as in your test) that compression costs more time than the size reduction saves you, your char encoding is the smallest you'll get without bit shifting. You're currently using one byte for each number, but if they're guaranteed to be small enough you could put two numbers in each byte. That would probably be an over-optimization, unless this is a very hot piece of your code.
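
To illustrate the "two numbers per byte" idea, here is a rough sketch that assumes every value fits in 4 bits (0-15). Values up to 50 need 6 bits, so this would not work on the question's data as it stands. The names, the offset of 32, and the explicit `count` argument are all choices made for this example.

```javascript
const OFFSET = 32; // same trick as the charCode encoding: skip control characters

// Pack two 4-bit values (0-15 each) into one character.
function packPairs(values) {
  let out = "";
  for (let i = 0; i < values.length; i += 2) {
    const high = values[i];
    const low = i + 1 < values.length ? values[i + 1] : 0; // pad odd lengths with 0
    out += String.fromCharCode(((high << 4) | low) + OFFSET);
  }
  return out;
}

// Unpack; `count` is the original number of values, needed to drop the padding.
function unpackPairs(text, count) {
  const out = [];
  for (let i = 0; i < text.length; i++) {
    const packed = text.charCodeAt(i) - OFFSET;
    out.push(packed >> 4, packed & 0x0f);
  }
  return out.slice(0, count);
}
```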

You might want to consider using Uint8Array or ArrayBuffer. This blogpost shows how it's done. Copying his logic, here's an example, assuming you have an existing Uint8Array named arr.

function arrayBufferToBinaryString(buffer, cb) {
    // BlobBuilder has been removed from browsers; the Blob constructor replaces it
    var blob = new Blob([buffer]);
    var reader = new FileReader();
    reader.onload = function (e) {
        // reader.result holds one character per byte of the buffer
        cb(reader.result);
    };
    reader.readAsBinaryString(blob);
}
arrayBufferToBinaryString(arr.buffer, function(s) { 
  // do something with s
});
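
If the asynchronous FileReader round trip is not needed, a plain loop can do the same conversion synchronously. This is a different technique from the blog post's, shown only as a sketch; the function names are hypothetical.

```javascript
// Turn each byte into one character with the same code (0-255).
function bytesToBinaryString(bytes) {
  let s = "";
  for (let i = 0; i < bytes.length; i++) {
    s += String.fromCharCode(bytes[i]);
  }
  return s;
}

// Inverse: read each character code back into a byte.
function binaryStringToBytes(s) {
  const bytes = new Uint8Array(s.length);
  for (let i = 0; i < s.length; i++) {
    bytes[i] = s.charCodeAt(i);
  }
  return bytes;
}
```

Note that the resulting string can contain control characters, which the question's charCode encoding avoided by adding 32.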
