Most efficient way to store large arrays of integers in localStorage with Javascript
"Efficient" here basically means smaller size (to reduce the IO waiting time) and speedy retrieval/deserialization times. Storing times are not as important.
I have to store a couple of dozen arrays of integers, each with 1800 values in the range 0-50, in the browser's localStorage (that is, as a string).
Obviously, the simplest method is to just JSON.stringify it; however, that adds a lot of unnecessary information, considering that the range of the data is well known. An average size for one of these arrays is then ~5500 bytes.
Here are some other methods I've tried (resultant size, and time to deserialize it 1000 times, are listed at the end):
zero-padding the numbers so each was 2 characters long, e.g.:
    [5, 27, 7, 38] ==> "05270738"
base 50 encoding it:
    [5, 11, 7, 38] ==> "5b7C"
just using the value as a character code (adding 32 to avoid the weird control characters at the start):
    [5, 11, 7, 38] ==> "%+'F" (String.fromCharCode(37), String.fromCharCode(43) ...)
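For reference, the character-code method can be sketched roughly as follows. This is only a minimal illustration, assuming every value lies in 0-50; the helper names `encodeCharCodes` and `decodeCharCodes` and the constant `OFFSET` are made up here and are not taken from the code that was benchmarked.

```javascript
// Offset added to every value so the output avoids control characters (32, as in the question).
const OFFSET = 32;

// Hypothetical helper: one character per value.
function encodeCharCodes(values) {
  let out = "";
  for (let i = 0; i < values.length; i++) {
    out += String.fromCharCode(values[i] + OFFSET);
  }
  return out;
}

// Hypothetical helper: reverse of encodeCharCodes.
function decodeCharCodes(text) {
  const values = new Array(text.length);
  for (let i = 0; i < text.length; i++) {
    values[i] = text.charCodeAt(i) - OFFSET;
  }
  return values;
}

const encoded = encodeCharCodes([5, 11, 7, 38]); // "%+'F"
const decoded = decodeCharCodes(encoded);        // [5, 11, 7, 38]
```

The encoded string is what would be passed to localStorage.setItem; decoding is a single pass over the string, which is probably why this variant deserializes quickly.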
Here are my results:
                      size     Chrome 18   Firefox 11
    -------------------------------------------------
    JSON.stringify    5286          60ms         99ms
    zero-padded       3600         354ms        703ms
    base 50           1800         315ms        400ms
    charCodes         1800          21ms        178ms
My question is whether there is an even better method I haven't yet considered.
Update
MДΓΓБДLL suggested using compression on the data, so I combined this LZW implementation with the base 50 and charCode data. I also tested aroth's code (packing 4 integers into 3 bytes). I got these results:
                      size     Chrome 18   Firefox 11
    -------------------------------------------------
    LZW base 50       1103         494ms        999ms
    LZW charCodes     1103         194ms        882ms
    bitpacking        1350        2395ms        331ms
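To give an idea of what the LZW step involves, here is a generic LZW sketch. It is not necessarily the implementation used for the numbers above; the function names are hypothetical, and it assumes the input characters have codes below 256 and that the dictionary stays small enough for each output code to fit in one string character (true for an 1800-character input).

```javascript
// Generic LZW over a string; each output code is stored as one character.
// Assumption: input character codes are below 256.
function lzwCompress(text) {
  if (text.length === 0) return "";
  const dict = new Map();
  for (let i = 0; i < 256; i++) dict.set(String.fromCharCode(i), i);
  let nextCode = 256;
  let phrase = text[0];
  let out = "";
  for (let i = 1; i < text.length; i++) {
    const ch = text[i];
    if (dict.has(phrase + ch)) {
      phrase += ch; // keep extending the current match
    } else {
      out += String.fromCharCode(dict.get(phrase));
      dict.set(phrase + ch, nextCode++); // learn the new phrase
      phrase = ch;
    }
  }
  return out + String.fromCharCode(dict.get(phrase));
}

function lzwDecompress(packed) {
  if (packed.length === 0) return "";
  const dict = [];
  for (let i = 0; i < 256; i++) dict[i] = String.fromCharCode(i);
  let previous = dict[packed.charCodeAt(0)];
  let out = previous;
  for (let i = 1; i < packed.length; i++) {
    const code = packed.charCodeAt(i);
    // A code equal to dict.length refers to the entry that is about to be added.
    const entry = code < dict.length ? dict[code] : previous + previous[0];
    out += entry;
    dict.push(previous + entry[0]);
    previous = entry;
  }
  return out;
}

const sample = "%+'F".repeat(50);        // stand-in for a charCode-encoded array
const packed = lzwCompress(sample);
const unpacked = lzwDecompress(packed);  // same string as sample
```

The decoder has to rebuild the dictionary on every read, which fits the slower deserialization times measured for the LZW variants.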
If your range is 0-50, then every number fits in 6 bits (2^6 = 64), so you can pack 4 numbers into 3 bytes. This would allow you to store 1800 numbers using ~1350 bytes. This code should do it:
    window._firstChar = 48;

    window.decodeArray = function(encodedText) {
        var result = [];
        var temp = [];
        for (var index = 0; index < encodedText.length; index += 3) {
            //skipping bounds checking because the encoded text is assumed to be valid
            var firstChar = encodedText.charCodeAt(index) - _firstChar;
            var secondChar = encodedText.charCodeAt(index + 1) - _firstChar;
            var thirdChar = encodedText.charCodeAt(index + 2) - _firstChar;

            temp.push((firstChar >> 2) & 0x3F);                                  //6 bits, 'a'
            temp.push(((firstChar & 0x03) << 4) | ((secondChar >> 4) & 0xF));    //2 bits + 4 bits, 'b'
            temp.push(((secondChar & 0x0F) << 2) | ((thirdChar >> 6) & 0x3));    //4 bits + 2 bits, 'c'
            temp.push(thirdChar & 0x3F);                                         //6 bits, 'd'
        }

        //filter out 'padding' numbers, if present; this is an extremely inefficient way to do it
        for (var index = 0; index < temp.length; index++) {
            if (temp[index] != 63) {
                result.push(temp[index]);
            }
        }

        return result;
    };

    window.encodeArray = function(array) {
        var encodedData = [];
        //use 63 as padding when the length is not a multiple of 4; it can never be a real value in 0-50
        for (var index = 0; index < array.length; index += 4) {
            var num1 = array[index];
            var num2 = index + 1 < array.length ? array[index + 1] : 63;
            var num3 = index + 2 < array.length ? array[index + 2] : 63;
            var num4 = index + 3 < array.length ? array[index + 3] : 63;

            encodeSet(num1, num2, num3, num4, encodedData);
        }

        //join the characters so the result is a string that decodeArray (and localStorage) can take directly
        return encodedData.join('');
    };

    window.encodeSet = function(a, b, c, d, outArray) {
        //we can encode 4 numbers in 3 bytes
        var firstChar = ((a & 0x3F) << 2) | ((b >> 4) & 0x03);   //6 bits for 'a', 2 from 'b'
        var secondChar = ((b & 0x0F) << 4) | ((c >> 2) & 0x0F);  //remaining 4 bits from 'b', 4 from 'c'
        var thirdChar = ((c & 0x03) << 6) | (d & 0x3F);          //remaining 2 bits from 'c', 6 bits for 'd'

        //add _firstChar so that no value maps to a control character
        outArray.push(String.fromCharCode(firstChar + _firstChar));
        outArray.push(String.fromCharCode(secondChar + _firstChar));
        outArray.push(String.fromCharCode(thirdChar + _firstChar));
    };
Here's a quick example: http://jsfiddle.net/NWyBx/1
Note that storage size can likely be further reduced by applying gzip compression to the resulting string.
Alternately, if the ordering of your numbers is not significant, then you can simply do a bucket-sort using 51 buckets (assuming 0-50 includes both 0 and 50 as valid numbers) and store the counts for each bucket instead of the numbers themselves. That would likely give you better compression and efficiency than any other approach.
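A rough sketch of the bucket idea, assuming the order of the values really can be thrown away. The helper names `toCounts` and `fromCounts` are hypothetical, and the comma-separated string is just one possible way to store the 51 counts.

```javascript
// Hypothetical helper: count how often each value 0..50 occurs. The original order is lost.
function toCounts(values) {
  const counts = new Array(51).fill(0);
  for (const v of values) counts[v]++;
  return counts;
}

// Hypothetical helper: rebuild a sorted array from the 51 counts.
function fromCounts(counts) {
  const values = [];
  counts.forEach((count, value) => {
    for (let i = 0; i < count; i++) values.push(value);
  });
  return values;
}

const counts = toCounts([5, 27, 5, 38, 0]);
const stored = counts.join(",");                            // string form for localStorage
const restored = fromCounts(stored.split(",").map(Number)); // [0, 5, 5, 27, 38]
```

With this layout the stored string holds 51 numbers regardless of how many values the array had, at the cost of getting the values back in sorted order only.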
Assuming (as in your test) that compression takes more time than the size reduction saves you, your char encoding is the smallest you'll get without bitshifting. You're currently using one byte for each number, but if they're guaranteed to be small enough you could put two numbers in each byte. That would probably be an over-optimization, unless this is a very hot piece of your code.
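To make "two numbers in each byte" concrete: it only works when every value fits in 4 bits, i.e. stays in 0-15. Values up to 50 need 6 bits, so this does not apply to the full 0-50 range. The sketch below assumes values in 0-15 and an even number of them; the function names are hypothetical.

```javascript
// Hypothetical helper: pack two 4-bit values (0..15) into each character code (0..255).
function packNibbles(values) {
  if (values.length % 2 !== 0) {
    throw new Error("this sketch assumes an even number of values");
  }
  let out = "";
  for (let i = 0; i < values.length; i += 2) {
    out += String.fromCharCode((values[i] << 4) | values[i + 1]);
  }
  return out;
}

// Hypothetical helper: split each character code back into its high and low 4 bits.
function unpackNibbles(text) {
  const values = [];
  for (let i = 0; i < text.length; i++) {
    const code = text.charCodeAt(i);
    values.push(code >> 4, code & 0x0f);
  }
  return values;
}

const nibblePacked = packNibbles([1, 2, 15, 0]);   // 2 characters for 4 values
const nibbleValues = unpackNibbles(nibblePacked);  // [1, 2, 15, 0]
```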
You might want to consider using Uint8Array or ArrayBuffer. This blogpost shows how it's done. Copying his logic, here's an example, assuming you have an existing Uint8Array named arr.
    function arrayBufferToBinaryString(buffer, cb) {
        // BlobBuilder has been removed from browsers; the Blob constructor replaces it
        var blob = new Blob([buffer]);

        var reader = new FileReader();
        reader.onload = function (e) {
            // reader.result is a string with one character per byte
            cb(reader.result);
        };
        reader.readAsBinaryString(blob);
    }

    arrayBufferToBinaryString(arr.buffer, function(s) {
        // do something with s
    });
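For small arrays like these, the same conversion can probably also be done synchronously without FileReader. The following is only a sketch under that assumption; the helper names `bytesToBinaryString` and `binaryStringToBytes` are hypothetical and do not come from the blogpost.

```javascript
// Hypothetical helper: Uint8Array -> string with one character per byte.
function bytesToBinaryString(bytes) {
  const CHUNK = 8192; // keep the argument list passed to fromCharCode small
  let out = "";
  for (let i = 0; i < bytes.length; i += CHUNK) {
    out += String.fromCharCode.apply(null, bytes.subarray(i, i + CHUNK));
  }
  return out;
}

// Hypothetical helper: string with one character per byte -> Uint8Array.
function binaryStringToBytes(text) {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) {
    bytes[i] = text.charCodeAt(i);
  }
  return bytes;
}

const sourceBytes = new Uint8Array([0, 50, 255]);
const asString = bytesToBinaryString(sourceBytes);
const roundTripped = binaryStringToBytes(asString); // Uint8Array [0, 50, 255]
```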