Serializing Array of Many Duplicates
I have a series of arrays, each 2500 elements long, and I need to serialize and store all of them in very limited space.
Since they contain many duplicates, I want to collapse runs of repeated values into something like the following:
[0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0]
// to
[0x4,2,7,3x2,0x9]
I wrote a couple of one-liners (using Lodash's _.repeat) to convert to and from this pattern, but converting to it doesn't seem to work in most, if not all, cases.
let serialized = array.toString().replace(/((?:(\d)+,?)((?:\2+,?){2,}))/g, (m, p1, p2) => p2 + 'x' + m.replace(/,/g, '').length);
let parsed = serialized.replace(/(\d+)x(\d+),?/g, (z, p1, p2) => _.repeat(p1 + ',', +p2)).split(',');
I don't know why it doesn't work. It may be due to some of the numbers in the array. From eyeballing the data, the largest value is 4294967295, but well over 90% of the values are just 0.
What am I missing in my RegEx that's preventing it from working correctly? Is there a simpler way that I'm too blind to see?
I'm fairly confident about converting it back from the serialized state; I just need a hand getting it into that state.
Straightforward and simple serialization:
let serialize = arr => {
  const elements = []; // distinct run values, in order
  const counts = [];   // length of each run
  let last = undefined;
  // Walk the input array and group consecutive equal values into runs.
  arr.forEach(el => {
    if (el !== last) {
      elements.push(el);
      counts.push(1);
    } else {
      counts[counts.length - 1]++;
    }
    last = el;
  });
  return elements.map((a, i) => counts[i] > 1 ? `${a}x${counts[i]}` : a).join(",");
};
console.log(serialize([0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0])); // "0x4,2,7,3x2,0x9"
UPDATE
A purely functional serializer:
let serialize = arr =>
  arr
    .reduce((memo, element, i) => {
      // Start a new run whenever the value differs from the previous one.
      if (element !== arr[i - 1]) {
        memo.push({count: 1, element});
      } else {
        memo[memo.length - 1].count++;
      }
      return memo;
    }, [])
    .map(({count, element}) => count > 1 ? `${count}x${element}` : element)
    .join(",");
console.log(serialize([0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0])); // "4x0,2,7,2x3,9x0"
A purely functional deserializer:
const deserialize = str =>
  str
    .split(",")
    // "4x0" becomes ["0", "4"]; a bare "2" becomes ["2"], so count defaults to 1.
    .map(c => c.split("x").reverse())
    .reduce((memo, [el, count = 1]) => memo.concat(Array(+count).fill(+el)), []);
console.log(deserialize("4x0,2,7,2x3,9x0"));
To avoid the .reverse() call in this logic, I'd recommend changing the serialized form from 4x0 to 0x4.
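A minimal sketch of that suggestion, assuming the value-first form (0x4) from the question. The names serializeValueFirst and deserializeValueFirst are hypothetical and do not appear in the answer above. With the value first, destructuring yields the element and an optional count directly, so no .reverse() is needed.

```javascript
// Hypothetical helper: run-length encode into "value" or "valuexcount" tokens.
const serializeValueFirst = arr =>
  arr
    .reduce((runs, element, i) => {
      if (i > 0 && element === arr[i - 1]) {
        runs[runs.length - 1].count++;       // extend the current run
      } else {
        runs.push({element, count: 1});      // start a new run
      }
      return runs;
    }, [])
    .map(({element, count}) => (count > 1 ? `${element}x${count}` : `${element}`))
    .join(",");

// Hypothetical helper: the inverse of serializeValueFirst for numeric arrays.
const deserializeValueFirst = str =>
  str === ""
    ? []
    : str.split(",").flatMap(token => {
        const [el, count = 1] = token.split("x"); // a bare value means a run of 1
        return Array(+count).fill(+el);
      });

const sample = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0];
const encoded = serializeValueFirst(sample);
console.log(encoded); // "0x4,2,7,3x2,0x9"
console.log(deserializeValueFirst(encoded));
```

Because each token is split on "x" before being converted with unary plus, multi-digit values such as 4294967295 pass through unchanged.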
Try this:
var arr = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0];
var finalArray = [];   // array into which count of values will go
var currentValue = ""; // current value for comparison
var tmpArr = [];       // temporary array to hold values
arr.forEach(function(val, index) {
  // When the value changes, emit the finished run and start a new one.
  if (val != currentValue && currentValue !== "") {
    finalArray.push(tmpArr.length + "x" + tmpArr[0]);
    tmpArr = [];
  }
  tmpArr.push(val);
  currentValue = val;
});
finalArray.push(tmpArr.length + "x" + tmpArr[0]); // emit the last run
console.log(finalArray); // ["4x0", "1x2", "1x7", "2x3", "9x0"]
Another version, without the temporary array:
var arr = [0, 0, 0, 0, 2, 7, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0];
var finalArray = []; // array into which count of values will go
var tmpCount = 0;    // temporary variable to hold count
arr.forEach(function(val, index) {
  // When the value changes, emit the run that just ended.
  if (index !== 0 && val != arr[index - 1]) {
    finalArray.push(tmpCount + "x" + arr[index - 1]);
    tmpCount = 0;
  }
  tmpCount++;
  if (index == arr.length - 1) {
    // The final run ends at the current element, so emit val itself.
    finalArray.push(tmpCount + "x" + val);
  }
});
console.log(finalArray); // ["4x0", "1x2", "1x7", "2x3", "9x0"]
Do not use RegEx. Just use regular logic. I recommend array.reduce for this job.
const arr1 = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0];
const arr2 = ['0x4','2','7','3x2','0x9'];

const compact = arr => {
  // Collect [value, count] runs. Counting per run rather than per value keeps
  // separate runs of the same value (such as the two runs of 0) distinct.
  const runs = arr.reduce((c, v) => {
    const lastRun = c[c.length - 1];
    if (lastRun && lastRun[0] === v) {
      lastRun[1]++;
    } else {
      c.push([v, 1]);
    }
    return c;
  }, []);
  return runs.map(([v, count]) => count > 1 ? `${v}x${count}` : `${v}`);
};

const expand = arr => {
  return arr.reduce((c, v) => {
    const split = v.split('x');
    const value = +split[0];
    const count = +split[1] || 1; // a bare value means a run of 1
    Array.prototype.push.apply(c, Array(count).fill(value));
    return c;
  }, []);
};

console.log(compact(arr1)); // ['0x4', '2', '7', '3x2', '0x9']
console.log(expand(arr2));
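To illustrate why the regular expression is fragile, the sketch below wraps the serializer from the question in a function (the name regexSerialize is hypothetical) and runs it on two inputs. It is a diagnosis of that one expression under these inputs, not a general statement about regex-based approaches.

```javascript
// The question's one-liner, wrapped in a hypothetical function for testing.
const regexSerialize = array =>
  array.toString().replace(
    /((?:(\d)+,?)((?:\2+,?){2,}))/g,
    (m, p1, p2) => p2 + 'x' + m.replace(/,/g, '').length
  );

// The match swallows the trailing comma, so "0x4" fuses with the next value,
// and {2,} requires at least three equal items, so the run "3,3" is skipped.
console.log(regexSerialize([0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0]));
// "0x42,7,3,3,0x9"

// (\d) captures a single digit and the count is a character count, so the
// repeated zeros inside one multi-digit number are treated as a run.
console.log(regexSerialize([1000]));
// "0x4"
```

Both effects stem from matching on characters rather than on array elements, which is why the loop-based and reduce-based versions in this thread avoid the problem.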
This is a typical reducing job. Here is your compress function done in just O(n) time.
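var arr = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0],
    // Build [value, count] pairs: bump the last pair's count while the value
    // repeats, otherwise start a new pair.
    compress = a => a.reduce(
      (r, e, i, a) => e === a[i-1] ? (r[r.length-1][1]++, r) : (r.push([e, 1]), r),
      []
    );
console.log(JSON.stringify(compress(arr))); // [[0,4],[2,1],[7,1],[3,2],[0,9]]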
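The pair format above still needs an inverse before it can be used for storage. A minimal sketch, assuming the pairs were stored as the JSON text printed above; the name decompress is hypothetical and not part of the answer.

```javascript
// Hypothetical inverse of compress: expand [value, count] pairs into a flat array.
const decompress = pairs =>
  pairs.flatMap(([value, count]) => Array(count).fill(value));

// JSON text as produced by JSON.stringify(compress(arr)) for the sample array.
const stored = "[[0,4],[2,1],[7,1],[3,2],[0,9]]";
const restored = decompress(JSON.parse(stored));
console.log(restored.length); // 17
```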
Since the motivation here is to reduce the size of the stored arrays, consider using something like gzip-js to compress the data.
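As a rough sketch of that idea, the example below uses Node's built-in zlib module rather than gzip-js (a deliberate substitution, so the calls here are zlib's, not gzip-js's). The sample data and the names pack and unpack are assumptions made for illustration; real arrays may compress differently.

```javascript
const zlib = require('zlib');

// Assumed sample data: 2500 values, 90% of them 0, the rest 4294967295.
const data = Array.from({length: 2500}, (_, i) => (i % 10 === 0 ? 4294967295 : 0));

// Hypothetical helpers: gzip the JSON text and store it as a base64 string.
const pack = arr => zlib.gzipSync(JSON.stringify(arr)).toString('base64');
const unpack = str =>
  JSON.parse(zlib.gunzipSync(Buffer.from(str, 'base64')).toString('utf8'));

const packed = pack(data);
// Compare the size of the plain JSON text with the compressed, encoded form.
console.log(JSON.stringify(data).length, packed.length);
console.log(unpack(packed).length); // 2500
```

Run-length encoding and gzip can also be combined, since the run-length string is itself ordinary text that can be passed through the same helpers.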