
Serializing Array of Many Duplicates

So I have a series of arrays, each of which is 2500 elements long, and I need to serialize and store all of them in very limited space.

Since I have many duplicates, I wanted to cut them down to something like the following.

[0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0]
// to
[0x4,2,7,3x2,0x9]

I wrote a couple of one-liners (utilising Lodash's _.repeat) to convert to and from this pattern; however, converting to it doesn't seem to work in most/all cases.

let serialized = array.toString().replace(/((?:(\d)+,?)((?:\2+,?){2,}))/g, (m, p1, p2) => p2 + 'x' + m.replace(/,/g, '').length);

let parsed = serialized.replace(/(\d+)x(\d+),?/g, (z, p1, p2) => _.repeat(p1 + ',', +p2)).split(',');

I don't know why it doesn't work. It may be due to some of the numbers in the array. Eyeballing it, the largest one is 4294967295, but well over 90% of the values are just 0.

What am I missing in my RegEx that's preventing it from working correctly? Is there a simpler way that I'm too blind to see?

I'm fairly confident about converting it back from the serialized state; I just need a hand getting it into that state.

Straightforward and simple serialization:

    let serialize = arr => {
      const elements = []; // distinct run values, in order
      const counts = [];   // run length for each entry in elements
      let last = undefined;
      arr.forEach(el => {
        if (el !== last) {
          // A new run starts here.
          elements.push(el);
          counts.push(1);
        } else {
          counts[counts.length - 1]++;
        }
        last = el;
      });
      return elements.map((a, i) => counts[i] > 1 ? `${a}x${counts[i]}` : a).join(",");
    };

    console.log(serialize([0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0]));

UPDATE

A pure functional serialize:

    let serialize = arr => arr
      .reduce((memo, element, i) => {
        if (element !== arr[i - 1]) {
          memo.push({count: 1, element});
        } else {
          memo[memo.length - 1].count++;
        }
        return memo;
      }, [])
      .map(({count, element}) => count > 1 ? `${count}x${element}` : element)
      .join(",");

    console.log(serialize([0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0]));

A pure functional deserialize:

    const deserialize = str => str
      .split(",")
      .map(c => c.split("x").reverse())
      .reduce((memo, [el, count = 1]) => memo.concat(Array(+count).fill(+el)), []);

    console.log(deserialize("4x0,2,7,2x3,9x0"));

To avoid using .reverse() in this logic, I'd recommend changing the serialization format from 4x0 (count first) to 0x4 (value first).
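As a rough illustration of that suggestion, here is a minimal sketch of a value-first round trip. The names serializeRuns and deserializeRuns are hypothetical (they are not from the answer above), and the format is assumed to be "value x count" with the "x count" part omitted for single elements.

```javascript
// Hypothetical helpers; format assumed to be "<value>x<count>", e.g. "0x4".
const serializeRuns = arr =>
  arr
    .reduce((runs, value) => {
      const last = runs[runs.length - 1];
      if (last && last.value === value) {
        last.count++; // still inside the current run
      } else {
        runs.push({ value, count: 1 }); // a new run starts
      }
      return runs;
    }, [])
    .map(({ value, count }) => (count > 1 ? `${value}x${count}` : `${value}`))
    .join(",");

// With the value first, destructuring works directly and no .reverse() is needed.
const deserializeRuns = str =>
  str === ""
    ? []
    : str.split(",").flatMap(token => {
        const [value, count = 1] = token.split("x");
        return Array(+count).fill(+value);
      });

const sample = [0, 0, 0, 0, 2, 7, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0];
const packed = serializeRuns(sample);
console.log(packed); // 0x4,2,7,3x2,0x9
console.log(deserializeRuns(packed));
```

Because the count defaults to 1 when a token has no "x", single values such as 2 and 7 stay as short as possible in the output.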

Try this:

    var arr = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0];
    var finalArray = [];   // array into which the counts of values will go
    var currentValue = ""; // current value for comparison
    var tmpArr = [];       // temporary array to hold the values of the current run

    arr.forEach( function( val, index ){
      if ( val != currentValue && currentValue !== "" ) {
        // The run has ended: record it as "<count>x<value>".
        finalArray.push( tmpArr.length + "x" + tmpArr[0] );
        tmpArr = [];
      }
      tmpArr.push(val);
      currentValue = val;
    });
    // Record the final run.
    finalArray.push( tmpArr.length + "x" + tmpArr[0] );

    console.log(finalArray);

Another version without a temporary array:

    var arr = [0, 0, 0, 0, 2, 7, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0];
    var finalArray = []; // array into which the counts of values will go
    var tmpCount = 0;    // temporary variable to hold the count of the current run

    arr.forEach(function(val, index) {
      if ( val != arr[ index - 1 ] && index !== 0 ) {
        // The previous run has ended: record it as "<count>x<value>".
        finalArray.push(tmpCount + "x" + arr[ index - 1 ] );
        tmpCount = 0;
      }
      tmpCount++;
      if ( index == arr.length - 1 ) {
        // Last element: record the final run, which ends at the current value.
        finalArray.push(tmpCount + "x" + arr[ index ] );
      }
    });

    console.log(finalArray);

Do not use RegEx. Just use regular logic. I recommend array.reduce for this job.

    const arr1 = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0];
    const arr2 = ['0x4','2','7','3x2','0x9'];

    const compact = arr => {
      // Collect runs as [value, count] pairs, so a value that occurs in
      // several separate runs (like 0 here) keeps one count per run.
      const runs = arr.reduce((c, v) => {
        const last = c[c.length - 1];
        if (last && last[0] === v) {
          last[1] = last[1] + 1;
        } else {
          c.push([v, 1]);
        }
        return c;
      }, []);
      return runs.map(([v, count]) => count > 1 ? `${v}x${count}` : `${v}`);
    };

    const expand = arr => {
      return arr.reduce((c, v) => {
        const split = v.split('x');
        const value = +split[0];
        const count = +split[1] || 1; // no "x" part means a single element
        Array.prototype.push.apply(c, Array(count).fill(value));
        return c;
      }, []);
    };

    console.log(compact(arr1));
    console.log(expand(arr2));

This is a typical reducing job. Here is your compress function, done in just O(n) time:

    var arr = [0,0,0,0,2,7,3,3,0,0,0,0,0,0,0,0,0],
        compress = a => a.reduce((r,e,i,a) => e === a[i-1] ? (r[r.length-1][1]++, r)
                                                           : (r.push([e,1]), r), []);

    console.log(JSON.stringify(compress(arr)));
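The compress function above returns [value, count] pairs rather than a string. A matching inverse could look something like the sketch below; the name decompress is an assumption for illustration and is not part of the answer, and compress is repeated only so the snippet runs on its own.

```javascript
// compress as given in the answer: builds [value, count] pairs in one pass.
const compress = a =>
  a.reduce((r, e, i, a) => (e === a[i - 1] ? (r[r.length - 1][1]++, r) : (r.push([e, 1]), r)), []);

// Hypothetical inverse: expand each [value, count] pair back into a run.
const decompress = pairs => pairs.flatMap(([value, count]) => Array(count).fill(value));

const arr = [0, 0, 0, 0, 2, 7, 3, 3, 0, 0, 0, 0, 0, 0, 0, 0, 0];
const pairs = compress(arr);
console.log(JSON.stringify(pairs)); // [[0,4],[2,1],[7,1],[3,2],[0,9]]
console.log(decompress(pairs));
```

Note that storing a pair for every run, including runs of length 1, costs a few extra characters once stringified compared with the "0x4,2,7" style, so which form is smaller depends on how many singletons the data contains.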

Since the motivation here is to reduce the size of the stored arrays, consider using something like gzip-js to compress the data.
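To give a feel for the general-purpose compression idea, here is a minimal sketch that uses Node's built-in zlib module in place of gzip-js (whose API is not shown here, so it is deliberately not used). The helper names packAndGzip and gunzipAndUnpack are hypothetical, and the sketch assumes every value fits in an unsigned 32-bit integer, which matches the stated maximum of 4294967295.

```javascript
const zlib = require("zlib");

// Hypothetical helper: store each value as 4 bytes, then gzip the bytes.
const packAndGzip = values => {
  const words = Uint32Array.from(values);
  return zlib.gzipSync(Buffer.from(words.buffer));
};

// Hypothetical helper: gunzip, then reinterpret the bytes as 32-bit values.
const gunzipAndUnpack = compressed => {
  // Copy into a fresh Uint8Array so the underlying buffer starts at offset 0.
  const bytes = new Uint8Array(zlib.gunzipSync(compressed));
  return Array.from(new Uint32Array(bytes.buffer));
};

// Sample data shaped like the question: 2500 entries, mostly zeros.
const sample = new Array(2500).fill(0);
sample[4] = 2;
sample[5] = 7;
sample[100] = 4294967295;

const compressed = packAndGzip(sample);
console.log("raw bytes:", sample.length * 4, "gzipped bytes:", compressed.length);
console.log("round trip ok:", gunzipAndUnpack(compressed).every((v, i) => v === sample[i]));
```

The result is binary, so if it has to be stored as text it would need an encoding such as Base64, which adds roughly a third to the size. The byte order follows the machine that wrote the data, so this sketch assumes the same platform reads it back.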
