
Compressing a string of 1's and 0's containing the same number of 1's as 0's

I have a string of 1's and 0's in which the number of 1's equals the number of 0's. I would like to compress it into a number that takes fewer bits to store. Converting between the compressed and uncompressed forms should also not require a lot of work.

For example, ordering all possible strings, numbering them, and using that number as the compressed data would be too much work.

An easy solution would be to let the compressed data be just the first n-1 characters of the string, where n is the string's length. Converting between the compressed and decompressed data would be easy, but this offers little compression: only one bit per string.
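The one-bit baseline described above could be sketched roughly as follows. This is only an illustration: the function names `compress_drop_last` and `decompress_drop_last` are made up here, and the string is assumed to be a non-empty Python `str` of `'0'` and `'1'` characters.

```python
def compress_drop_last(s: str) -> str:
    """Drop the final character; the balance property makes it redundant."""
    if not s or s.count("0") != s.count("1"):
        raise ValueError("input must be non-empty with equally many 0s and 1s")
    return s[:-1]


def decompress_drop_last(prefix: str) -> str:
    """Restore the dropped character from the imbalance in the prefix."""
    # The prefix has exactly one more of one symbol than the other,
    # so the missing last character is the symbol that is in deficit.
    missing = "1" if prefix.count("0") > prefix.count("1") else "0"
    return prefix + missing


print(decompress_drop_last(compress_drop_last("110010")))
```

The round trip works because a balanced string minus its last character is always off by exactly one, which pins down the missing symbol.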

I would like an algorithm that compresses a string with this property (same number of ones and zeros) and generalizes to strings of any even length. It should also compress more than the method described above.

Thanks for the help.

This is a combinations problem: N items taken k at a time.

Your example in the comments, length 10 taken 5 at a time, means there are only 252 unique patterns, which fit into an 8-bit value instead of a 10-bit value. See: Wikipedia, Combinations.
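To see how the saving grows with length, here is a small sketch that counts the balanced patterns and the bits needed to index them. The helper name `bits_needed` is hypothetical, and `math.comb` assumes Python 3.8 or later.

```python
import math


def bits_needed(n: int) -> int:
    """Bits required to index every balanced bit string of even length n."""
    if n < 0 or n % 2:
        raise ValueError("n must be a non-negative even number")
    count = math.comb(n, n // 2)      # number of balanced patterns
    return (count - 1).bit_length()   # bits to hold indices 0 .. count-1


for n in (4, 10, 20):
    print(n, math.comb(n, n // 2), bits_needed(n))
```

For length 10 this gives 252 patterns and 8 bits, matching the figures above. The saving grows slowly with length, since the number of balanced strings is still a large fraction of all 2^n strings.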

To expand an index in the range 0-251 back into its pattern, there are examples here:

See: Algorithm to return all combinations of k elements from n

While extracting, you can use each extracted value to set the corresponding bit position in the reconstructed value, which is O(1) time per extracted position. If the list does not run to millions of entries, you could precompute a lookup table, which makes translating the index to the decoded value much faster. That is, build a list of all possible patterns and look up the translation.
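One way to convert between a pattern and its index without enumerating every pattern is to rank in lexicographic order using binomial coefficients. The sketch below is one possible approach under that assumption, not necessarily what the linked examples do; the names `rank`, `unrank`, and `build_table` are invented for illustration.

```python
import math


def rank(s: str) -> int:
    """Index of balanced string s among all balanced strings of its length,
    in lexicographic order with '0' < '1'."""
    n, ones_left, index = len(s), s.count("1"), 0
    for i, ch in enumerate(s):
        if ch == "1":
            # Every string with '0' here and ones_left ones later sorts first.
            index += math.comb(n - i - 1, ones_left)
            ones_left -= 1
    return index


def unrank(index: int, n: int) -> str:
    """Inverse of rank: rebuild the balanced string of length n."""
    ones_left, out = n // 2, []
    for i in range(n):
        zeros_first = math.comb(n - i - 1, ones_left)  # strings with '0' here
        if index < zeros_first:
            out.append("0")
        else:
            out.append("1")
            index -= zeros_first
            ones_left -= 1
    return "".join(out)


def build_table(n: int) -> list:
    """Precomputed lookup table, practical only when comb(n, n/2) is small."""
    return [unrank(i, n) for i in range(math.comb(n, n // 2))]


table = build_table(10)
print(len(table), table[0], rank("1111100000"))
```

For length 10 the table has 252 entries, and decoding becomes a single list lookup. For longer strings, `rank` and `unrank` avoid the table at the cost of one binomial coefficient per position; those coefficients could be updated incrementally instead of recomputed.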

