
Compressing a string of 1's and 0's containing the same number of 1's as 0's

I have a string of 1's and 0's in which the number of 1's and 0's is the same. I would like to compress this into a number that needs fewer bits to store. Also, converting between the compressed and uncompressed forms should not require a lot of work.

For example, ordering all possible strings and numbering them off and letting this number be the compressed data would be too much work.

An easy solution would be to allow the compressed data to be just the first n-1 characters of the string where the string is of length n. Converting between the compressed and decompressed data would be easy but this offers little compression, only one bit per string.

I would like an algorithm that would compress a string with this property (same number of ones and zeros) that can be generalized to a string with any even length. I would also like it to compress more than the method described above.

Thanks for help.

This is a combination problem, N items taken k at a time.

Taking the example from your comment, length 10 taken 5 at a time: there are only C(10, 5) = 252 unique patterns, so an index into them fits into an 8-bit value (252 ≤ 256) instead of a 10-bit value. SEE: WIKI: Combinations
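As a rough illustration of the counting argument, here is a minimal Python sketch that computes how many bits an index needs for a given even length. The function name `bits_needed` is made up for this example, and `math.comb` assumes Python 3.8 or later.

```python
import math

def bits_needed(n: int) -> int:
    """Bits needed to index every length-n string with exactly n/2 ones."""
    count = math.comb(n, n // 2)       # number of balanced strings of length n
    return (count - 1).bit_length()    # indices run from 0 to count - 1

for n in (10, 20, 64):
    print(n, math.comb(n, n // 2), bits_needed(n))
```

For length 10 this gives 252 patterns and 8 bits, matching the figures above. The number of bits saved grows slowly with the length, since C(n, n/2) is only a factor of roughly sqrt(n) smaller than 2^n.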

To expand an index in the range 0-251 back into the original pattern, there are examples here:

SEE: Algorithm to return all combinations of k elements from n

While extracting, you can use each extracted element to set the corresponding bit position in the reconstructed value, which is O(1) time per bit. If the list is not in the millions or more, you could pre-compute a lookup table, which translates an index to its decoded value much faster. I.e., build a list of all possible patterns and look up the translation.
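One possible way to convert between a pattern and its index without enumerating every combination is to rank strings in lexicographic order using binomial coefficients. The sketch below is an illustration under that assumption, not the method from the linked post; the names `rank` and `unrank` are invented here, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def rank(bits: str) -> int:
    """Index of `bits` among all strings with the same length and number of 1s,
    taken in lexicographic order."""
    n = len(bits)
    ones_left = bits.count("1")
    index = 0
    for i, ch in enumerate(bits):
        remaining = n - i - 1
        if ch == "1":
            # Every string that has '0' here (keeping ones_left ones for the
            # remaining positions) sorts before this one.
            index += comb(remaining, ones_left)
            ones_left -= 1
    return index

def unrank(index: int, n: int, k: int) -> str:
    """Inverse of rank: the length-n string with k ones at position `index`."""
    if not 0 <= index < comb(n, k):
        raise ValueError("index out of range")
    out = []
    for i in range(n):
        remaining = n - i - 1
        zero_block = comb(remaining, k)  # strings that put '0' at this position
        if index < zero_block:
            out.append("0")
        else:
            out.append("1")
            index -= zero_block
            k -= 1
    return "".join(out)

print(rank("1111100000"))    # 251
print(unrank(0, 10, 5))      # 0000011111
```

Each direction takes one pass over the string with one binomial coefficient per position. For short strings, the lookup table mentioned above could be built once as `[unrank(i, n, k) for i in range(comb(n, k))]`.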


 