
How can I shorten text in a lossless way

I have a text I want to put in my URL, for example:

Put all speaking her delicate recurred possible. Set indulgence inquietude discretion insensible bed why announcing. Middleton fat two satisfied additions. So continued he or commanded household smallness delivered. Door poor on do walk in half. Roof his head the what.

But I want it to be shorter, for example a string like this:

kdghdsvvw564645b7573b4657435

How can I do it?

The answer depends on whether you want to recover the original text from the URL string.

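If you want to recover it, first compress the text with a lossless method such as the zlib library, or smaz, which was suggested for short text. Then convert the compressed binary output to a URL-safe format, for example Base64 with the URL-safe alphabet (standard Base64 uses +, / and =, which have to be percent-encoded in a URL). Base64 makes the data about a third longer, so the final string may be shorter or longer than the original depending on how compressible your text is.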
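As a rough sketch of the compress-then-encode route in Python, assuming zlib is acceptable as the compressor (the function names `shorten` and `restore` are made up for this example):

```python
import base64
import zlib


def shorten(text: str) -> str:
    """Compress text with zlib, then encode the bytes with URL-safe Base64."""
    compressed = zlib.compress(text.encode("utf-8"), 9)  # 9 = best compression
    return base64.urlsafe_b64encode(compressed).decode("ascii")


def restore(token: str) -> str:
    """Reverse of shorten(): decode the Base64 token, then decompress."""
    compressed = base64.urlsafe_b64decode(token.encode("ascii"))
    return zlib.decompress(compressed).decode("utf-8")


if __name__ == "__main__":
    original = "Put all speaking her delicate recurred possible."
    token = shorten(original)
    print(len(original), len(token), token)
    print(restore(token) == original)
```

For short inputs the token can easily come out longer than the original, so it is worth comparing the two lengths before committing to this approach.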

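If you don't need to recover the original text, simply hash it, for example with sha1sum, and use the hash in your URL string. The output has a fixed length (40 hex characters for SHA-1), and two different inputs are extremely unlikely to produce the same hash, although collisions are possible in principle. Here is an example of both approaches: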

~$ cat junk
Put all speaking her delicate recurred possible. Set indulgence inquietude discretion insensible bed why announcing. Middleton fat two satisfied additions. So continued he or commanded household smallness delivered. Door poor on do walk in half. Roof his head the what.
~$ sha1sum junk
e2acae1ae295de73541cd321da268a8d2d48ca7b  junk
~$ gzip junk
~$ base64 junk.gz
H4sICFtXTF8AA2p1bmsAHU9LbgMxCN33FO8Ec4puK1XtCYhhYhQPTA3uqLcvyQbB0/vxuRI0BuIU
eqjd0WWCZWijFExpa05hnB6htyEbviWhxmvcxZrU+rNUcrGANdqUVLdCQ+wlwK3UV/8DmfmyVhEb
PpR5SBZxp0RejqDU2LW4xKxPj6goR3NLtVV4F/is+zjI+Hn7Cuk+GHHUAyYRr96/Un03vHuxz+eo
FHZcNB5VC53GvuHLfUfXKFdiZHlfnXJ7+wdZzrQRDgEAAA==

Then use base64 -d followed by gunzip to recover the original text from the URL string.
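The hashing route can be sketched in Python as well. This is only an illustration: `hash_token` and its `length` parameter are hypothetical names, and truncating the digest makes collisions more likely. Note that sha1sum on a file also hashes the trailing newline, so the result only matches the command line if the input bytes are identical.

```python
import hashlib


def hash_token(text: str, length: int = 40) -> str:
    """Return the first `length` hex characters of the SHA-1 digest of text.

    The original text cannot be recovered from the result; to map a token
    back to its text you would have to store the pair somewhere yourself.
    """
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()  # 40 hex chars
    return digest[:length]


if __name__ == "__main__":
    print(hash_token("Put all speaking her delicate recurred possible."))
    print(hash_token("Put all speaking her delicate recurred possible.", 12))
```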

You can try smaz for the compression of short strings. You would need to interface to C code or reimplement the algorithm.

Besides implementing or using a compression algorithm, you also need to make sure the resulting characters are safe for URLs. Most compression algorithms output binary data, which is not suitable for a URL as-is, so after compressing you need a second step that converts the compressed data to a URL-safe string. Some JavaScript compression libraries, such as lz-string, provide convenience functions that compress and decompress directly to and from URI-safe text.
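To illustrate the binary-to-text step on its own, here is a small Python sketch that turns arbitrary bytes into a token using only letters, digits, - and _. It assumes URL-safe Base64 with the = padding stripped, which is one common choice and not the only one; the helper names are invented for this example.

```python
import base64


def bytes_to_url_token(data: bytes) -> str:
    """Encode bytes as URL-safe Base64 and drop the trailing '=' padding."""
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def url_token_to_bytes(token: str) -> bytes:
    """Restore the padding that was stripped, then decode back to bytes."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


if __name__ == "__main__":
    raw = b"\xfb\xff"
    print(base64.b64encode(raw).decode("ascii"))  # standard alphabet
    print(bytes_to_url_token(raw))                # URL-safe, no padding
    print(url_token_to_bytes(bytes_to_url_token(raw)) == raw)
```

The output of any compressor, such as zlib, can be passed through these helpers before it goes into a URL.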
