Compress text to store in a MySQL database: does it have to be base64?

I got some code from here that defines a compressed text field. I need this because I'm storing too much text and my database has grown too big. The problem is that the code has no documentation and is confusing.

In particular, I have modified the code a little, here:

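import bz2

def get_prep_value(self, value):
    # Empty values (None, '') are stored unchanged.
    if not value:
        return value
    try:
        # Encode to UTF-8 bytes, then bzip2-compress them (no base64 step).
        raw = value.encode('utf-8')
        tmp = bz2.compress(raw)
    except Exception:
        return value
    # Keep the original text when compression does not save space
    # (compare byte lengths, not characters against bytes).
    if len(tmp) > len(raw):
        return value
    return tmp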

In the original code, they encode to base64 after bz2, which clearly doesn't help with size, but I was wondering if there might be another reason to do that? By the way, I'm using a MySQL back end.

I also removed lines 11-15, which didn't make sense to me. Why would you decode in here?

Base64-encoding the data guarantees that the result is safe to insert into a text-only column, at the cost of some of the compression that bzip2 provides (base64 output is about one third larger than its input). The author presumably needed to store the data in a text column. If you're using a BLOB column, you don't need the base64 step, and you keep the full compression.
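To make the trade-off concrete, here is a minimal sketch that compares the byte counts for the same text stored raw, bzip2-compressed, and bzip2-compressed then base64-encoded. The helper name stored_sizes and the sample text are made up for illustration; they are not part of the linked snippet.

```python
import base64
import bz2


def stored_sizes(text):
    """Return (raw, compressed, compressed + base64) sizes in bytes."""
    raw = text.encode("utf-8")
    packed = bz2.compress(raw)          # binary output, needs a BLOB column
    armored = base64.b64encode(packed)  # ASCII-only output, fits a text column
    return len(raw), len(packed), len(armored)


# Hypothetical sample: highly repetitive text compresses very well.
sample = "The quick brown fox jumps over the lazy dog. " * 200
raw_len, packed_len, armored_len = stored_sizes(sample)
print(raw_len, packed_len, armored_len)
```

The base64 figure is always 4 bytes for every 3 bytes of compressed input (rounded up), so the text-safe form gives back roughly a third of what compression saved.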

The linked example seems a bit roundabout given that MySQL supports zlib-based compression natively. See the MySQL documentation on encryption and compression functions, particularly COMPRESS() and UNCOMPRESS(). BLOB columns, which store binary data, will happily hold your compressed data.

The downside to this approach is that the data crosses the network uncompressed: it is compressed only after it reaches the server, and it is uncompressed on the server before being sent back to the client. This may have been the motivation behind the author's original snippet.
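For reference, the MySQL manual describes the COMPRESS() result as a 4-byte length of the uncompressed string (low byte first) followed by the zlib-compressed data, with empty strings stored as empty strings. The sketch below mirrors that layout on the client under those assumptions. The function names are hypothetical, it ignores the extra "." that MySQL appends when the compressed result ends in a space, and it has not been checked against a live server, so treat it as an illustration and not a drop-in replacement.

```python
import struct
import zlib


def mysql_style_compress(data):
    """Pack bytes in the layout documented for MySQL's COMPRESS()."""
    if not data:
        return b""  # empty strings are stored as empty strings
    # 4-byte little-endian uncompressed length, then the zlib stream.
    return struct.pack("<I", len(data)) + zlib.compress(data)


def mysql_style_uncompress(blob):
    """Reverse mysql_style_compress(); tolerates trailing bytes."""
    if not blob:
        return b""
    (expected_len,) = struct.unpack("<I", blob[:4])
    # decompressobj stops at the end of the zlib stream and leaves any
    # trailing bytes in unused_data instead of failing on them.
    data = zlib.decompressobj().decompress(blob[4:])
    if len(data) != expected_len:
        raise ValueError("length prefix does not match decompressed size")
    return data


payload = ("some long text " * 100).encode("utf-8")
blob = mysql_style_compress(payload)
print(len(payload), len(blob))
```

Compressing on the client like this keeps the uncompressed text off the network, at the cost of having to decompress in application code or through UNCOMPRESS() on read.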
