
How to compress a text file

Is there any way to compress the text used in this code? I would appreciate any help.

file = open("Test.txt", "r")

Sentence = (file.read())

s = Sentence.split(" ")

ListSentence = []
uniquewords = []
print(Sentence)
for x in s:
    if x in uniquewords:
        ListSentence.append(uniquewords.index(x))
    else:
        uniquewords.append(x)
        ListSentence.append(uniquewords.index(x))
print(ListSentence)

recreated = ""
for position in ListSentence:
    recreated = recreated + uniquewords[position] + " "
print(uniquewords)
print(recreated)

The question is a bit vague... If you mean data compression, you can use one of the binary transform codecs.

In [1]: import codecs

In [2]: example = 'abcdefg'*100

In [3]: compressed = codecs.encode(example.encode(), 'zlib')

In [4]: compressed
Out[4]: b'x\x9cKLJNIMKO\x1c\xa5F\xa9\xa1F\x01\x00m\x8e\x11\x80'

In [5]: decompressed = codecs.decode(compressed, 'zlib')

In [6]: decompressed
Out[6]: b'abcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefgabcdefg'

See the codecs documentation; near the bottom it lists the built-in codecs provided for binary transforms.
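The same round trip can also be sketched with the standard-library `zlib` module directly, which may be easier to reuse on text read from a file. This is only a minimal sketch: the helper names `compress_text` and `decompress_text` are hypothetical, and UTF-8 is an assumed encoding.

```python
import zlib


def compress_text(text: str, encoding: str = "utf-8") -> bytes:
    """Encode text to bytes and compress it with zlib."""
    return zlib.compress(text.encode(encoding))


def decompress_text(blob: bytes, encoding: str = "utf-8") -> str:
    """Reverse compress_text: decompress, then decode back to str."""
    return zlib.decompress(blob).decode(encoding)


# Repetitive sample text (an assumption for illustration) compresses well.
sample = "the quick brown fox jumps over the lazy dog " * 50
blob = compress_text(sample)
restored = decompress_text(blob)

print(len(sample.encode("utf-8")), "bytes before,", len(blob), "bytes after")
print(restored == sample)
```

To apply this to a file, read the text with `open("Test.txt")` and write `blob` to a file opened in binary mode (`"wb"`). Very short or non-repetitive text may not get smaller.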

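However, if by "compress" you mean reducing the number of lines of code, then, although the intent of your code is unclear, I guess you want to filter out duplicate words, possibly while preserving word order...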

Without preserving order:

' '.join(set(sentence.split()))

Preserving order:

seen = set()
words = sentence.split()
new = []
for word in words:
    if word not in seen:
        seen.add(word)
        new.append(word)
unique_ordered = ' '.join(new)
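As a shorter variant of the order-preserving loop, `dict.fromkeys` can do the same job, because dictionary keys are unique and keep insertion order in Python 3.7 and later. This is a sketch, and the function name `unique_in_order` is hypothetical.

```python
def unique_in_order(sentence: str) -> str:
    """Drop repeated words while keeping each word's first position."""
    # dict.fromkeys keeps only the first occurrence of each key, in order.
    return " ".join(dict.fromkeys(sentence.split()))


print(unique_in_order("the cat and the dog and the bird"))
# prints: the cat and dog bird
```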

It seems you are asking whether the code you have can be reduced to fewer lines. Here is my attempt:

with open("Test.txt", "r") as file:
    Sentence = file.read().split(" ")
ListSentence, uniquewords = [], []
print(Sentence)
for x in Sentence:  # Sentence is already the list of words
    if x not in uniquewords:
        uniquewords.append(x)
    ListSentence.append(uniquewords.index(x))  # you do this every loop anyway
print(ListSentence)

recreated = ""
for position in ListSentence:
    recreated += uniquewords[position] + " "
print(uniquewords)
print(recreated)
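One possible refinement, offered only as a sketch: `uniquewords.index(x)` scans the list for every word, so a dictionary that maps each word to its position avoids the repeated search. The names `encode_words` and `decode_words` and the sample text are hypothetical, not part of the original code.

```python
def encode_words(text):
    """Return (unique_words, positions) for a space-separated text."""
    index_of = {}      # word -> its position in unique_words
    unique_words = []
    positions = []
    for word in text.split(" "):
        if word not in index_of:
            index_of[word] = len(unique_words)
            unique_words.append(word)
        positions.append(index_of[word])
    return unique_words, positions


def decode_words(unique_words, positions):
    """Rebuild the text from the word list and the position list."""
    return " ".join(unique_words[p] for p in positions)


text = "ask not what your country can do for you ask what you can do for your country"
unique_words, positions = encode_words(text)
print(unique_words)
print(positions)
print(decode_words(unique_words, positions) == text)
```

Joining with `" ".join(...)` also avoids the trailing space that the string concatenation in the loop above leaves at the end of `recreated`.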

