
How do I figure this out?

How do I get this program to compress a file into a list of unique words and a list of positions from which the original file can be recreated? It should then take the compressed file and recreate the full text of the original file, including punctuation and capitalisation.

startsentence = input("Please enter a sentence: ")
sentence = (startsentence)
a = startsentence.split(" ")
dict = dict()
number = 1
positions = []
for j in a:
    if j not in dict:
        dict[j] = str(number)
        number = number + 1
    positions.append(dict[j])
print (positions)


print(positions)
f = open("positions.txt", "w")
f.write( str(positions) + "\n"  )
f.close()

print(sentence)
f = open("words.txt", "w") 
f.write( str(startsentence) + "\n"  ) 
f.close() 

Currently you are writing out the whole startsentence and not just the unique words:

f = open("words.txt", "w") 
f.write( str(startsentence) + "\n"  ) 
f.close()

You need to write only the unique words, ordered by their index. You have already created a dictionary, dict, that maps each word to its index. (BTW, you really shouldn't use dict as a variable name because it shadows the built-in type, so I will use dct.) You just need to write the words out sorted by their value, using a with statement so the file is closed automatically:

with open("words.txt", "w") as f:
    f.write(' '.join(sorted(dct, key=dct.get)) + '\n')
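The positions file probably needs the same treatment, because str(positions) writes the Python list representation (brackets, quotes and commas), which is awkward to parse back. A minimal sketch, assuming 0-based indices and space-separated output; the function name write_compressed and its path parameters are made up for illustration and are not part of the original program:

```python
import os
import tempfile


def write_compressed(sentence, words_path, positions_path):
    """Write unique words and their 0-based positions as space-separated text."""
    dct = {}        # word -> index of its first appearance
    positions = []  # one index per word of the sentence, in order
    for word in sentence.split():
        if word not in dct:
            dct[word] = len(dct)
        positions.append(dct[word])

    with open(words_path, "w") as wf:
        wf.write(" ".join(sorted(dct, key=dct.get)) + "\n")
    with open(positions_path, "w") as pf:
        pf.write(" ".join(str(p) for p in positions) + "\n")
    return dct, positions


# Demo in a temporary directory so no files are left behind.
with tempfile.TemporaryDirectory() as tmp:
    words_file = os.path.join(tmp, "words.txt")
    positions_file = os.path.join(tmp, "positions.txt")
    write_compressed("the cat saw the other cat", words_file, positions_file)
    with open(words_file) as wf, open(positions_file) as pf:
        print(wf.read().strip())  # the unique words
        print(pf.read().strip())  # the indices into that word list
```

Writing plain integers separated by spaces means the file can be read back with a simple split() and int().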

Assuming the positions are stored as whitespace-separated integers (BTW: it is much easier to start from 0 than from 1) alongside the list of words, restoration is simple:

with open('positions.txt') as pf, open('words.txt') as wf:
    positions = [int(p) for p in pf.read().split()]
    words = wf.read().strip().split()

recovered = ' '.join(words[p] for p in positions) # p-1 if you start from 1
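To check that the scheme really preserves punctuation and capitalisation, the round trip can be tried in memory first. This is only a sketch under stated assumptions: the helper names compress and decompress are hypothetical, indices start from 0, and words are separated by single spaces, so punctuation stays attached to its word and "Ask" and "ask" count as different words:

```python
def compress(text):
    """Return (unique words in first-seen order, 0-based positions)."""
    words = []      # unique words, in order of first appearance
    index = {}      # word -> its position in `words`
    positions = []
    for word in text.split(" "):
        if word not in index:
            index[word] = len(words)
            words.append(word)
        positions.append(index[word])
    return words, positions


def decompress(words, positions):
    """Rebuild the text by looking up each position in the word list."""
    return " ".join(words[p] for p in positions)


original = ("Ask not what your country can do for you, "
            "ask what you can do for your country.")
words, positions = compress(original)
restored = decompress(words, positions)
assert restored == original  # punctuation and capitalisation survive
print(len(positions), "words stored as", len(words), "unique entries")
```

Note that line breaks and runs of several spaces would need extra handling once the word list is stored in a space-separated file, since reading it back with split() discards that whitespace.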
