Run Length Encoding in Python

I have a homework assignment to implement "Run Length Encoding" in Python. I wrote some code, but it prints something other than what I want: it just prints the string exactly as it was written. What I want is this: if a character appears more than once in the string, it should be printed only once, followed by the number of times it appears in the string. How can I do this?

For example:

The string: 'lelamfaf'

The result: 'l2ea2mf2'

def encode(input_string):
        count = 1
        prev = ''
        lst = []
        for character in input_string:
            if character != prev:
                if prev:
                    entry = (prev, count)
                    lst.append(entry)
                    #print lst
                count = 1
                prev = character
            else:
                count += 1
        else:
            entry = (character, count)
            lst.append(entry)
        return lst    


def decode(lst):
        q = ""
        for character, count in lst:
            q += character * count
        return q    


def main():
        s = 'emanuelshmuel'
        print(decode(encode(s)))

if __name__ == "__main__":
        main()

Three remarks:

  1. You should use the existing method str.count in the encode function.
  2. The decode function prints each character count times, not the character followed by its count.
  3. What you want from the decode(encode(string)) combination is really just an encoding function, since the starting string cannot be retrieved from the encoded result.
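To see why the original program prints the string unchanged (remark 2), here is a minimal sketch of the same round trip. It uses `itertools.groupby` in place of the hand-written loop from the question, and the names `rle_pairs` and `expand` are made up for this illustration:

```python
from itertools import groupby


def rle_pairs(text):
    # Hypothetical helper: one (character, run_length) pair per run of
    # consecutive equal characters, like the question's encode().
    return [(ch, len(list(run))) for ch, run in groupby(text)]


def expand(pairs):
    # Hypothetical helper: repeat each character run_length times,
    # like the question's decode().
    return "".join(ch * n for ch, n in pairs)


pairs = rle_pairs("lelamfaf")
print(pairs)          # every run has length 1: no character repeats back to back
print(expand(pairs))  # so expanding the pairs rebuilds the original string
```

Since expanding the pairs always rebuilds the input, printing `decode(encode(s))` can only ever show `s` itself.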

Here is working code:

def encode(input_string):
    characters = []
    result = ''
    for character in input_string:
        # End loop if all characters were counted
        if set(characters) == set(input_string):
            break
        if character not in characters:
            characters.append(character)
            count = input_string.count(character)
            result += character
            if count > 1:
                result += str(count)
    return result

def main():
    s = 'emanuelshmuel'
    print(encode(s))
    assert encode(s) == 'e3m2anu2l2sh'
    s = 'lelamfaf'
    print(encode(s))
    assert encode(s) == 'l2ea2mf2'

if __name__ == "__main__":
    main()

I came up with this quickly, so there may be room for optimization. For example, if the string is very large and there is enough memory, it would be better to track the characters already seen in a set rather than searching the list of characters itself. Still, it does the job well enough for short strings:

text = 'lelamfaf'
counts = {s:text.count(s) for s in text}

char_lst = []
for l in text:
    if l not in char_lst:
        char_lst.append(l)
        if counts[l] > 1:
            char_lst.append(str(counts[l]))

encoded_str = ''.join(char_lst)
print(encoded_str)
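The set-based lookup mentioned above could look roughly like the sketch below. It is one possible variant, not part of the original answer: the function name `encode_with_set` is made up, and `collections.Counter` is swapped in for the repeated `text.count` calls so that the counts are gathered in a single pass. Keeping the seen characters in their own set also means a digit in the input cannot be mistaken for a count that was appended earlier.

```python
from collections import Counter


def encode_with_set(text):
    # Hypothetical variant of the snippet above.
    counts = Counter(text)   # occurrences of every character, one pass
    seen = set()             # characters already written to the output
    parts = []
    for ch in text:
        if ch not in seen:   # set membership test instead of a list search
            seen.add(ch)
            parts.append(ch)
            if counts[ch] > 1:
                parts.append(str(counts[ch]))
    return "".join(parts)


print(encode_with_set("lelamfaf"))  # l2ea2mf2
```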
