How to replace a list of words with a string and keep the formatting in Python?

I have a list containing the lines of a file.

list1[0]="this is the first line"
list1[1]="this is the second line"

I also have a string.

example="TTTTTTTaaaaaaaaaabcccddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffff"

I want to replace list1[0] with the string (example). However, I want to keep the word lengths. For example, the new list1[0] should be "TTTT TT TTa aaaaa aaaa". The only solution I could come up with was to turn the string example into a list and use a for loop to read letter by letter from the string list into the original list.

for line in open(input, 'r'):
        list1[i] = listString[i]
        i=i+1

However, from what I understand this does not work because Python strings are immutable? What's a good way for a beginner to approach this problem?

I'd probably do something like:

orig = "this is the first line"
repl = "TTTTTTTaaaaaaaaaabcccddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffff"

def replace(orig, repl):
    r = iter(repl)
    result = ''.join([' ' if ch.isspace() else next(r) for ch in orig])
    return result

If repl could run out before orig does (that is, repl has fewer characters than orig has non-space characters), consider r = itertools.cycle(repl), which requires import itertools.

This works by creating an iterator out of the replacement string, then iterating over the original string, keeping the spaces but using the next character from the replacement string in place of each non-space character.
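As a minimal sketch of the itertools.cycle variant mentioned above: the function name replace_cycled and the empty-string check are my own additions for illustration, not part of the original answer. It also keeps each whitespace character as it was (a tab stays a tab) rather than turning it into a space.

```python
import itertools

def replace_cycled(orig, repl):
    """Replace each non-whitespace character of orig with characters from
    repl, wrapping around to the start of repl when it runs out."""
    if not repl:
        # cycle('') yields nothing, so there would be no characters to use.
        raise ValueError("repl must not be empty")
    r = itertools.cycle(repl)
    # Keep whitespace unchanged; otherwise take the next replacement character.
    return ''.join(ch if ch.isspace() else next(r) for ch in orig)

print(replace_cycled("this is the first line", "abc"))
# abca bc abc abcab cabc
```

Because the iterator never runs dry, the word lengths of orig are preserved no matter how short repl is.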

The other approach you could take would be to note the indexes of the spaces in one pass through orig, then insert them at those indexes in a pass over repl and return a slice of the result:

def replace(orig, repl):
    spaces = [idx for idx,ch in enumerate(orig) if ch.isspace()]
    repl = list(repl)
    for idx in spaces:
        repl.insert(idx, " ")
        # add a space before that index
    return ''.join(repl[:len(orig)])

However, I can't imagine the second approach being any faster, it is certain to be less memory-efficient, and I don't find it easier to read (in fact I find it harder to read!). It also doesn't have a simple workaround if repl is shorter than orig (I guess you could do repl *= 2, but that's uglier than sin and still doesn't guarantee it'll work).
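If you want to check those claims yourself, a rough sketch like the following could help. The names replace_iter and replace_insert are just labels I've given the two versions so they can sit side by side, and the timings will vary by machine and Python version, so treat any numbers as indicative only.

```python
import timeit

def replace_iter(orig, repl):
    # First approach: walk orig, pulling replacement characters from an iterator.
    r = iter(repl)
    return ''.join([' ' if ch.isspace() else next(r) for ch in orig])

def replace_insert(orig, repl):
    # Second approach: record where the spaces are, insert them into repl, slice.
    spaces = [idx for idx, ch in enumerate(orig) if ch.isspace()]
    chars = list(repl)
    for idx in spaces:
        chars.insert(idx, ' ')
    return ''.join(chars[:len(orig)])

orig = "this is the first line"
repl = "TTTTTTTaaaaaaaaaabcccddeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeefffff"

# Both versions should agree when repl is long enough.
assert replace_iter(orig, repl) == replace_insert(orig, repl)

for fn in (replace_iter, replace_insert):
    seconds = timeit.timeit(lambda: fn(orig, repl), number=10_000)
    print(f"{fn.__name__}: {seconds:.4f} s for 10,000 calls")
```

Note that the two only agree while repl has enough characters: with a short repl the first version raises StopIteration, while the second quietly returns a string with the wrong shape, because list.insert appends when the index is past the end.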
