
How to "arbitrarily" format items in a list/dict/etc. Example: change the 4th character in every string in a list

First of all, I want to mention that there might not be any real-life applications for this simple script I created, but I did it because I'm learning and I couldn't find anything similar here on SO. I wanted to know what could be done to "arbitrarily" change characters in the items of an iterable such as a list.

Sure, title() is a handy tool that I learned relatively quickly, but then I got to thinking: what if, just for kicks, I wanted to format (uppercase) the last character instead? Or the third, the middle one, etc.? What about lowercase? Replacing specific characters with others?

Like I said, this is surely not perfect, but it could give some food for thought to other noobs like myself. Plus, I think this can be modified in hundreds of ways to achieve all kinds of different formatting.

How about helping me improve what I just did? How about making it more lean and mean? Checking for style, methods, efficiency, etc.

Here it goes:

words = ['house', 'flower', 'tree']  #string list

counter = 0                          #counter to iterate over the items in list
chars = 4                            #character position in string (0,1,2...)

for counter in range (0,len(words)): 
    while counter < len(words):
        z = list(words[counter])     # z is a temp list created to slice words
        if len(z) > chars:           # to compare char position and z length
            upper = [k.upper() for k in z[chars]] # string formatting EX: uppercase
            z[chars] = upper [0]     # replace formatted character with original
            words[counter] = ("".join(z)) # convert and replace temp list back into original word str list
            counter +=1
        else:
            break

print (words)

['housE', 'flowEr', 'tree']

There are much better Pythonistas than me, but here's one attempt:

[''.join(
      [a[x].upper() if x == chars else a[x]
          for x in range(len(a))]   # range replaces Python 2's xrange
    )
    for a in words]

Also, we're talking about the programmer's 4th, right? What everyone else calls the 5th, yes?
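To make the counting explicit: Python indexes strings from zero, so index 4 is the fifth character. A quick check against the word list from the question (the name `fifth` is just for illustration):

```python
words = ['house', 'flower', 'tree']
index = 4  # zero-based, so this is the fifth character

# None marks words that are too short to have a character at this index.
fifth = [word[index] if index < len(word) else None for word in words]

print(fifth)  # ['e', 'e', None]
```

'tree' has only four characters (indexes 0 to 3), which is why it comes through unchanged in the expected output.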

I think the general case of what you're talking about is a method that, given a string and an index, returns that string with the indexed character transformed according to some rule.

def transform_string(strng, index, transform):
    lst = list(strng)
    if index < len(lst):
        lst[index] = transform(lst[index])
    return ''.join(lst)


words = ['house', 'flower', 'tree']
output = [transform_string(word, 4, str.upper) for word in words]
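As a rough sketch of how the same helper might cover the other cases raised in the question (last character, middle character, replacing a character), assuming `len(w) // 2` is an acceptable definition of "the middle":

```python
def transform_string(strng, index, transform):
    # Same helper as above, repeated so this snippet runs on its own.
    lst = list(strng)
    if index < len(lst):
        lst[index] = transform(lst[index])
    return ''.join(lst)


words = ['house', 'flower', 'tree']

# Uppercase the last character (a negative index counts from the end).
last_upper = [transform_string(w, -1, str.upper) for w in words]

# Uppercase the middle character; len(w) // 2 is one way to pick "the middle".
middle_upper = [transform_string(w, len(w) // 2, str.upper) for w in words]

# Replace the third character (index 2) with a fixed placeholder.
third_masked = [transform_string(w, 2, lambda ch: '*') for w in words]

print(last_upper)    # ['housE', 'floweR', 'treE']
print(middle_upper)  # ['hoUse', 'floWer', 'trEe']
print(third_masked)  # ['ho*se', 'fl*wer', 'tr*e']
```

One caveat: a negative index always passes the `index < len(lst)` guard, so an empty string with index -1 would raise an IndexError. A stricter bounds check would be needed if that matters.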

To make it even more abstract, you could have a factory that returns a function, like so:

def transformation_factory(index, transform):
    def inner(word):
        lst = list(word)
        if index < len(lst):
            lst[index] = transform(lst[index])
        return ''.join(lst)  # inner must return the rebuilt string
    return inner

transform = transformation_factory(4, lambda x: x.upper())
output = list(map(transform, words))  # list() so the result is a list on Python 3 as well
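If writing the factory by hand feels like too much ceremony, the standard library's `functools.partial` can play a similar role. This is only a sketch of that alternative, not what the answer above uses; the name `upper_fifth` is made up for the example:

```python
from functools import partial


def transform_string(strng, index, transform):
    # Helper from the previous snippet, repeated so this runs on its own.
    lst = list(strng)
    if index < len(lst):
        lst[index] = transform(lst[index])
    return ''.join(lst)


words = ['house', 'flower', 'tree']

# partial() fixes index and transform, leaving only the word to be supplied.
upper_fifth = partial(transform_string, index=4, transform=str.upper)

# list() is needed on Python 3, where map() returns a lazy iterator.
output = list(map(upper_fifth, words))
print(output)  # ['housE', 'flowEr', 'tree']
```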

This is somewhat of a combination of both (so +1 to both of them :) ). The main function accepts a list, an arbitrary function and the index of the character to act on:

In [47]: def RandomAlter(l, func, char):
   ....:     return [''.join([func(w[x]) if x == char else w[x] for x in range(len(w))]) for w in l]
   ....:

In [48]: RandomAlter(words, str.upper, 4)
Out[48]: ['housE', 'flowEr', 'tree']

In [49]: RandomAlter([str.upper(w) for w in words], str.lower, 2)
Out[49]: ['HOuSE', 'FLoWER', 'TReE']

In [50]: RandomAlter(words, lambda x: '_', 4)
Out[50]: ['hous_', 'flow_r', 'tree']

The function RandomAlter can be rewritten like this, which may make it a bit clearer (the one-line version above takes advantage of a feature called list comprehensions to reduce the lines of code needed).

def RandomAlter(l, func, char):
    # For each word in our list
    main_list = []
    for w in l:
        # Create a container that is going to hold our new 'word'
        new_word = []
        # Iterate over a range that is equal to the number of chars in the word
        # (range is lazy in Python 3; Python 2 code would use xrange here)
        for x in range(len(w)):
            # If the current position is the character we want to modify
            if x == char:
                # Apply the function to the character and append to our 'word'
                # This is a cool Python feature - you can pass around functions
                # just like any other variable
                new_word.append(func(w[x]))
            else:
                # Just append the normal letter
                new_word.append(w[x])

        # Now we append the 'word' to our main_list. However since the 'word' is
        # a list of letters, we need to 'join' them together to form a string
        main_list.append(''.join(new_word))

    # Now just return the main_list, which will be a list of altered words
    return main_list

Some comments on your code:

for counter in range (0,len(words)):     
while counter < len(words):

This won't compile unless you indent the while loop under the for loop. And if you do that, the inner loop will completely screw up the loop counter for the outer loop. Finally, you almost never want to maintain an explicit loop counter in Python. You probably want this:

for counter, word in enumerate(words):
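For anyone who hasn't met `enumerate` yet, here is a small sketch of what it yields and how the index can be used to write back into the list (the `capitalize()` call is just a stand-in transformation):

```python
words = ['house', 'flower', 'tree']

# enumerate() yields (index, item) pairs, so no manual counter is needed.
pairs = list(enumerate(words))
print(pairs)  # [(0, 'house'), (1, 'flower'), (2, 'tree')]

# Assigning by index replaces the item in the original list.
for counter, word in enumerate(words):
    words[counter] = word.capitalize()

print(words)  # ['House', 'Flower', 'Tree']
```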

Next:

z = list(words[counter])     # z is a temp list created to slice words

You can already slice strings, in exactly the same way you slice lists, so this is unnecessary.
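A minimal sketch of the slicing approach: because strings are immutable, the changed word is assembled from three slices rather than edited in place.

```python
word = 'flower'
chars = 4

# Strings support the same slice syntax as lists.
head = word[:chars]       # 'flow'
target = word[chars]      # 'e'
tail = word[chars + 1:]   # 'r'

# Strings are immutable, so build a new string from the pieces.
if len(word) > chars:
    word = head + target.upper() + tail

print(word)  # 'flowEr'
```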

Next:

    upper = [k.upper() for k in z[chars]] # string formatting EX: uppercase

This is a bad name for the variable, since there's a method with the exact same name, which you're calling on the same line.

Meanwhile, the way you defined things, z[chars] is a single character, a copy of words[counter][4]. You can iterate over a single character in Python, because each character is itself a string, but it's generally pointless: [k.upper() for k in z[chars]] is the same thing as [z[chars].upper()].

    z[chars] = upper [0]     # replace formatted character with original

So you only wanted the list of 1 character to get the first character out of it… why make it a list in the first place? Just replace the last two lines with z[chars] = z[chars].upper().
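A quick way to convince yourself that the two forms are equivalent (the names `via_list` and `direct` are only for this demonstration):

```python
z = list('house')
chars = 4

# Iterating over z[chars] walks a one-character string, giving a one-item list.
via_list = [k.upper() for k in z[chars]]
direct = z[chars].upper()

print(via_list)  # ['E']
print(direct)    # 'E'

# The direct form can be assigned straight back into the list.
z[chars] = direct
print(''.join(z))  # 'housE'
```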

else:
    break

This is going to stop at the first string that is too short to have a character at index 4, rather than just skipping such strings, which is what it seems like you want. The way to say that is continue, not break. Or, better, just fall off the end of the loop body. In some cases it's hard to write things without a continue, but in this case it's easy: it's already at the end of the loop, and in fact it's inside an else: that has nothing else in it, so just remove both lines.
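To see the difference, here is a small sketch using a single for loop. The function `upper_at` and its `on_short` flag are hypothetical, invented only to run both behaviours side by side; note the short word is placed in the middle of the list on purpose.

```python
def upper_at(words, chars, on_short):
    """Uppercase index `chars` in each word; `on_short` is 'break' or 'skip'."""
    result = list(words)  # work on a copy so the input list is left alone
    for counter, word in enumerate(result):
        if len(word) > chars:
            result[counter] = word[:chars] + word[chars].upper() + word[chars + 1:]
        elif on_short == 'break':
            break  # stops at the first short word; later words are never reached
        # otherwise: nothing to do, the loop simply moves on to the next word
    return result


words = ['house', 'tree', 'flower']
print(upper_at(words, 4, 'break'))  # ['housE', 'tree', 'flower']
print(upper_at(words, 4, 'skip'))   # ['housE', 'tree', 'flowEr']
```

With break, 'flower' is never processed because 'tree' ends the loop early.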

It's hard to tell with upper that your loops are wrong, because if you accidentally call upper twice, it looks the same as if you called it once. Change the upper to chr(ord(k)+1), which replaces any letter with the next letter. Then try it with:

words = ['house', 'flower', 'tree', 'a', 'abcdefgh']

You'll notice that, e.g., you get 'flowgr' instead of 'flowfr'.
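For reference, this is roughly what that experiment looks like: the nested loops from the question, with the uppercase step swapped for "next letter" and the one-item list simplified away.

```python
words = ['house', 'flower', 'tree', 'a', 'abcdefgh']
chars = 4

# The nested loops from the question, with upper() swapped for "next letter".
for counter in range(0, len(words)):
    while counter < len(words):
        z = list(words[counter])
        if len(z) > chars:
            z[chars] = chr(ord(z[chars]) + 1)
            words[counter] = "".join(z)
            counter += 1
        else:
            break

print(words)  # ['housf', 'flowgr', 'tree', 'a', 'abcdffgh']
```

'flower' is visited twice (once in the pass starting at index 0 and again in the pass starting at index 1), so its 'e' is bumped to 'f' and then to 'g'.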

You may also want to add a variable that counts the number of times you run through the inner loop. It should only be len(words) times, but with no short words it's actually len(words) * (len(words) + 1) / 2, because the pass that starts at index i runs through every remaining word; short words cut some of those passes short. You're making the computer do a whole lot of extra work: if you have 1000 words, it has to do about 500,000 loop iterations instead of 1000. In technical terms, your algorithm is O(N^2), even though it only needs to be O(N).
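A sketch of that counting experiment, keeping only the control flow of the question's loops. The helper `count_inner_passes` is hypothetical. For a list of N words with no short ones, the exact count comes out to N(N+1)/2 rather than a full N*N, but it still grows quadratically.

```python
def count_inner_passes(words, chars):
    """Run the question's nested loops on a copy and count inner-loop passes."""
    words = list(words)
    passes = 0
    for counter in range(len(words)):
        while counter < len(words):
            passes += 1
            if len(words[counter]) > chars:
                counter += 1  # the real work is omitted; only the control flow matters
            else:
                break
    return passes


counts = {n: count_inner_passes(['abcdefgh'] * n, 4) for n in (10, 100, 1000)}
print(counts)  # {10: 55, 100: 5050, 1000: 500500}
```

A single for loop over the same list would make exactly N passes.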

Putting it all together:

words = ['house', 'flower', 'tree', 'a', 'abcdefgh']  #string list
chars = 4                            #character position in string (0,1,2...)

for counter, word in enumerate(words): 
    if len(word) > chars:           # only words long enough to have this index
        z = list(word)
        z[chars] = chr(ord(z[chars]) + 1)  # replace character with next character
        words[counter] = "".join(z)    # convert temp list back to a str and store it in the list

print (words)

That does the same thing as your original code (except using "next character" instead of "uppercase character"), without the bugs, with much less work for the computer, and much easier to read.
