
Find and replace the uppercase characters

I want to find every uppercase character in a string and replace it with the same character preceded by an underscore (X becomes _X).

Example input: HeLLo Capital Letters

Expected output: _He_L_Lo _Capital _Letters

I tried:

print "saran"
value = "HeLLo Capital Letters"
for word in value:
        print word
        if word.isupper():
                char = "_"
                value = value.replace(word,char + word)

print value

and the output I got is:

_He___L___Lo _Capital ___Letters

Could someone please help me get rid of the extra underscores?

Take a look at re.sub:

>>> import re
>>> re.sub(r'([A-Z])', r'_\1', value)
'_He_L_Lo _Capital _Letters'

The issue in your example isn't that you're modifying the string whilst iterating over it. Python creates iter(value) at the start of the for loop, and later changes to value won't affect the loop: strings are immutable, so each replace builds a new string and rebinds the name, while the loop keeps iterating over the original. The problem is that value.replace replaces all occurrences in the string, and since there are 3 capital Ls, each L gets 3 underscores (value.replace('L', '_L') happens 3 times).
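Both points can be checked directly. The following is a small sketch in Python 3 syntax; the names original, seen, visited and once are illustrative and do not come from the question. It records which characters the loop actually visits while value is being rebound, and shows that a single replace call touches every occurrence.

```python
value = "HeLLo Capital Letters"
original = value

seen = []
for ch in value:              # the iterator is created once, over the original string
    seen.append(ch)
    if ch.isupper():
        # builds a new string and rebinds value; the running loop is unaffected
        value = value.replace(ch, "_" + ch)

# The loop visited exactly the characters of the original string.
visited = "".join(seen)
print(visited == original)    # True

# A single replace call prefixes every 'L', not just the current one.
once = original.replace("L", "_L")
print(once)                   # He_L_Lo Capital _Letters

# Three such calls for 'L' give three underscores per 'L'.
print(value)                  # _He___L___Lo _Capital ___Letters
```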

Just use str.join: add a _ before the character if it is uppercase, otherwise keep the character as is:

s = "HeLLo Capital Letters"

print("".join(["_" + ch if ch.isupper() else ch for ch in s]))
_He_L_Lo _Capital _Letters

You run into issues because you call replace on the whole string each time, so the repeated L's, for example, each end up with three underscores.

If you add print value,word at the start of the loop you will see what happens:

HeLLo Capital Letters H
_HeLLo Capital Letters e
_HeLLo Capital Letters L
_He_L_Lo Capital _Letters L # second L
_He__L__Lo Capital __Letters o # after replacing twice we now have double _
 ........................

Some timings against a regex (r being a compiled pattern) suggest that a list comprehension is the faster approach for this input:

In [13]: s = s * 50

In [14]: timeit "".join(["_" + ch if ch.isupper() else ch for ch in s])
10000 loops, best of 3: 98.9 µs per loop

In [15]: timeit  r.sub( r'_\1', s)
1000 loops, best of 3: 296 µs per loop
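The IPython session above does not show how r was defined. As a rough way to repeat the comparison outside IPython, here is a sketch using the standard timeit module; pattern is assumed to be equivalent to r, and the helper names with_join and with_regex are made up for this example. Absolute numbers will vary by machine and Python version.

```python
import re
import timeit

s = "HeLLo Capital Letters" * 50
pattern = re.compile(r"([A-Z])")   # assumption: stands in for `r` in the timings above

def with_join(text):
    # prefix each uppercase character with "_" in a list comprehension
    return "".join(["_" + ch if ch.isupper() else ch for ch in text])

def with_regex(text):
    # prefix each ASCII capital letter with "_" via a backreference
    return pattern.sub(r"_\1", text)

# Both approaches agree on this ASCII-only input.
same_result = with_join(s) == with_regex(s)
print(same_result)

join_time = timeit.timeit(lambda: with_join(s), number=1000)
regex_time = timeit.timeit(lambda: with_regex(s), number=1000)
print("join:  %.4f s per 1000 calls" % join_time)
print("regex: %.4f s per 1000 calls" % regex_time)
```

Note that the two are only interchangeable for ASCII text: str.isupper also treats letters such as É as uppercase, while [A-Z] does not match them.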

Look closely at what happens as your code is executed. I've added some print statements that show what's going on:

Replacing 'H' with '_H':
    _HeLLo Capital Letters

Replacing 'L' with '_L':
    _He_L_Lo Capital _Letters

Replacing 'L' with '_L':
    _He__L__Lo Capital __Letters

Replacing 'C' with '_C':
    _He__L__Lo _Capital __Letters

Replacing 'L' with '_L':
    _He___L___Lo _Capital ___Letters

You run into multiple L characters and perform the replacement L → _L on the whole string for each of them, so every L goes through:

L → _L → __L → ___L

The other solutions here apply the replacement (L → _L) at the character level instead of on the whole string; that's why they work while yours doesn't.
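The instrumented code itself is not shown in this answer. A sketch that produces a trace in the same shape might look like the following (Python 3 syntax; the steps list is an illustrative addition that keeps each intermediate string):

```python
value = "HeLLo Capital Letters"
steps = []                     # intermediate strings, one per replace call

for ch in value:               # iterates over the original string
    if ch.isupper():
        value = value.replace(ch, "_" + ch)   # hits every occurrence of ch
        steps.append(value)
        print("Replacing %r with %r:" % (ch, "_" + ch))
        print("    " + value)
        print()
```

Each pass for 'L' adds one more underscore in front of all three L's, which is why the final string has three underscores before each of them.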

The problem in your snippet is that every time an uppercase character is encountered, replace is applied to the whole string, so a letter that occurs more than once is prefixed again on each pass. Hence, instead of replacing, just build a new string.

value = "HeLLo Capital Letters"
new_value = ""
for word in value:
    #print(word)
    if word.isupper():
        char = "_"
        new_value += char + word
    else:
        new_value += word

print(new_value)

If an uppercase character is encountered, the first branch runs and the character is appended with a leading underscore; otherwise the character is appended unchanged.

print("saran")
value = "HeLLo Capital Letters"
print(''.join(['_' + x if x.isupper() else x for x in value]))

value = "HELLO Capital Letters"
result = ""                       # accumulate outside the loop; avoid shadowing str
for word in value:
    if word.isupper():
        output = "_" + word       # prefix uppercase characters
    else:
        output = word             # keep everything else unchanged
    result = result + output
print(result)

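Notice: technical posts on this site follow the CC BY-SA 4.0 license. If you repost, please cite this site's URL or the original address. For any questions, contact yoyou2525@163.com.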

 