Getting indices of capital letters with regex
I am trying to get the indices of capital letters (including non-ASCII ones) in a line. I found the following solution here:
[i for i, c in enumerate(s) if c.isupper()]
However, this does not work for letters like Ö, Ä and Ü. I therefore tried:
[re.search(r'^([^A-ZÄÖÜ]*[A-ZÄÖÜ]){i}',s).span()[1] for i in range (1,y)]
where y is the number of capital letters in s.
The second solution works if I put in a fixed value for i, but inside the loop it raises:
AttributeError: 'NoneType' object has no attribute 'span'
How can I solve this efficiently?
The problem is that s is a byte string. It just needs to be decoded to unicode:
s=u'ÖÄÜ' # str to unicode
[i for i, c in enumerate(s) if c.isupper()]
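The same idea can be sketched in Python 3 terms, where the bytes/text distinction is explicit. This is only an illustration: the sample value and the UTF-8 encoding are assumptions, not details from the original post.

```python
# Assumption: the input arrives as UTF-8 encoded bytes.
raw = "ÖÄÜ".encode("utf-8")

# Decode first so that each element is one character rather than one byte.
text = raw.decode("utf-8")

indices = [i for i, c in enumerate(text) if c.isupper()]
print(indices)  # [0, 1, 2]
```

Without the decode step, iteration runs over individual bytes, so a multi-byte letter such as Ö is never seen as a single uppercase character.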
Python 3: You can do this easily with isupper(); no regex is needed. Unfortunately, if you are using Python 2.7, this will involve some nasty encoding/decoding, which I am not so familiar with.
x = "HEY thats Some Lower Case ZÄÖÜ"
print([i for i in range(0, len(x)) if x[i].isupper() ])
# Output: [0, 1, 2, 10, 15, 21, 26, 27, 28, 29]
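If a regex is still preferred, one possible sketch uses re.finditer with an explicit character class and collects the start of each match. The likely cause of the AttributeError in the question is that {i} inside an ordinary string literal is never replaced by the loop variable, so the pattern looks for the literal text {i}, re.search finds nothing, and it returns None. The helper name capital_indices is hypothetical, and the character class only covers the letters mentioned in the question.

```python
import re

def capital_indices(s):
    """Return the index of every character matched by the explicit class."""
    # Assumption: only A-Z plus the German umlauts count as capitals here.
    return [m.start() for m in re.finditer(r"[A-ZÄÖÜ]", s)]

x = "HEY thats Some Lower Case ZÄÖÜ"
print(capital_indices(x))  # [0, 1, 2, 10, 15, 21, 26, 27, 28, 29]
```

This scans the string once, rather than re-running a growing pattern from the start for each capital letter. Unlike isupper(), it misses any uppercase letter not listed in the class.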