
Getting indices of capital letters with regex

I am trying to get the indices of capital letters (including special ones) in a line. I found the following solution here:

[i for i, c in enumerate(s) if c.isupper()]

However, this does not work for letters like Ö, Ä and Ü.

I therefore tried:

[re.search(r'^([^A-ZÄÖÜ]*[A-ZÄÖÜ]){i}',s).span()[1] for i in range (1,y)] 

where y is the number of capital letters in s.

The second solution works if I define i myself, but inside the loop it fails with:

AttributeError: 'NoneType' object has no attribute 'span'

How can I solve this efficiently?

The problem is that s is a byte string. It just needs to be decoded to Unicode:

s=u'ÖÄÜ'       # str to unicode
[i for i, c in enumerate(s) if c.isupper()]
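
For Python 3, where string literals are already Unicode, the same idea might look like the sketch below. It assumes the input arrives as UTF-8 encoded bytes (for example, read from a file opened in binary mode); the sample text and the helper name `upper_indices` are made up for illustration.

```python
def upper_indices(text):
    """Return the positions of all uppercase characters in a str."""
    return [i for i, c in enumerate(text) if c.isupper()]


# Assumption: the data is UTF-8 encoded bytes. Iterating over bytes yields
# integers, so it has to be decoded to str before calling isupper().
raw = "Öl und Äpfel".encode("utf-8")
text = raw.decode("utf-8")

print(upper_indices(text))  # prints [0, 7]
```

Decoding first also keeps the indices in characters rather than bytes: in UTF-8, Ö occupies two bytes, so positions computed on the undecoded data would be shifted.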

Python 3: you can do this easily with isupper(); no regex is needed. Unfortunately, if you are using Python 2.7, this will involve some nasty encoding/decoding, which I am not so familiar with.

x = "HEY thats Some Lower Case ZÄÖÜ"
print([i for i, c in enumerate(x) if c.isupper()])
# prints [0, 1, 2, 10, 15, 21, 26, 27, 28, 29]
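
If a regex is still preferred, here is a minimal Python 3 sketch of two ways it could be done. A likely cause of the AttributeError in the question is that `{i}` inside a plain raw string is never replaced by the loop variable: `re` treats it as the literal characters `{i}`, so `re.search` returns None. The variable names below are illustrative only, and the character class `[A-ZÄÖÜ]` covers just the letters named in the question.

```python
import re

s = "HEY thats Some Lower Case ZÄÖÜ"

# Option 1: a single pass with finditer, taking the start of each match.
indices = [m.start() for m in re.finditer(r"[A-ZÄÖÜ]", s)]
print(indices)  # prints [0, 1, 2, 10, 15, 21, 26, 27, 28, 29]

# Option 2: the pattern from the question with i actually interpolated.
# %d is filled in per iteration; the match ends one past the i-th capital,
# hence the "- 1". The range goes up to y inclusive to reach the last one.
y = len(indices)
pattern = r"^([^A-ZÄÖÜ]*[A-ZÄÖÜ]){%d}"
via_search = [re.search(pattern % i, s).end() - 1 for i in range(1, y + 1)]
print(via_search == indices)  # prints True
```

Option 2 rescans the string from the beginning for every index, so Option 1 is the cheaper of the two. The standard `re` module has no `\p{Lu}` class for "any uppercase letter", so `isupper()` remains the simpler choice when arbitrary Unicode capitals must be covered.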
