How to find all occurrences of a substring in a numpy string array
I'm trying to find all occurrences of a substring in a numpy string array. Let's say:
myArray = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])
sub = 'utc'
It should be case-insensitive, so the method should return [1,2,3].
A vectorized approach using np.char.lower and np.char.find:
import numpy as np
myArray = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])
res = np.where(np.char.find(np.char.lower(myArray), 'utc') > -1)[0]
print(res)
Output
[1 2 3]
The idea is to use np.char.lower to make np.char.find case-insensitive, then fetch the indices of the items that contain the substring using np.where.
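To make the one-liner easier to follow, here is a small sketch that splits it into its intermediate steps. The variable names (`lowered`, `positions`, `mask`, `indices`) are only illustrative and are not part of the original answer.

```python
import numpy as np

my_array = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])

# Step 1: lower-case every element so the search ignores case.
lowered = np.char.lower(my_array)

# Step 2: np.char.find gives the position of the first match in each
# element, or -1 when the substring is absent.
positions = np.char.find(lowered, 'utc')

# Step 3: turn the positions into a boolean mask of matching elements.
mask = positions > -1

# Step 4: np.where returns a tuple of index arrays; take the first axis.
indices = np.where(mask)[0]

print(positions)  # [-1  0  0  0]
print(indices)    # [1 2 3]
```

Note that `positions` holds where the substring starts inside each string, not where the string sits in the array; the array indexes only appear after the `np.where` step.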
You can loop over the array and use if sub in string to check each item.
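import numpy as np
myArray = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])
sub = 'utc'
found = []
# enumerate yields each item's position, which is the index we want to record
# (a running match counter would only give [1, 2, 3] here by coincidence)
for idx, item in enumerate(myArray):
    if sub in item.lower():
        found.append(idx)
print(found)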
Output:
[1, 2, 3]
We can use a list comprehension to get the right indexes:
occ = [i for i in range(len(myArray)) if 'utc' in myArray[i].lower()]
Output
>>> print(occ)
[1, 2, 3]
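A possible variation on the comprehension, shown here only as a sketch: use `enumerate` instead of indexing by `range(len(...))`, and lower-case the substring as well so the caller does not have to. The helper name `find_indices` is hypothetical.

```python
import numpy as np

def find_indices(items, sub):
    """Return the indexes of the items that contain `sub`, ignoring case."""
    needle = sub.lower()  # normalise the substring once, outside the loop
    return [i for i, item in enumerate(items) if needle in item.lower()]

my_array = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])
print(find_indices(my_array, 'UTC'))  # [1, 2, 3]
```

Lower-casing both sides means `'UTC'`, `'Utc'` and `'utc'` all give the same result.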
Let's generalize this question: we will set up a function returning the indexes of the occurrences of any substring inside a numpy string array.
def get_occ_idx(sub, np_array):
    """ Occurrence indexes of a substring in a numpy string array (case-insensitive)
    """
    assert sub.islower(), f"Your substring '{sub}' must be lower case (should be: {sub.lower()})"
    # every item must be a string (numpy's np.str_ is a subclass of str)
    assert all(isinstance(x, str) for x in np_array), "All items in the array must be strings"
    # at least one item must contain the substring
    assert any(sub in x.lower() for x in np_array), f"There is no occurrence of substring: '{sub}'"
    occ = [i for i in range(len(np_array)) if sub in np_array[i].lower()]
    return occ
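As a rough vectorized counterpart to the function above, the sketch below combines it with the np.char approach from the first answer. The name `find_substring_indices` is hypothetical, and unlike the function above it returns an empty array instead of raising when nothing matches; that is a design choice for this example, not something the original answers specify.

```python
import numpy as np

def find_substring_indices(np_array, sub):
    """Indexes of the elements of a numpy string array that contain `sub`.

    The match is case-insensitive: both the array and `sub` are lower-cased.
    """
    arr = np.asarray(np_array)
    # Only string arrays make sense here ('U' = unicode, 'S' = bytes).
    if arr.dtype.kind not in ('U', 'S'):
        raise TypeError("np_array must be a numpy string array")
    lowered = np.char.lower(arr)
    # np.char.find returns -1 for elements without a match.
    return np.where(np.char.find(lowered, sub.lower()) > -1)[0]

my_array = np.array(['Time', 'utc_sec', 'UTC_day', 'Utc_Hour'])
print(find_substring_indices(my_array, 'UTC'))  # [1 2 3]
print(find_substring_indices(my_array, 'xyz'))  # []
```

Returning an empty array for "no match" lets the caller test `len(result) == 0` rather than catch an AssertionError.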