How to find char in string and get all the indexes?
I have some simple code:
def find(str, ch):
    for ltr in str:
        if ltr == ch:
            return str.index(ltr)

find("ooottat", "o")
The function only returns the first index. If I change return to print, it prints 0 0 0. Why is this, and is there any way to get 0 1 2?
This is because str.index(ch) returns the index where ch occurs the first time, no matter which occurrence the loop is currently looking at. Try:
def find(s, ch):
    return [i for i, ltr in enumerate(s) if ltr == ch]
This will return a list of all indexes you need.
P.S. Hugh's answer shows a generator function (it makes a difference if the list of indexes can get large). This function can also be adjusted that way by changing [] to (), which turns the list comprehension into a generator expression.
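A minimal sketch of that [] to () change, assuming the same arguments as find above. The name find_lazy is only illustrative and is not from the answer. With () the function hands back a generator, so indexes are produced one at a time instead of being stored in a list.

```python
def find_lazy(s, ch):
    # Generator expression: indexes are produced on demand, not stored in a list
    return (i for i, ltr in enumerate(s) if ltr == ch)

indexes = find_lazy("ooottat", "o")
first = next(indexes)   # take only the first match
rest = list(indexes)    # consume whatever is left
print(first, rest)      # 0 [1, 2]
```

Note that a generator can only be consumed once; wrap it in list() if you need to reuse the result.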
I would go with Lev's answer, but it's worth pointing out that if you end up with more complex searches, re.finditer may be worth bearing in mind (regular expressions often cause more trouble than they are worth, but they are sometimes handy to know):
import re

test = "ooottat"
[(i.start(), i.end()) for i in re.finditer('o', test)]
# [(0, 1), (1, 2), (2, 3)]
[(i.start(), i.end()) for i in re.finditer('o+', test)]
# [(0, 3)]
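One thing to watch with re.finditer: if the character you search for has a special meaning in a regular expression (for example . or +), it has to be escaped first. A small sketch, assuming you only want the start positions; find_re is a made-up name for illustration:

```python
import re

def find_re(s, ch):
    # re.escape makes ch match literally, even if it is a regex metacharacter
    return [m.start() for m in re.finditer(re.escape(ch), s)]

print(find_re("a.b.c", "."))    # [1, 3]
print(find_re("ooottat", "o"))  # [0, 1, 2]
```

Without re.escape, the pattern "." would match every character of "a.b.c".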
Lev's answer is the one I'd use; however, here's something based on your original code:
def find(str, ch):
    for i, ltr in enumerate(str):
        if ltr == ch:
            yield i

>>> list(find("ooottat", "o"))
[0, 1, 2]
def find_offsets(haystack, needle):
    """
    Find the start of all (possibly-overlapping) instances of needle in haystack
    """
    offs = -1
    while True:
        offs = haystack.find(needle, offs+1)
        if offs == -1:
            break
        else:
            yield offs

for offs in find_offsets("ooottat", "o"):
    print(offs)
results in
0
1
2
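The "possibly-overlapping" part only shows up with a needle longer than one character. Here is a sketch of the same loop with a switch for it. The overlapping parameter is my own addition for illustration, not part of the answer above:

```python
def find_offsets(haystack, needle, overlapping=True):
    # 'overlapping' is a hypothetical extra parameter, not in the original answer.
    # Overlapping search resumes one character after the last match start;
    # non-overlapping search skips past the whole match.
    step = 1 if overlapping else max(1, len(needle))
    offs = haystack.find(needle)
    while offs != -1:
        yield offs
        offs = haystack.find(needle, offs + step)

print(list(find_offsets("aaaa", "aa")))                     # [0, 1, 2]
print(list(find_offsets("aaaa", "aa", overlapping=False)))  # [0, 2]
```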
def find_idx(str, ch):
    yield [i for i, c in enumerate(str) if c == ch]

for idx in find_idx('babak karchini is a beginner in python ', 'i'):
    print(idx)
output:
[11, 13, 15, 23, 29]
x = "abcdabcdabcd"
print(x)
l = -1
while True:
    l = x.find("a", l+1)
    if l == -1:
        break
    print(l)
As a rule of thumb, NumPy arrays often outperform other solutions when working with POD (Plain Old Data). A string is an example of POD, and so is a character. To find all the indices of a single char in a string, NumPy ndarrays may be the fastest way:
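import numpy as np

def find1(str, ch):
    # 0.100 seconds for 1MB str
    # np.frombuffer needs a bytes-like object: in Python 3, pass bytes
    # (for example s.encode("ascii")) rather than a text str
    npbuf = np.frombuffer(str, dtype=np.uint8)  # Reinterpret str as a char buffer
    return np.where(npbuf == ord(ch))           # Find indices with numpy (returns a tuple of arrays)

def find2(str, ch):
    # 0.920 seconds for 1MB str
    return [i for i, c in enumerate(str) if c == ch]  # Find indices with python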
Get all the positions in just one line:
word = 'Hello'
to_find = 'l'
# in one line
print([i for i, x in enumerate(word) if x == to_find])
Using pandas we can do this and return a dict with all indices. Simple version:
import pandas as pd

l = list("ooottat")  # assumed input (not given in the answer): a list of characters

d = (pd.Series(l)
     .reset_index()
     .groupby(0)['index']
     .apply(list)
     .to_dict())
But we can build in conditions too, e.g. keep a character only if it occurs two or more times:
d = (pd.Series(l)
     .reset_index()
     .groupby(0)['index']
     .apply(lambda x: list(x) if len(list(x)) > 1 else None)
     .dropna()
     .to_dict())
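For comparison, the same kind of character-to-indices dict can be built with only the standard library. This is a rough sketch, not the answer's method; index_map and min_count are names I made up for illustration:

```python
from collections import defaultdict

def index_map(s, min_count=1):
    # Map each character to the list of positions where it occurs
    positions = defaultdict(list)
    for i, c in enumerate(s):
        positions[c].append(i)
    # Keep only characters that occur at least min_count times
    return {c: idx for c, idx in positions.items() if len(idx) >= min_count}

print(index_map("ooottat"))               # {'o': [0, 1, 2], 't': [3, 4, 6], 'a': [5]}
print(index_map("ooottat", min_count=2))  # {'o': [0, 1, 2], 't': [3, 4, 6]}
```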
This is a slightly modified version of Mark Ransom's answer that works if ch could be more than one character in length.
def find(term, ch):
    """Find all places with ch in term
    """
    for i in range(len(term)):
        # Compare a slice of len(ch) characters so multi-character ch works
        if term[i:i + len(ch)] == ch:
            yield i
All the other answers have two main flaws:
def findall(haystack, needle):
    idx = -1
    while True:
        idx = haystack.find(needle, idx+1)
        if idx == -1:
            break
        yield idx
This iterates through haystack looking for needle, always resuming one position past the previous match. It uses the builtin str.find, which is much faster than iterating through haystack character-by-character. It doesn't require any new imports.
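If you want to check the speed claim on your own machine, something like this rough timeit sketch should do. The test string and the find_enum helper are made up for illustration, and the numbers will vary with the machine and with how often needle occurs (with very dense matches the gap narrows):

```python
import timeit

def findall(haystack, needle):
    # str.find does the scanning in C and jumps straight to the next match
    idx = haystack.find(needle)
    while idx != -1:
        yield idx
        idx = haystack.find(needle, idx + 1)

def find_enum(haystack, ch):
    # Character-by-character scan in Python, for comparison
    return [i for i, c in enumerate(haystack) if c == ch]

# Made-up test data: 100,001 characters with a single match at the very end
text = "abcd" * 25_000 + "x"
assert list(findall(text, "x")) == find_enum(text, "x")

print("str.find :", timeit.timeit(lambda: list(findall(text, "x")), number=20))
print("enumerate:", timeit.timeit(lambda: find_enum(text, "x"), number=20))
```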
To embellish the five-star one-liner posted by @Lev and @Darkstar:
word = 'Hello'
to_find = 'l'
print(", ".join([str(i) for i, x in enumerate(word) if x == to_find]))
This just makes the separation of index numbers more obvious. Result will be: 2, 3
You could try this:
def find(ch, string1):
    pos = []  # collect matching indexes here
    for i in range(len(string1)):
        if ch == string1[i]:
            pos.append(i)
    return pos