
Find the index of the first occurrence of a str in a given str

Google or Amazon asked the following question in an interview. Would my solution be accepted?

Problem: find the index of the first occurrence of a given word in a given string.

Note: the problem above is from a website, and the following code passed all the test cases. However, I am not sure whether this is the most optimal solution, and so whether it would be accepted by the big companies.

def strStr(A, B):
    if len(A) == 0 or len(B) == 0:
        return -1
    for i in range(len(A)):
        c = A[i:i+len(B)]
        if c == B:
            return i
    else:
        return -1

Python actually has a built-in method for this, which is why this question doesn't seem like a great fit for interviews in Python. Something like this would suffice:

def strStr(A, B):
  return A.find(B)

Otherwise, as commenters have mentioned, inputs/outputs and tests are important. You could add some checks that make it slightly more performant (i.e. check that B is not longer than A), but I think in general, you won't do better than O(n).
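As a rough sketch of the check mentioned above (the name `str_str_bounded` is made up for illustration, and it keeps the question's convention of returning -1 for empty inputs, which differs from `str.find`), the loop can stop at the last position where B still fits:

```python
def str_str_bounded(haystack, needle):
    """Return the index of the first occurrence of needle in haystack, or -1."""
    n, m = len(haystack), len(needle)
    # Same convention as the question's code: empty inputs give -1.
    # (str.find would return 0 for an empty needle instead.)
    if n == 0 or m == 0 or m > n:
        return -1
    # Only start positions that leave room for the whole needle can match.
    for i in range(n - m + 1):
        if haystack[i:i + m] == needle:
            return i
    return -1


print(str_str_bounded("abcdef", "bcd"))  # 1
print(str_str_bounded("abc", "abcd"))    # -1
```

This only trims the loop bounds; the slice comparison at each position still costs up to len(B) character comparisons.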

If you want to match the entire word against the words in the string, your code would not work.
E.g. if my arguments are print(strStr('world hello world', 'wor')), your code would return 0, but it should return -1.
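If whole-word matching is what the interviewer wants, one possible sketch uses a regular expression with word boundaries. The function name `whole_word_index` is hypothetical, and it assumes the word consists of ordinary word characters (letters, digits, underscore), since that is what `\b` works with:

```python
import re


def whole_word_index(text, word):
    """Index of the first occurrence of word as a whole word in text, or -1."""
    if not word:
        return -1
    # \b matches at a boundary between a word character and a non-word
    # character; re.escape keeps any special characters in word literal.
    match = re.search(r"\b" + re.escape(word) + r"\b", text)
    return match.start() if match else -1


print(whole_word_index("world hello world", "wor"))    # -1
print(whole_word_index("world hello world", "hello"))  # 6
```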

I checked your function; it works well in Python 3.6.

print(strStr('abcdef', 'bcd'))  # your function: prints 1 (indices start from 0)
print('abcdef'.find('bcd'))     # built-in str.find: also prints 1 (indices start from 0)

To get the index of the first occurrence, use index() or find().

text = 'hello i am homer simpson'

index = text.index('homer')
print(index)

index = text.find('homer')
print(index)

output:
11
11

It is always better to go for the built-in Python functions. But sometimes in interviews they will ask you to implement it yourself. The best thing to do is to start with the simplest version, then think about corner cases and improvements.

Here you have a test with your version, a slightly improved one that avoids allocating a new string at every index, and the Python built-in:

A = "aaa foo baz fooz bar aaa"
B = "bar"

def strInStr1(A, B):
    if len(A) == 0 or len(B) == 0:
        return -1
    for i in range(len(A)):
        c = A[i:i+len(B)]
        if c == B:
            return i
    else:
        return -1

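def strInStr2(A, B):
  # Guard empty inputs, as strInStr1 does; otherwise B[0] raises IndexError.
  if len(A) == 0 or len(B) == 0:
    return -1
  size = len(B)
  for i in range(len(A)):
    # Compare the first character before slicing, to avoid building
    # a new substring at every index.
    if A[i] == B[0]:
      if A[i:i+size] == B:
        return i
  return -1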


def strInStr3(A, B):
  return A.index(B)


import timeit
setup = '''from __main__ import strInStr1, strInStr2, strInStr3, A, B'''
for f in  ("strInStr1", "strInStr2", "strInStr3"):
  result =  timeit.timeit(f"{f}(A, B)", setup=setup)
  print(f"{f}: ", result)

The results speak for themselves (time in seconds):

strInStr1:  15.809420814999612
strInStr2:  7.687011377005547
strInStr3:  0.8342400040055509

Here you have the live version

There are a few algorithms that you can learn on this topic, like the Rabin-Karp algorithm, the Z algorithm, and the KMP algorithm, which all run in O(n+m) time, where n is the string length and m is the pattern length (for Rabin-Karp that is the average case; its worst case is O(n*m)). Your algorithm runs in O(n*m) time. I would suggest starting with the Rabin-Karp algorithm; I personally found it the easiest to grasp.
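To give a feel for the Rabin-Karp idea, here is a minimal sketch, not a tuned implementation. The function name `rabin_karp` and the `base` and `mod` values are arbitrary choices for illustration. It keeps a rolling hash of the current window and only compares characters when the hashes agree:

```python
def rabin_karp(text, pattern, base=256, mod=1_000_000_007):
    """Return the index of the first occurrence of pattern in text, or -1."""
    n, m = len(text), len(pattern)
    if m == 0 or m > n:
        return -1
    # Weight of the leading character in a window: base^(m-1) mod mod.
    high = pow(base, m - 1, mod)
    p_hash = w_hash = 0
    for i in range(m):
        p_hash = (p_hash * base + ord(pattern[i])) % mod
        w_hash = (w_hash * base + ord(text[i])) % mod
    for i in range(n - m + 1):
        # Hashes can collide, so confirm with a direct comparison.
        if p_hash == w_hash and text[i:i + m] == pattern:
            return i
        if i < n - m:
            # Slide the window: drop text[i], append text[i + m].
            w_hash = ((w_hash - ord(text[i]) * high) * base
                      + ord(text[i + m])) % mod
    return -1


print(rabin_karp("aaa foo baz fooz bar aaa", "bar"))  # 17
```

Updating the hash costs constant time per position, so the character-by-character comparison only runs on a hash match. Many collisions would push it back towards O(n*m).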

There are also some more advanced topics, like searching for many patterns in one string with the Aho-Corasick algorithm, which is good to read about. I think this is what grep uses when searching for multiple patterns. Hope it helps :)
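For the curious, a compact sketch of the Aho-Corasick idea could look like the following. The names `build_automaton` and `find_all` are invented for this example, and it is a teaching sketch, not what grep actually ships. It builds a trie of the patterns, adds failure links with a breadth-first pass, then scans the text once:

```python
from collections import deque


def build_automaton(patterns):
    # Per node: outgoing edges, failure link, and patterns that end here.
    goto, fail, out = [{}], [0], [[]]
    for pat in patterns:
        if not pat:
            continue  # skip empty patterns in this sketch
        node = 0
        for ch in pat:
            if ch not in goto[node]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[node][ch] = len(goto) - 1
            node = goto[node][ch]
        out[node].append(pat)
    # Breadth-first pass: depth-1 nodes keep fail = 0 (the root).
    queue = deque(goto[0].values())
    while queue:
        node = queue.popleft()
        for ch, child in goto[node].items():
            f = fail[node]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[child] = goto[f].get(ch, 0)
            # Inherit matches that end at the failure target.
            out[child] = out[child] + out[fail[child]]
            queue.append(child)
    return goto, fail, out


def find_all(text, patterns):
    """Return (start_index, pattern) for every match, in scan order."""
    goto, fail, out = build_automaton(patterns)
    node, hits = 0, []
    for i, ch in enumerate(text):
        while node and ch not in goto[node]:
            node = fail[node]
        node = goto[node].get(ch, 0)
        for pat in out[node]:
            hits.append((i - len(pat) + 1, pat))
    return hits


print(find_all("ushers", ["he", "she", "his", "hers"]))
# [(1, 'she'), (2, 'he'), (2, 'hers')]
```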
