
Find the occurrence of any one of the substrings (whichever comes first) stored in a list, in a bigger string in Python

I'm new to Python. I've gone through other answers, and I can say with some assurance that this is probably not a duplicate.

Basically, say I want to find the occurrence of one of the substrings stored in a list, and as soon as one is found, stop searching for the other substrings in the list.

To illustrate more clearly:

a = ['This', 'containing', 'many']
string1 = "This is a string containing many words"

If you ask yourself which word in the bigger string string1 is the first to match a word in the list a, the answer is This, because This is the first word in string1 that has a match in the list of substrings a.

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

Now I've changed string1 a bit. If you ask the same question, the answer is containing, because containing is the first word appearing in the bigger string string1 that also has a match in the list of substrings a.

Once such a match is found, I want the search to stop looking for any more matches.

I tried this:

string1 = "This is a string containing many words"
a = ['This', 'containing', 'many']

if any(x in string1 for x in a):
    print(a)
else:
    print("Nothing found")

The above code prints the entire list of substrings. In other words, it checks whether any of the substrings in the list a occurs in string1, and if so, it prints the whole list rather than the substring that matched.

I've also tried looking up the string find() method, but I can't work out exactly how to use it in my case.

To word it EXACTLY, I'm looking for: the first WORD in the bigger string that matches any word in the list of substrings, and to print that word.

or

to find WHICHEVER SUBSTRING (stored in a list of SUBSTRINGS) appears first in a BIGGER STRING, and PRINT that particular SUBSTRING.

You could use a set membership check + next here.

>>> a = {'This', 'containing', 'many'}
>>> next((v for v in string1.split() if v in a), 'Nothing Found!')
'This'

This should give you O(N) performance in the number of words: set membership tests are constant time on average, and next stops at the first matching word. Note that string1.split() still processes the whole string up front.

I think this can be done without splitting string1, by checking each element of the list against it instead. On the first match, use break to stop the loop.

string1 = "This is a string containing many words"
a = ['This', 'containing', 'many']

for x in a:
    if x in string1:
        print(x)
        break
else:
    print("Nothing found")

Using a list comprehension:

l=[x for x in a if x in string1]
if l:
    print(l[0])
else:
    print("Nothing found")
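One caveat worth checking with the two snippets above: they iterate over the list a, so they report the first match in list order, which is not necessarily the match that appears first in string1. The sketch below contrasts the two orderings. The function names and the sample string are made up for illustration and are not part of the original answer.

```python
a = ['This', 'containing', 'many']
# Hypothetical sample string in which 'many' appears before 'This'.
string1 = "kappa pride pogchamp many words containing This"


def first_in_list_order(substrings, text):
    """Return the first element of `substrings` that occurs anywhere in `text`."""
    for sub in substrings:
        if sub in text:
            return sub
    return None


def first_in_text_order(substrings, text):
    """Return the first whitespace-separated word of `text` that is in `substrings`."""
    wanted = set(substrings)
    return next((word for word in text.split() if word in wanted), None)


print(first_in_list_order(a, string1))  # This
print(first_in_text_order(a, string1))  # many
```

If the goal is the match that comes first in the bigger string, iterating over the words of string1 (as in the second function) is the safer choice.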

You can use re here.

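import re

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

# Alternation of the escaped words, anchored on word boundaries.
pattern = r"\b(?:" + "|".join(map(re.escape, a)) + r")\b"
match = re.search(pattern, string1)
print(match.group() if match else "Nothing found")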

Output:

containing


import timeit

# Timing comparison: split() + set + next versus the regex search.
s = """
a = ['This', 'containing', 'many']
a = set(a)
string1 = 'is a string containing many words This '
c = next((v for v in string1.split() if v in a), 'Nothing Found!')
"""
s1 = """
a = ['This', 'containing', 'many']
string1 = "is a string containing many words This "
re.search(r"\\b(?:" + "|".join(a) + r")\\b", string1)
"""
print(timeit.timeit(stmt=s, number=1000000))
print(timeit.timeit(stmt=s1, number=1000000, setup="import re"))

There are two ways you could approach this. One is to use the

string1.find('substring')

method, which returns the index of the first occurrence of 'substring' in string1, or -1 if 'substring' does not occur in string1. By iterating over the list of search terms a, you would get a collection of indices, each corresponding to one word in your list. The smallest value other than -1 in that collection would be the index of your first word. This is more involved, but it does not require an explicit loop over the actual string.
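A minimal sketch of the find()-based idea described above, under the assumption that plain substring matches are acceptable (find() does not respect word boundaries). The helper name earliest_substring is hypothetical and chosen only for illustration.

```python
def earliest_substring(substrings, text):
    """Return the element of `substrings` whose first occurrence in `text`
    has the smallest index, or None if none of them occurs."""
    best_index, best_sub = -1, None
    for sub in substrings:
        index = text.find(sub)  # str.find returns -1 when `sub` is absent
        if index != -1 and (best_index == -1 or index < best_index):
            best_index, best_sub = index, sub
    return best_sub


a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"
print(earliest_substring(a, string1))  # containing
```

Every search term is checked here, so the work grows with the length of a. When two terms start at the same index, the one listed first in a wins.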

Another alternative would be to use

string1.split(' ')

to create a list of all of the words in the string. Then you could go through this list with a for loop and check whether each word from string1 matches any of the items in a. This would be a great learning opportunity to try on your own, but let me know if I was too vague or if code would be more helpful.

Hope this helps!

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

break is the better option, but that solution has already been posted, so I wanted to show that you can do it with a slice too:

print("".join([item for item in string1.split() if item in a][:1]))

The list comprehension above is equivalent to:

new=[]
for item in string1.split():
    if item in a:
        new.append(item)

print("".join(new[:1]))

a = ['This', 'containing', 'many']
string1 = "kappa pride pogchamp containing this string this many words"

newList = string1.split(" ")
for i in newList:
    if i in a:
        print(i)
        break

This will do.

For more, read this: https://docs.python.org/2/library/string.html
