
Most clear, Pythonic, reliable, and fastest way to check if a string contains words from a list of lists

I am looking for the clearest, most Pythonic, and fastest way to check whether a string contains words from a list of lists.

This is what I have come up with so far:

introStrings = ['introduction:' , 'case:' , 'introduction' , 'case' ]
backgroundStrins = ['literature:' , 'background:',  'Related:' , 'literature' , 'background',  'related' ]
methodStrings = [ 'methods:' , 'method:', 'techniques:', 'methodology:' , 'methods' , 'method', 'techniques', 'methodology' ]
resultStrings = [ 'results:', 'result:', 'experimental:', 'experiments:', 'experiment:', 'results', 'result', 'experimental', 'experiments', 'experiment']
discussioStrings = [ 'discussion:' , 'Limitations:'  , 'discussion' , 'limitations']
conclusionStrings = ['conclusion:' , 'conclusions:', 'concluding:' , 'conclusion' , 'conclusions', 'concluding' ]

allStrings = [ introStrings, backgroundStrins, methodStrings, resultStrings, discussioStrings, conclusionStrings ]

testtt = 'this may thod be in techniques ever material and methods'

for item in allStrings:
    for word in testtt.split():
        if word in item:
            print('yes')
            break

This code checks every word of the string against every sublist. It's a nested for loop, and it is not easy to follow at first glance.

I am wondering if there is a better way.

It would be more Pythonic to use any() with a nested generator expression:

print(any(word in sublist for word in testtt.split() for sublist in allStrings))

However, this only returns True or False; it won't identify which word was found in which sublist. You can print the specific matches with this list comprehension:

print([(word, sublist) for word in testtt.split() for sublist in allStrings if word in sublist])

Your code is a bit wasteful because it calls testtt.split() more than once.
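As a minimal sketch of the point above, the string can be split once and the resulting list reused for both the yes/no check and the list of matches. The shortened keyword lists and the names words, contains_any and matches are illustrative assumptions, not part of the original code.

```python
# Shortened keyword lists, for illustration only
method_strings = ['methods:', 'method:', 'techniques', 'methods']
result_strings = ['results:', 'result:', 'results', 'result']
all_strings = [method_strings, result_strings]

testtt = 'this may thod be in techniques ever material and methods'

# Split once and reuse the result in every check
words = testtt.split()

# True if at least one word appears in at least one sublist
contains_any = any(word in sublist for sublist in all_strings for word in words)

# (word, index of the sublist it was found in) for every match
matches = [
    (word, i)
    for i, sublist in enumerate(all_strings)
    for word in words
    if word in sublist
]

print(contains_any)
print(matches)
```

Reporting the sublist index rather than the sublist itself keeps the output short; swapping i for sublist in the tuple gives the same shape as the comprehension above.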

"I am looking for the most clear, Pythonic, and fastest way to check if a string contains words from a list of lists"

First, I'd flatten the lists:

all_strings = [*intro, *back, *methods, ...] # You get the idea

(Alternatively, using a nested list comprehension:)

all_strings = [word for sublist in [intro, back, ...] for word in sublist] # if you're into that; avoids shadowing the built-in list

Next, split the string:

string_words = a_string.split()

Finally, just look up the words:

found = [w for w in string_words if w in all_strings]

That's quite Pythonic, though I'm not sure about its speed or reliability.
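On the speed question, one possible variation is to flatten into a set instead of a list, since a membership test on a set is constant time on average while a list is scanned linearly. The sketch below assumes two shortened keyword lists named intro and methods; the name keywords is also hypothetical.

```python
# Shortened keyword lists, for illustration only
intro = ['introduction:', 'introduction']
methods = ['methods:', 'methods', 'techniques']

# Flatten into a set: "w in keywords" is O(1) on average
keywords = {word for sublist in (intro, methods) for word in sublist}

a_string = 'this may thod be in techniques ever material and methods'
string_words = a_string.split()

# Keep the words of the string, in order, that are known keywords
found = [w for w in string_words if w in keywords]
print(found)
```

For a handful of short lists the difference is unlikely to matter; the set mainly pays off when the keyword collection or the text is large.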

The best I can come up with uses chain and any:

resultStrings = [
    "results:",
    "result:",
    "experimental:",
    "experiments:",
    "experiment:",
    "results",
    "result",
    "experimental",
    "experiments",
    "experiment",
]
conclusionStrings = [
    "conclusion:",
    "conclusions:",
    "concluding:",
    "conclusion",
    "conclusions",
    "concluding",
]

allStrings = [resultStrings, conclusionStrings]
testtt = "this may thod be in techniques ever material and methods"

from itertools import chain
string_set = set(chain(*allStrings))
any(i in string_set for i in testtt.split())

Although the set needs some extra memory, it makes the lookups more efficient. Thanks, Peter Wood.
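If it also matters which group a word belongs to, a dict gives the same fast lookup as a set while remembering the group. This is only a sketch: the section labels, the names section_keywords, word_to_section and find_sections, and the decision to ignore case and a trailing colon are all assumptions, not something stated in the question.

```python
# Hypothetical section labels mapped to shortened keyword lists
section_keywords = {
    'methods': ['methods', 'method', 'techniques', 'methodology'],
    'results': ['results', 'result', 'experimental', 'experiments', 'experiment'],
}

# Invert the mapping once: keyword -> section label
word_to_section = {
    word: section
    for section, words in section_keywords.items()
    for word in words
}


def find_sections(text):
    """Return (word, section) pairs for every keyword found in text."""
    hits = []
    for raw in text.split():
        # Assumption: matching ignores case and a trailing colon
        word = raw.lower().rstrip(':')
        if word in word_to_section:
            hits.append((word, word_to_section[word]))
    return hits


print(find_sections('Material and Methods: see the Results'))
```

Normalising each word before the lookup means variants such as 'methods:' and 'Methods' no longer need separate entries in the keyword lists.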

Using itertools:

import itertools
merged = list(itertools.chain.from_iterable(allStrings))
# Print every word of the string that appears in the merged keyword list
for x in testtt.split():
    if x in merged:
        print(x)
