
Specific word search in long text in Python

Very basic question but is there a way for me to extract a string in a list that contains a word that I want? Something like:

wordNeeded=str(input("blue or red?"))

list1=["A blue car", "A blue bike", "A red bike"]

and then it'll extract the strings which contain the exact word in wordNeeded?

Among other ways, you could use a list comprehension:

list1 = ["A blue car", "A blue bike", "A red bike"]
result = [item for item in list1 if wordNeeded in item]
print(result)
# ['A red bike'] if wordNeeded is "red"

Alternatively, you could look into filter in combination with a lambda function:

result = filter(lambda x: wordNeeded in x, list1)
print(list(result))

The latter is more complicated in this case but yields the same result.
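To make the comparison concrete, here is a small sketch that runs both forms on the sample list. The hard-coded `word_needed = "blue"` stands in for the `input()` call from the question and is only an assumption for illustration.

```python
list1 = ["A blue car", "A blue bike", "A red bike"]
word_needed = "blue"  # assumed value; the question reads it from input()

# List comprehension: builds the result list directly.
by_comprehension = [item for item in list1 if word_needed in item]

# filter() returns a lazy iterator, so list() is needed to materialize it.
by_filter = list(filter(lambda item: word_needed in item, list1))

print(by_comprehension)
print(by_filter)
```

Both print the same two "blue" items; the comprehension is usually the easier one to read for a simple test like this.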


As for exact words, you either need to split each item into words first (optionally lowercasing both sides so the comparison is case-insensitive):

wordNeeded = "blue"
list1 = ["A blue car", "A blue bike", "A red bike", "bluebells are cool."]
result = [item for item in list1 if any(wordNeeded.lower() == x.lower() for x in item.split())]
print(result)
# ['A blue car', 'A blue bike']
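One caveat with splitting on whitespace is that punctuation stays attached to the token, so "blue." would not equal "blue". A minimal sketch of one way around that, assuming it is acceptable to strip leading and trailing punctuation from each token (the helper name `contains_word` and the extra sample sentences are made up for this example):

```python
import string

def contains_word(text, word):
    """Return True if `word` occurs in `text` as a whole token, ignoring case."""
    target = word.lower()
    # Strip surrounding punctuation so "blue," and "blue." still count as "blue".
    return any(
        token.strip(string.punctuation).lower() == target
        for token in text.split()
    )

items = ["A blue car", "Paint it blue.", "bluebells are cool.", "A red bike"]
matches = [item for item in items if contains_word(item, "Blue")]
print(matches)
# ['A blue car', 'Paint it blue.']
```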

Or use a regular expression with word boundaries altogether:

import re
# re.escape keeps regex metacharacters in the user's input from being interpreted
rx = re.compile(r'\b{}\b'.format(re.escape(wordNeeded)), flags=re.I)
result = [item for item in list1 if rx.search(item)]
print(result)
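For reuse, the word-boundary search can be wrapped in a small function. This is only a sketch: the function name `find_items_with_word` and the sample data are assumptions, and `\b` behaves as expected only when the search word starts and ends with a word character (letters, digits, underscore).

```python
import re

def find_items_with_word(items, word):
    """Return the items that contain `word` as a whole word, ignoring case."""
    # re.escape treats the word literally even if it contains characters like "." or "*".
    pattern = re.compile(r"\b{}\b".format(re.escape(word)), flags=re.IGNORECASE)
    return [item for item in items if pattern.search(item)]

items = ["A blue car", "A BLUE bike", "A red bike", "bluebells are cool."]
print(find_items_with_word(items, "blue"))
# ['A blue car', 'A BLUE bike']
```

"bluebells are cool." is skipped because there is no word boundary between "blue" and "bells".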

You can use a for loop like this:

for item in list1:
    if wordNeeded in item:
        ...
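A complete version of such a loop might look like the sketch below, which collects the matches in a list. The fixed `word_needed = "red"` replaces the `input()` call and is an assumption for the example; note that `in` does a substring test, not a whole-word test.

```python
list1 = ["A blue car", "A blue bike", "A red bike"]
word_needed = "red"  # assumed value; the question reads it from input()

matches = []
for item in list1:
    # Substring test: this would also match words that merely contain "red".
    if word_needed in item:
        matches.append(item)

print(matches)
# ['A red bike']
```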

The actual word search is pretty simple and has been discussed plenty of times:

Python - Check If Word Is In A String

https://www.geeksforgeeks.org/python-string-find/

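NO_OF_CHARS = 256  # assumes every character has a code point below 256

def printList(items, word, list_size):
    # Prints every item that contains all characters of `word`.
    # This is a character-level check, not a whole-word match.
    # list_size is unused; it is kept so the call below stays unchanged.
    char_map = [0] * NO_OF_CHARS

    # Mark each character of the word.
    for ch in word:
        char_map[ord(ch)] = 1

    # Count distinct characters so repeated letters do not break the comparison.
    word_size = len(set(word))
    for item in items:
        count = 0
        for ch in item:
            if char_map[ord(ch)]:
                count += 1
                # Clear the mark so a repeated character is counted only once.
                char_map[ord(ch)] = 0
        if count == word_size:
            print(item)

        # Restore the marks for the next item.
        for ch in word:
            char_map[ord(ch)] = 1

printList(list1, wordNeeded, len(list1))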
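The same character-level check can be sketched more compactly with sets. This is an alternative formulation, not the code from the linked page; the function name `items_with_all_chars` and the extra sample item are assumptions. It also makes visible that matching characters is weaker than matching the word itself.

```python
def items_with_all_chars(items, word):
    """Return items that contain every distinct character of `word`, in any order."""
    needed = set(word)
    # `needed <= set(item)` is True when every needed character occurs in the item.
    return [item for item in items if needed <= set(item)]

items = ["A blue car", "A blue bike", "A red bike", "bulbous eel"]
print(items_with_all_chars(items, "blue"))
# ['A blue car', 'A blue bike', 'bulbous eel']
```

"bulbous eel" is returned even though it never contains the word "blue", so this approach is not a substitute for the whole-word checks.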


 