
Python - Check if there's only one element of multiple lists in a string

The following code lets me check whether only one element of the lists appears in ttext.

from itertools import product, chain
from string import punctuation

list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']

l = [list1, list2, list3]

def test(l, tt):
    counts = {word.strip(punctuation):0 for word in tt.split()}
    for word in chain(*product(*l)):
        if word in counts:
            counts[word] += 1
        if sum(v > 1 for v in counts.values())  > 1:
            return False
    return True

Output:

In [16]: ttext = 'hello my name is brian'
In [17]: test(l,ttext)
Out[17]: True
In [18]: ttext = 'hello how are you?'
In [19]: test(l,ttext)
Out[19]: False

Now, how can I do the same if the elements of the lists contain spaces, such as "I have", "you are" and "he is"?

You could add a list comprehension that goes through and splits all the words:

def test(l, tt):
    counts = {word.strip(punctuation):0 for word in tt.split()}
    splitl = [[word for item in sublist for word in item.split(' ')] for sublist in l]
    for word in chain(*product(*splitl)):
        if word in counts:
            counts[word] += 1
        if sum(v > 1 for v in counts.values())  > 1:
            return False
    return True

You can simplify a lot by just concatenating the lists using '+' rather than having a list of lists. This code also works if a string has spaces in it.

import string

list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']

l = list1 + list2 + list3

def test(l, tt):
    # set of all punctuation to exclude (built once, outside the loop)
    exclude = set(string.punctuation)
    count = 0
    for word in l:
        # remove punctuation from word
        word = ''.join(ch for ch in word if ch not in exclude)
        # substring check, so items containing spaces are handled too
        if word in tt:
            count += 1
    return count <= 1

You may consider using sets for this kind of processing.

Here is a quick implementation:

from itertools import chain
from string import punctuation

list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = list(chain(list1, list2, list3))

words = set(w.strip(punctuation) for word in l for w in word.split())  # 1

def test(words, text):
    text_words = set(word.strip(punctuation) for word in text.split())  # 2
    return len(words & text_words) == 1  # 3

A few comments:

  1. The double for loop in the comprehension splits every item into its individual words. The set makes sure each word is unique.
  2. The same is done for the input sentence.
  3. The set intersection gives all words in the sentence that are also in your search set. The length of that set tells you whether there is exactly one.

You could just split every item in the lists by iterating through them. Something like:

words = []

for sublist in l:
    for item in sublist:
        # split() returns a list, so extend (not append) keeps words flat
        words.extend(item.split())

Well, first, let's rewrite the function to be more natural:

from itertools import chain

def only_one_of(lists, sentence):
  found = None
  for item in chain(*lists):
    if item in sentence:
      if found is not None:
        return False  # a second item matched
      found = item
  return found is not None

This already works with your constraints, since it only checks whether each string item is a substring of sentence. It does not matter whether the item includes spaces. But it may lead to unexpected results. Imagine:

list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']

l = [list1, list2, list3]

only_one_of(l, 'Cadabra')

This returns True because abra is a substring of Cadabra. If this is what you want, then you're done. But if not, you need to redefine what item in sentence really means. So, let's redefine our function:

def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
  found = None
  for item in chain(*lists):
    if is_in(item, sentence):
      if found is not None:
        return False  # a second item matched
      found = item
  return found is not None

Now the last parameter is expected to be a function that takes two strings and returns True if the first is found in the second, or False otherwise.
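
To illustrate the extra parameter, here is a minimal sketch that plugs in a different predicate. It assumes the function above with its last line written as `return found is not None`; `is_in_ignoring_case` is a hypothetical helper made up for this example, and the two short lists are taken from the question.

```python
from itertools import chain

def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
    found = None
    for item in chain(*lists):
        if is_in(item, sentence):
            if found is not None:
                return False  # a second item matched
            found = item
    return found is not None

# Hypothetical predicate: case-insensitive substring match.
def is_in_ignoring_case(item, sentence):
    return item.lower() in sentence.lower()

lists = [['abra', 'hello', 'cfre'], ['dacc', 'ex', 'you', 'fboaf']]

print(only_one_of(lists, 'Hello there'))                       # False: default match is case-sensitive
print(only_one_of(lists, 'Hello there', is_in_ignoring_case))  # True: 'hello' matches once
```

Any callable that takes (item, sentence) and returns a truthy or falsy value can be passed the same way.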

You usually want to check if the item is inside the sentence as a word (but a word that can contain spaces in the middle) so let's use regular expressions to do that:

import re

def inside(string, sentence):
  # Escape the item so regex metacharacters in it are matched literally.
  return re.search(r'\b%s\b' % re.escape(string), sentence) is not None

This function returns a truthy value when string is in sentence as a whole word (the special sequence \b in a regular expression stands for a word boundary).
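
A small sketch of how the word boundary behaves may help here. The `re.escape` call and the `is not None` comparison are additions assumed for this example (they make the item literal and the result a plain bool), and the sample sentences are made up.

```python
import re

def inside(item, sentence):
    # re.escape makes regex metacharacters in item match literally.
    return re.search(r'\b%s\b' % re.escape(item), sentence) is not None

print(inside('ex', 'what comes next'))       # False: 'ex' is only part of 'next'
print(inside('ex', 'my ex-boyfriend'))       # True: '-' is not a word character
print(inside('you are', 'so you are here'))  # True: a phrase with a space still matches
print(inside('a.c', 'abc'))                  # False: the '.' is literal after escaping
```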

So, the following code should meet your constraints:

import re
from itertools import chain

def inside(string, sentence):
  # Escape the item so regex metacharacters in it are matched literally.
  return re.search(r'\b%s\b' % re.escape(string), sentence) is not None

def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
  found = None
  for item in chain(*lists):
    if is_in(item, sentence):
      if found is not None:
        return False  # a second item matched
      found = item
  return found is not None

list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
list4 = ['I have', 'you are', 'he is']

l = [list1, list2, list3, list4]

only_one_of(l, 'hello my name is brian', inside) # True
only_one_of(l, 'hello how are you?', inside) # False
only_one_of(l, 'Cadabra', inside) # False
only_one_of(l, 'I have a sister', inside) # True
only_one_of(l, 'he is my ex-boyfriend', inside) # False, ex and boyfriend are two words
only_one_of(l, 'he is my exboyfriend', inside) # True, exboyfriend is only one word
