
Categorize each sentence based on the words it contains, for multiple sentences

data = ["my web portal is not working","online is better than offline", "i like going to pharmacy shop for medicines"]
words = ["web", "online"]

I want to iterate over the sentences and check whether any of the words in the list words is present in each one. If so, I want a single category for that sentence; otherwise the category should be "others". My code works when I pass a single word from the words list, but I want to check all the words in a single run.

b = []
def ch_1(x,y):
    for i in x:
        if y in i:
            b.append("web")
        else:
            b.append("others")
    return b

I get this error:

TypeError: 'in <string>' requires string as left operand, not list

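You need to loop through both of the lists passed as parameters. Note that the else below belongs to the inner for loop: it runs only when that loop finishes without a break, that is, when none of the words was found in the sentence.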

def ch_1(x,y):
    b = []
    for i in x:
        for j in y:
            if j in i:
                b.append('web')
                break
        else:
            b.append('others')
    return b
print(ch_1(data, words))

Output

['web', 'web', 'others']
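
The for/else construct used above can be surprising, so here is a minimal sketch of it in isolation. The helper name first_match is hypothetical and not part of the original answer; it only illustrates that the else clause is skipped whenever the loop exits through break.

```python
def first_match(sentence, words):
    """Return the first word found in sentence, or None (hypothetical helper)."""
    for word in words:
        if word in sentence:
            found = word
            break          # leaving the loop early skips the else clause
    else:
        found = None       # runs only when the loop ended without a break
    return found


print(first_match("my web portal is not working", ["web", "online"]))
print(first_match("i like going to pharmacy shop", ["web", "online"]))
```

The first call prints web and the second prints None, because only the second loop runs to completion.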

Use the in operator to check whether a string contains a substring:

data = ["my web portal is not working","online is better than offline", "i like going to pharmacy shop for medicines"]
words = ["web", "online"]

def ch_1(x,y):
    b = []
    for i in x:
        web = False
        for j in y:
            if j in i:
                web = True
                break
        if web:
            b.append("web")
        else:
            b.append("others")
    return b

print(ch_1(data,words))

Output:

['web', 'web', 'others']

This code works for any number of words in words and any number of sentences in data. It records which words matched each sentence instead of a single fixed label:

data = [
    "my web portal is not working",
    "online is better than offline",
    "i like going to pharmacy shop for medicines"
]

words = ["web", "online"]


def ch_1(words, data):
    categories = {sentence: [] for sentence in data}
    for sentence in data:
        for word in words:
            if word in sentence:  # add "and not categories[sentence]" to keep exactly one category per sentence
                categories[sentence].append(word)
    for sentence in categories:
        if categories[sentence] == []:
            categories[sentence].append('others')
    return categories

print(ch_1(words, data))

Output (the key order may differ depending on the Python version):

{ 'i like going to pharmacy shop for medicines': ['others'], 'online is better than offline': ['online'], 'my web portal is not working': ['web'] }
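
If exactly one category per sentence is wanted, as the question asks, one possible variation is to keep only the first matching word. This is a sketch under that assumption, not the answer's own code; the function name categorize and its default parameter are hypothetical.

```python
def categorize(sentences, words, default="others"):
    """Map each sentence to the first matching word, or to default (hypothetical helper)."""
    return {
        # next() returns the first word found in the sentence,
        # or the default when the generator yields nothing
        sentence: next((word for word in words if word in sentence), default)
        for sentence in sentences
    }


data = [
    "my web portal is not working",
    "online is better than offline",
    "i like going to pharmacy shop for medicines",
]
words = ["web", "online"]

print(categorize(data, words))
```

Because the result is a dict keyed by sentence, duplicate sentences in data collapse into one entry; return a list instead if duplicates matter.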

In your statement

if y in i:

y is a list. You don't show how you called ch_1, but I'm assuming you used ch_1(data, words). The parameter y is therefore ["web", "online"], and you are trying to find the whole list in i, which is a string. So you get the message

TypeError: 'in <string>' requires string as left operand, not list

because it expects y to be a string that it can search for in the string i. Passing it a list doesn't make sense. If you used y[0] in i or y[1] in i, you would correctly be providing a string to look for in i.
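
The explanation above can be reproduced in a few lines. This is a small sketch, assuming the function was called as ch_1(data, words); the variable names error_message and matches are introduced here for illustration only.

```python
sentence = "my web portal is not working"
words = ["web", "online"]

# Testing a list against a string raises TypeError,
# because str only accepts a string on the left of "in"
try:
    words in sentence
except TypeError as exc:
    error_message = str(exc)
    print("TypeError:", error_message)

# Testing each word separately works; any() stops at the first hit
matches = any(word in sentence for word in words)
print(matches)
```

The second print shows True, since "web" occurs in the sentence.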

Try a list comprehension of the form [conditional_expression for sentence in data]:

data = [
    "my web portal is not working",
    "online is better than offline",
    "i like going to pharmacy shop for medicines",
]
words = ["web", "online"]

["web" if any(word in sentence for word in words) else "others" for sentence in data]
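
One caveat with the comprehension above: in performs substring matching, so "web" also matches inside a longer word such as "cobweb". If whole-word matching is what you want, a possible sketch is to split each sentence on whitespace first. The function name categorize_whole_words and its label and default parameters are hypothetical, and punctuation handling is deliberately left out.

```python
def categorize_whole_words(sentences, words, label="web", default="others"):
    """Label a sentence only when it contains one of the words as a whole token."""
    keywords = set(words)
    return [
        # isdisjoint() is True when no token of the sentence is a keyword
        default if keywords.isdisjoint(sentence.split()) else label
        for sentence in sentences
    ]


samples = [
    "my web portal is not working",
    "a cobweb in the corner",
    "online is better than offline",
]
words = ["web", "online"]

print(categorize_whole_words(samples, words))
```

Here "a cobweb in the corner" falls into "others", whereas the substring check would have labelled it "web".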
