
Loop through two lists to find whether an element from the first list exists in the second list

I've got two lists. One is a list of languages, and the second is a list of strings. For each string, I would like to check whether it contains any of the languages and, if so, append the found language to a new list; otherwise, append "english" to that new list instead.

langs = ['afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish', ...]

texts = ['apple', 'orange in polish', 'grape in russian']

The desired output:

['english', 'polish', 'russian']

I first tried these lines, but they return ['polish', 'russian']:

list_of_valid_langs = []
for lang in langs:
    for text in texts:
        if lang in text:
            list_of_valid_langs.append(lang)

For my second attempt, I added a second condition, but this is not what I need either:

list_of_valid_langs = []
for lang in langs:
    for text in texts:
        if lang in text:
            list_of_valid_langs.append(lang)
        elif lang not in text:
            list_of_valid_langs.append('english')

I think your mistake is iterating over the languages first and then over the texts. You want exactly one result per text, so let's try flipping it around:

list_of_valid_langs = []
for text in texts:
    for lang in langs:
        if lang in text:
            list_of_valid_langs.append(lang)
            break  # lang is found, no need to keep searching
    else:  # if no lang was found, append 'english'
        list_of_valid_langs.append('english')

After seeing @fsimonjetz's answer, I found a better solution using sets:

# first of all, turn langs into a set
langs = set(langs)

list_of_valid_langs = []

# iterate over the texts
for text in texts:
    # check if one of the words in the text is a language
    for word in text.split():
        if word in langs:
            # if a language is found, append it and break
            list_of_valid_langs.append(word)
            break
    else:
        # if no language is found, append 'english'
        list_of_valid_langs.append('english')
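
Note that the two loops above do not match in quite the same way: `lang in text` is a substring test, while `word in langs` only matches whole whitespace-separated words. The sketch below illustrates the difference; the helper names `first_lang_substring` and `first_lang_word`, the shortened language list, and the sample text are made up for this illustration and are not part of the question.

```python
langs = ['polish', 'russian']  # shortened sample list, for illustration only

def first_lang_substring(text, langs):
    """Return the first language that occurs anywhere in text (substring match)."""
    for lang in langs:
        if lang in text:
            return lang
    return 'english'

def first_lang_word(text, langs):
    """Return the first whitespace-separated word of text that is a language."""
    lang_set = set(langs)
    for word in text.split():
        if word in lang_set:
            return word
    return 'english'

text = 'a polished apple'  # hypothetical input
print(first_lang_substring(text, langs))  # polish ('polish' is inside 'polished')
print(first_lang_word(text, langs))       # english (no whole word is a language)
```

Which behaviour is preferable depends on the data; for the sample texts in the question both give the same result.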

A note about for/else: the body of the for loop runs as usual, but the else block only runs if the loop ran to completion, that is, if it ended because it ran out of items. Another way to look at it is that the else block is skipped whenever the break statement is reached.
If you prefer, you can replace the for/else with a normal for loop followed by an if block that checks a bool variable.
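
As a minimal sketch of that equivalence, the two functions below should behave identically; the function names and the sample inputs are invented for this illustration.

```python
def find_with_for_else(items, target):
    for item in items:
        if item == target:
            result = 'found'
            break
    else:
        # runs only when the loop finished without hitting break,
        # including the case where items is empty
        result = 'not found'
    return result

def find_with_flag(items, target):
    found = False  # bool variable standing in for the else block
    for item in items:
        if item == target:
            found = True
            break
    if not found:
        return 'not found'
    return 'found'

print(find_with_for_else(['a', 'b'], 'b'))  # found
print(find_with_flag(['a', 'b'], 'c'))      # not found
```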

I think Roy Cohen's answer is the perfect solution to your problem, but I'd like to suggest an alternative using set intersection. Membership tests on a set take constant time on average, so this should be more efficient when the list of languages is long:

languages = {'afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish'}
texts = ['apple', 'orange in polish', 'grape in russian']

list_of_valid_langs = []

for t in texts:
    # this will return the set of the language(s) occurring in the
    # string if there are any, otherwise it returns {'english'}
    lang = set(t.split()).intersection(languages) or {'english'}

    # pop an element from the set and append it to the list
    # (if a text contains several languages, which one is popped is arbitrary)
    list_of_valid_langs.append(lang.pop())
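
One caveat with `pop()`: when a text contains more than one language, the set has no defined order, so the language that gets appended is arbitrary. If the first language in reading order is wanted, one possible variant (a sketch, not part of the answer above) uses `next()` with a default; the fourth sample text is a hypothetical addition to show the multi-language case.

```python
languages = {'afrikaans', 'russian', 'amharic', 'japanese', 'armenian', 'polish'}
# the last text is a made-up example containing two languages
texts = ['apple', 'orange in polish', 'grape in russian', 'polish or russian']

list_of_valid_langs = [
    # first word of the text that is a known language, else 'english'
    next((word for word in t.split() if word in languages), 'english')
    for t in texts
]

print(list_of_valid_langs)
# ['english', 'polish', 'russian', 'polish']
```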

This should work:

list_of_valid_langs = []

for text in texts:
    lang_found = False

    for lang in langs:
        if lang in text:
            list_of_valid_langs.append(lang)
            lang_found = True
            break  # stop at the first match so each text adds one entry

    if not lang_found:
        list_of_valid_langs.append('english')
