List mismatched words from two lists in Python

I want to print the words that do not appear in both sentences. In addition, if a word appears twice in the first sentence but only once in the second, I want to print that word too. For example:

a = "The winter winter season is cold"
b = "The summer winter season is hot"

Expected output: {'winter','cold','summer','hot'}

I tried using a set in Python, but it gives me this output instead: {'hot', 'cold', 'summer'}

def uncommonwords(a,b):
    listA = a.split()
    listB = b.split()
    listC = listA +listB
    return set(listC) - set(listA).intersection(listB)
print(uncommonwords(a,b))

What's happening is:

  1. You are converting the lists to sets, which drops duplicate words (winter occurs only once in set(listA)).
  2. Because winter then appears in both sets, it is part of the intersection and gets subtracted from the result, so it is not displayed.
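
As a minimal sketch of the two points above, the snippet below prints the intermediate sets for the question's sentences. The variable names set_a, set_b, common and result are only illustrative and are not part of the original code.

```python
a = "The winter winter season is cold"
b = "The summer winter season is hot"

set_a = set(a.split())
set_b = set(b.split())

# set() keeps each word once, so the repeated 'winter' in a collapses to one entry
print(len(a.split()), len(set_a))  # 6 5

# 'winter' is now in both sets, so it lands in the intersection
common = set_a & set_b
print(sorted(common))  # ['The', 'is', 'season', 'winter']

# subtracting the intersection therefore removes 'winter' from the result
result = (set_a | set_b) - common
print(sorted(result))  # ['cold', 'hot', 'summer']
```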

You also need the words that appear twice in a and once in b. When you work with sets, you lose the count of each word in the sentence, so you can use a dict instead:

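# Count whole words on the split lists; str.count on the raw string
# would count substrings (e.g. "is" inside "this").
wordsA = a.split()
wordsB = b.split()
dictA = {x: wordsA.count(x) for x in wordsA}
dictB = {x: wordsB.count(x) for x in wordsB}

# .get avoids a KeyError for words that are missing from b
[x for x in dictA if dictA[x] == 2 and dictB.get(x) == 1]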

Output

['winter']

You can combine the whole thing into a single expression:

([x for x in dictA.keys() if (x not in dictB.keys()) or (dictA[x]==2 and dictB[x]==1)]
 + [x for x in dictB.keys() if (x not in dictA.keys())])

Output

['winter', 'cold', 'summer', 'hot']
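
The comprehension above hard-codes the counts 2 and 1. If the intent is to report any word whose count differs between the two sentences (an assumption that goes slightly beyond the question), a plain-dict sketch could look like this. The helper names count_words and mismatched_words are hypothetical.

```python
def count_words(sentence):
    """Return a dict mapping each whole word to its number of occurrences."""
    counts = {}
    for word in sentence.split():
        counts[word] = counts.get(word, 0) + 1
    return counts


def mismatched_words(a, b):
    dict_a = count_words(a)
    dict_b = count_words(b)
    # words from a that are missing from b or occur a different number of times
    result = [w for w in dict_a if dict_a[w] != dict_b.get(w, 0)]
    # words that occur only in b
    result += [w for w in dict_b if w not in dict_a]
    return result


a = "The winter winter season is cold"
b = "The summer winter season is hot"
print(mismatched_words(a, b))  # ['winter', 'cold', 'summer', 'hot']
```

Unlike the hard-coded version, this also reports a word that occurs, say, three times in one sentence and twice in the other.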

Simply use Counter from the standard collections module:

from collections import Counter
a = "The winter winter season is cold" 
b = "The summer winter season is hot"

def uncommonwords(a, b):
    countA = Counter(a.split())
    countB = Counter(b.split())

    return list((countA - countB) + (countB - countA))

print(uncommonwords(a,b))

And your output will be

['winter', 'cold', 'summer', 'hot']

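Here we count each word with Counter, subtract the counts in both directions, and add the two differences together. Counter subtraction keeps only positive counts, so what remains are the words that occur more often in one sentence than in the other.

You could try this too: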
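
To see what each subtraction contributes, here is a small sketch that prints the two intermediate Counter objects for the same sentences. The names count_a, count_b, only_a and only_b are illustrative only.

```python
from collections import Counter

count_a = Counter("The winter winter season is cold".split())
count_b = Counter("The summer winter season is hot".split())

# Counter subtraction drops every word whose resulting count is zero or negative
only_a = count_a - count_b  # surplus occurrences in the first sentence
only_b = count_b - count_a  # surplus occurrences in the second sentence

print(dict(only_a))  # {'winter': 1, 'cold': 1}
print(dict(only_b))  # {'summer': 1, 'hot': 1}

# adding the two differences merges them into one Counter
print(list(only_a + only_b))  # ['winter', 'cold', 'summer', 'hot']
```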

a = "The winter winter season is cold"
b = "The summer winter season is hot"


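def uncommonwords(a, b):
    a = a.split()
    b = b.split()
    # words repeated within either sentence
    # (note: this also picks up a word that is repeated in both sentences)
    h = set([x for x in a if a.count(x) > 1] + [x for x in b if b.count(x) > 1])
    # words that appear in exactly one of the two sentences
    k = set(a).symmetric_difference(set(b))
    return set.union(k, h)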


print(uncommonwords(a, b))

Or, if you have iteration_utilities installed (from iteration_utilities import duplicates), then instead of set([x for x in a if a.count(x) > 1] + [x for x in b if b.count(x) > 1]) you can use set(list(duplicates(a)) + list(duplicates(b))).
