
How to check if a set of results exactly matches any list of strings in a list of lists in Python

As the title says, I am trying to get an exact match against any list of strings in a list of lists. I'm finding it hard to explain, so I'll show the code.

List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]

This is an example of the list's layout. I then want to pass in information from a web scrape and see whether it matches item[0], item[1] and item[2] of any entry in the list. The problem I am having is that the web scrape uses a for loop:

HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:

The next part also uses a for loop:

    for item in List:
        if item[0] == name and item[1] == age and item[2] == sex:
            pass
        else:
            print 'Name = '+name
            print 'Age = '+age
            print 'Sex = '+sex

But obviously, if a result matches one of the sub-lists it cannot match the other two, so the else branch still runs for those and the result is printed anyway. Is there a way to check whether name, age and sex exactly match item[0], item[1] and item[2] of any one sub-list in the list? I have also tried:

if all(item[0] == name and item[1] == age and item[2] == sex for item in List):
    pass

This does not work. I'm assuming that is because the result is not a direct match for every sub-list in the list. If I change all to any, results seem to be skipped whenever any single string matches, i.e. when the age is 27, 32 or 28. I know my regex is poor form and not the ideal way to parse HTML, but it's all I can use confidently at the moment, sorry. Full code below for easier reading.

List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:
    for item in List:
        if item[0] == name and item[1] == age and item[2] == sex:
            pass
        else:
            print 'Name = '+name
            print 'Age = '+age
            print 'Sex = '+sex

Any help would be greatly appreciated. I am still a beginner and have not used forums much, so I apologise in advance if this is not grammatically correct or if I have asked in the wrong way.

When the pattern contains more than one capture group, re.findall returns a list of tuples, so you can simplify the comparison if the items in your list have the same type:

import re

# Changed sub-lists to tuples.
items = [('BOB','27','male'),('SUE','32','female'),('TOM','28','unsure')]

html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''

for item in re.findall('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"', html):
    if item in items:
        name,age,sex = item
        print 'Name =', name
        print 'Age =', age
        print 'Sex =', sex
        print

Output:

Name = BOB
Age = 27
Sex = male

Name = TOM
Age = 28
Sex = unsure

You can also use item not in items if you want the ones that don't match.
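For the question as asked (print only the scraped records that are not already in the list), a minimal sketch in Python 3 syntax could look like the following. The names find_new_records, PATTERN and sample_html are made up for illustration, and the sample text is the same invented data as above. Converting the sub-lists to a set of tuples is optional; it just makes each membership test a hash lookup instead of a scan. Note that under Python 3, requests.get(url).content is bytes, so the str pattern below would need requests.get(url).text instead.

```python
import re

# Pattern taken from the question, written as a raw string.
PATTERN = re.compile(r'Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"')


def find_new_records(html, known_records):
    """Return scraped (name, age, sex) tuples that match no known record exactly."""
    # findall yields tuples, so convert the known sub-lists to tuples as well.
    known = {tuple(record) for record in known_records}
    return [record for record in PATTERN.findall(html) if record not in known]


items = [['BOB', '27', 'male'], ['SUE', '32', 'female'], ['TOM', '28', 'unsure']]

# Invented sample input standing in for the scraped page.
sample_html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''

for name, age, sex in find_new_records(sample_html, items):
    print('Name =', name)
    print('Age =', age)
    print('Sex =', sex)
    print()
```

With this sample, PAT is reported because the name is unknown, and SUE is reported because her age (31) differs from the stored 32, so the whole record is not an exact match.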

First, change the name of the list. List is not a reserved keyword, but a generic name like that is not good practice. My suggestion is to put the scraped data in a list as well (weblist below). If I understood your question correctly, you want to find out whether the scraped record is different from the stored ones. So:

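my_list = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
weblist = ['PAT','19','unsure']  # one scraped record (example value)

# The record is new only if it differs from every sublist in at least one field.
if all(sublist != weblist for sublist in my_list):
    print("List is different")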
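If you would rather keep the nested loops from the question, one option is Python's for/else: the else branch of a for loop runs only when the loop finishes without hitting break. The sketch below uses Python 3 syntax; report_unmatched, records and scraped are hypothetical names, and scraped stands in for the tuples that re.findall would return.

```python
def report_unmatched(scraped, records):
    """Collect scraped (name, age, sex) tuples that match no record exactly."""
    unmatched = []
    for name, age, sex in scraped:
        for item in records:
            if item[0] == name and item[1] == age and item[2] == sex:
                break  # exact match found, stop checking this result
        else:
            # Reached only if the inner loop never hit break,
            # i.e. no single record matched all three fields.
            unmatched.append((name, age, sex))
    return unmatched


records = [['BOB', '27', 'male'], ['SUE', '32', 'female'], ['TOM', '28', 'unsure']]

# Invented sample results standing in for the regex matches.
scraped = [('BOB', '27', 'male'), ('SUE', '31', 'female'), ('BOB', '32', 'unsure')]

for name, age, sex in report_unmatched(scraped, records):
    print('Name =', name)
    print('Age =', age)
    print('Sex =', sex)
```

The last sample tuple mixes fields from three different records; it is still reported, because all three comparisons have to succeed against the same sub-list.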


 