As the title says, I am trying to get an exact match against any one of the lists of strings inside a list. I'm finding it hard to explain, so I'll show the code.
List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
This is an example of the list's layout. I then want to pass information from a web scrape through it to see whether a scraped record matches item[0], item[1] and item[2] of any entry in the list. The problem I am having is that the web scrape uses a for loop:
HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:
The next part also uses a for loop:
for item in List:
    if item[0] == name and item[1] == age and item[2] == sex:
        pass
    else:
        print 'Name = '+name
        print 'Age = '+age
        print 'Sex = '+sex
But obviously, if the result matches one of the sublists, it cannot match the other two, so the else branch still runs for those. Is there a way to check whether name, age and sex exactly match item[0], item[1] and item[2] of any one sublist in the list? I have also tried:
if all(item[0] == name and item[1] == age and item[2] == sex for item in List):
    pass
This does not work; I'm assuming it's because the result is not a direct match for every sublist in the list. If I change all to any, I get results that are skipped when any single string matches, i.e. when the age is 27, 32 or 28. I know my regex is poor form and not the ideal way to parse HTML, but it's all I can use confidently at the moment, sorry. Full code below for easier reading.
List = [['BOB','27','male'],['SUE','32','female'],['TOM','28','unsure']]
HTML = requests.get(url).content
match = re.compile('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"').findall(HTML)
for name,age,sex in match:
    for item in List:
        if item[0] == name and item[1] == age and item[2] == sex:
            pass
        else:
            print 'Name = '+name
            print 'Age = '+age
            print 'Sex = '+sex
Any help would be greatly appreciated. I am still a beginner and have not used forums much, so I apologise in advance if this is not grammatically correct or if I have asked in the wrong way.
re.findall returns a list of tuples when the pattern has more than one group, so you can simplify the comparison if the items in your list match the return type:
import re
# Changed sub-lists to tuples.
items = [('BOB','27','male'),('SUE','32','female'),('TOM','28','unsure')]
html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''
for item in re.findall('Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"', html):
    if item in items:
        name,age,sex = item
        print 'Name =', name
        print 'Age =', age
        print 'Sex =', sex
        print
Output:
Name = BOB
Age = 27
Sex = male
Name = TOM
Age = 28
Sex = unsure
You can also use item not in items if you want the ones that don't match.
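As a minimal sketch of the item not in items variant, the example below keeps only the scraped records that match no known record exactly. It assumes the same sample text as above in place of a real page. The names known, pattern and unmatched are introduced here for illustration, and the known records are stored in a set (an assumption, not something the question requires) so that each lookup does not scan the whole collection. It is written to run on Python 3. Note that on Python 3, requests returns bytes from .content, so .text would be the attribute to feed to a str pattern.

```python
import re

# Known records stored as a set of tuples for exact-match lookups.
known = {('BOB', '27', 'male'), ('SUE', '32', 'female'), ('TOM', '28', 'unsure')}

# Sample text standing in for the scraped page.
html = '''\
Name"BOB" Age"27" Sex"male"
Name"PAT" Age"19" Sex"unsure"
Name"SUE" Age"31" Sex"female"
Name"TOM" Age"28" Sex"unsure"
'''

pattern = re.compile(r'Name"(.+?)".+?Age"(.+?)".+?Sex"(.+?)"')

# findall yields (name, age, sex) tuples; keep those not present in known.
unmatched = [record for record in pattern.findall(html) if record not in known]

for name, age, sex in unmatched:
    print('Name = ' + name)
    print('Age = ' + age)
    print('Sex = ' + sex)
    print('')
```

Here SUE is reported as unmatched because her scraped age (31) differs from the stored one (32), even though the name and sex agree.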
First, change the name of the list. List is not a reserved keyword, but it is not good practice to use such generic names. My suggestion is to put the scraped data in a list as well. If I understood your question correctly, you want to detect the case where every field is different. So:
for sublist in my_list:
    if (sublist[0] != weblist[0]) and (sublist[1] != weblist[1]) and (sublist[2] != weblist[2]):
        print("List is different")
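The loop above tests one sublist at a time, so a scraped record that matches BOB exactly is still different from SUE and TOM and gets reported. A possible alternative, sketched under the assumption that "different" means "equal to none of the stored records", is to check the whole list first and report once. The helper is_known and the sample weblists are hypothetical names made up for this illustration, and the code targets Python 3.

```python
def is_known(record, known_records):
    """Return True if record equals some stored record in every field."""
    return any(tuple(item) == tuple(record) for item in known_records)


my_list = [['BOB', '27', 'male'], ['SUE', '32', 'female'], ['TOM', '28', 'unsure']]

# Hypothetical scraped records: one exact match, one that only shares an age.
weblists = [['BOB', '27', 'male'], ['PAT', '27', 'unsure']]

# Collect the records that match no stored record exactly.
different = [weblist for weblist in weblists if not is_known(weblist, my_list)]

for weblist in different:
    print("List is different:", weblist)
```

Comparing whole tuples means a record counts as known only when name, age and sex all agree with the same stored entry; sharing a single field, such as the age 27, is not enough.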