
Compare list w/ sublist

I have 2 lists:

lista = ['1.2.3.4', '2.3.4.5', '3.4.5.6'] # 12,000 IP's
listb = [['1.2.3.4', 'info', 'moreinfo', 'moremoreinfo'], ['2.3.4.5', 'info', 'moreinfo', 'moremoreinfo']] # 1.8m IP's + info

I'm looking for a way to take each IP in lista and, if it exists in listb, get all the info on it.

I've tried looping, but it's incredibly slow:

for listaitem in lista:
    for listbitem in listb:
        if listaitem in listbitem[0]:
            print listbitem

Any suggestions on how to speed this up?

You could turn lista into a set for fast membership testing, then just loop over listb to select any that are found in lista:

lista_set = set(lista)
for item in listb:
    if item[0] in lista_set:
        print item
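For reference, here is a minimal Python 3 sketch of the same set-based filter (the print statements above are Python 2 syntax). The function name find_matches is only illustrative. Note that it compares IPs by exact equality, whereas the substring test `listaitem in listbitem[0]` in the question's loop would also match '1.2.3.4' against '11.2.3.4'.

```python
def find_matches(lista, listb):
    """Return the rows of listb whose first field is an IP found in lista."""
    wanted = set(lista)  # set membership is O(1) on average
    return [row for row in listb if row[0] in wanted]


lista = ['1.2.3.4', '2.3.4.5', '3.4.5.6']
listb = [['1.2.3.4', 'info', 'moreinfo', 'moremoreinfo'],
         ['2.3.4.5', 'info', 'moreinfo', 'moremoreinfo']]

for row in find_matches(lista, listb):
    print(row)
```

This makes a single pass over listb, so the work grows roughly with len(lista) + len(listb) rather than with their product as in the nested loop.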

The next step is turning listb into a dictionary:

listb_dict = {item[0]: item[1:] for item in listb}

Now you can use sets to pick out just the ones that are both in lista_set and listb_dict:

for match in listb_dict.viewkeys() & lista_set:
    print match, listb_dict[match]
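
dict.viewkeys() exists only in Python 2.7. In Python 3, dict.keys() already returns a set-like view, so the same intersection might look like the sketch below. The variable name common and the use of sorted() are illustrative choices, not part of the original answer.

```python
lista = ['1.2.3.4', '2.3.4.5', '3.4.5.6']
listb = [['1.2.3.4', 'info', 'moreinfo', 'moremoreinfo'],
         ['2.3.4.5', 'info', 'moreinfo', 'moremoreinfo']]

lista_set = set(lista)
listb_dict = {item[0]: item[1:] for item in listb}

# In Python 3, keys() returns a set-like view, so & works directly.
common = listb_dict.keys() & lista_set

# Sets are unordered; sorting (as plain strings) just makes the output stable.
for match in sorted(common):
    print(match, listb_dict[match])
```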

lista = ['1.2.3.4', '2.3.4.5', '3.4.5.6']
listb = [['1.2.3.4', 'info', 'moreinfo', 'moremoreinfo'],
         ['2.3.4.5', 'info', 'moreinfo', 'moremoreinfo']]

Turn listb into a dictionary:

dictb = {i[0] : i[1:] for i in listb}

Iterate over lista and look for entries in dictb:

for elem in lista:
    print dictb.get(elem)

['info', 'moreinfo', 'moremoreinfo']
['info', 'moreinfo', 'moremoreinfo']
None

You should convert the data to a format more suitable for searching: a dictionary.

ip_info = {info[0]: info[1:] for info in listb}

Then you can very quickly look up information about a particular IP.

for ip in lista:
    if ip in ip_info:
        print(ip_info[ip])
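
One caveat with the dictionary approach: if listb can contain more than one row for the same IP (an assumption; the question doesn't say either way), the dict comprehension keeps only the last row for that IP. A possible sketch that keeps every row uses collections.defaultdict. The helper name group_by_ip and the sample rows are made up for illustration.

```python
from collections import defaultdict


def group_by_ip(rows):
    """Map each IP (first field) to a list of all its info rows."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[0]].append(row[1:])
    return grouped


# Hypothetical sample data with a repeated IP.
listb = [['1.2.3.4', 'info', 'moreinfo'],
         ['1.2.3.4', 'other', 'details'],
         ['2.3.4.5', 'info', 'moreinfo']]
lista = ['1.2.3.4', '3.4.5.6']

ip_info = group_by_ip(listb)
for ip in lista:
    # .get() avoids creating empty entries for IPs missing from listb.
    for info in ip_info.get(ip, []):
        print(ip, info)
```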

