Python ignoring command to remove items from list

I'm filtering a very large list of dictionaries. kept is the global list; it holds about 9000 dictionaries, and all of them have the same keys. I'm trying to remove every dictionary that has an 'M_P' value greater than -4.5. More than half of them do, so I created a function solely for this purpose. When I check in a later function whether they have all been removed, there are still ~3000 left. Can anybody tell me why that would be happening, and can I trust that these functions will do what I am telling them to do?

def removeMag():

    countMag = 0
    for mag in kept:
        if to_float(mag['M_P']) > -4.5:
            kept.remove(mag)
            countMag += 1
        else:
            continue

    print '\n'
    print ' Number of mags > -4.5 actually removed: '
    print countMag

def remove_anomalies():    
    count = 0
    count08 = 0
    count09 = 0
    count01 = 0
    countMag = 0
    countMagDim = 0
    #Want to remove Q* < 15 degrees
    for row in kept:
        #to_float(kept(row))
        #Q* greater than 15
        if to_float(row['Q*']) < 15.00:
            kept.remove(row)
        elif to_float(row['vel']) > 80.00:
            kept.remove(row)
        elif to_float(row['err']) >= 0.5*to_float(row['vel']):
            kept.remove(row)

        elif row['log_10_m'] == '?':
            kept.remove(row)
            #print row
            count+=1
        elif row['M_P'] == '?':
            kept.remove(row)
            countMag += 1
        elif to_float(row['M_P']) > -4.5:
            countMagDim += 1

Right here is where I'm checking it. ^^^

        elif to_float(row['T_j']) < -50.00 or to_float(row['T_j']) > 50.00:
            kept.remove(row)
            count01 += 1

        #make sure beg height is above end height.
        elif to_float(row['H_beg']) < to_float(row['H_end']):
            kept.remove(row)
        #make sure zenith distance is not greater than 90
        elif to_float(row['eta_p']) > 90.00:
            kept.remove(row)
        #Remove extremities hyperbolic orbits    
        elif (to_float(row['e']) > 2.00 and to_float(row['e']) == 0.00 and to_float(row['a']) == 0.00 and to_float(row['incl']) == 0.00 and to_float(row['omega']) == 0.00 and to_float(row['anode']) == 0.00 and to_float(row['alp_g']) == 0.00 and to_float(row['del_g']) == 0.00 and to_float(row['lam_g']) == 0.00 and to_float(row['bet_g']) == 0.00):
            kept.remove(row)
            count08+=1
        elif to_float(row['q_per']) == 0.00:
            kept.remove(row)
            count09+=1
        elif to_float(row['q_aph']) == 0.00:
            kept.remove(row)
            count09+=1
        else: continue

    print 'Number of dicts with ? as mass value:'
    print count    

    print " Number removed with orbital elements condition: "
    print count08

    print "Number of per or aph equal to 0: "
    print count09

    print "Number of T_j anomalies: "
    print count01

    print "Number of Magnitudes removed from '?': "
    print countMag

The output of the following is about 3000.

    print "Number of Magnitudes to be removed from too dim: "
    print countMagDim   
'''    
    print "\n"
    print "log mass values:"
    for row2 in kept:
        print row2['log_10_mass']
    print "\n"
'''

When you iterate with a for loop, Python doesn't make a copy of the list; it iterates over the list directly and tracks its position by index. When you remove an element, every element after it shifts left by one, so the loop skips the element that has just moved into the current position.

Example:

>>> l = [1,2,3,4,5]
>>> for i in l: l.remove(i)
>>> l
[2, 4]
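
To see why every other element survives, the same loop can record which elements it actually visits. This is only an illustrative sketch; the names items and visited are made up for the example.

```python
items = [1, 2, 3, 4, 5]
visited = []

# The for loop keeps an internal index into the list. After each remove(),
# the remaining elements shift left by one, so the element that slides into
# the current position is never visited.
for value in items:
    visited.append(value)
    items.remove(value)

print(visited)  # [1, 3, 5]
print(items)    # [2, 4]
```

The loop only ever sees 1, 3 and 5, which is why 2 and 4 are left behind.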

You can use a full slice, l[:], as shorthand to make a copy of the list before iterating, for example:

>>> for i in l[:]: l.remove(i)
>>> l
[]
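
Applied to the question's loop, the copy trick might look like the sketch below. The to_float helper is not shown in the question, so a plain float() wrapper is assumed here, and the sample rows are invented.

```python
def to_float(value):
    # Hypothetical stand-in for the question's helper, which is not shown.
    return float(value)

# Made-up sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-3.0'}, {'M_P': '-2.0'}, {'M_P': '-6.1'}, {'M_P': '-1.2'}]

count_mag = 0
for mag in kept[:]:          # iterate over a shallow copy
    if to_float(mag['M_P']) > -4.5:
        kept.remove(mag)     # mutate the original list
        count_mag += 1

print(count_mag)  # 3
print(kept)       # [{'M_P': '-6.1'}]
```

Each remove() call searches the list from the start, so this stays simple but does more work than building a new list in a single pass.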

As others have said, you are modifying a list while iterating over it.

The simple one-liner for this would be

kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]

This keeps only the entries you are interested in and replaces the original list.

Calculating how many were removed is simply a matter of taking len(kept) before and after the comprehension and taking the difference.
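
A minimal sketch of that count, assuming to_float behaves like float() (the real helper is not shown in the question) and using invented sample rows:

```python
def to_float(value):
    # Hypothetical stand-in for the question's helper, which is not shown.
    return float(value)

# Made-up sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-6.1'}, {'M_P': '-3.0'}, {'M_P': '-4.5'}, {'M_P': '-1.2'}]

before = len(kept)
kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]
removed = before - len(kept)

print(removed)  # 2
```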

Alternatively,

discarded = [mag for mag in kept if to_float(mag['M_P']) > -4.5]
kept = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]

This splits the list without losing any information.
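
If evaluating the condition twice per row is a concern, the same split can be done in one pass with an explicit loop. This is a sketch rather than part of the original answer; to_float is assumed to behave like float(), and still_kept is a name invented for the example.

```python
def to_float(value):
    # Hypothetical stand-in for the question's helper, which is not shown.
    return float(value)

# Made-up sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-6.1'}, {'M_P': '-3.0'}, {'M_P': '-4.5'}, {'M_P': '-1.2'}]

still_kept = []
discarded = []
for mag in kept:
    # kept itself is only read inside the loop, never modified.
    if to_float(mag['M_P']) > -4.5:
        discarded.append(mag)
    else:
        still_kept.append(mag)
kept = still_kept

print(len(kept))       # 2
print(len(discarded))  # 2
```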

You should never modify the sequence you are iterating over in a for loop. Looking just at your first function:

def removeMag():

    countMag = 0
    for mag in kept:
        if to_float(mag['M_P']) > -4.5:
            kept.remove(mag)
            countMag += 1

You are calling remove on kept inside the loop that iterates over it. This does not raise an error, but the iteration skips elements, so some items that should be removed are left behind. See this question.

A simple way to solve this is to use a new list for the items to keep:

mag_to_keep = []
for mag in kept:
    if to_float(mag['M_P']) <= -4.5:
        mag_to_keep.append(mag)

kept = mag_to_keep
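
Because the question's functions work on a global kept, a plain kept = ... inside a function would only bind a local name unless kept is declared global. One way around that is slice assignment, sketched below. The function name remove_mag and the float()-based to_float are assumptions made for the example.

```python
def to_float(value):
    # Hypothetical stand-in for the question's helper, which is not shown.
    return float(value)

# Made-up sample rows; only the 'M_P' key matters here.
kept = [{'M_P': '-6.1'}, {'M_P': '-3.0'}, {'M_P': '-4.5'}, {'M_P': '-1.2'}]

def remove_mag():
    before = len(kept)
    # Slice assignment replaces the contents of the existing list object,
    # so no global statement is needed and every other reference to the
    # list sees the filtered result.
    kept[:] = [mag for mag in kept if to_float(mag['M_P']) <= -4.5]
    return before - len(kept)

removed = remove_mag()
print(removed)  # 2
print(kept)     # [{'M_P': '-6.1'}, {'M_P': '-4.5'}]
```

The comprehension on the right-hand side is built completely before the assignment happens, so the list is never modified while it is being iterated.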
