
Removing items from list - Python

I am scraping data from a local jail site and trying to remove every element from a list except the charges. I want the statutes, bond amounts, etc. gone.

Here is what I have tried:

charges = [[], ['13A-12-214.1'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"], ['SALVIA MISD POSS'], [''], ['M'], ['$1000.00'], [], [], ['13A-10-41'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"], ['RESISTING ARREST'], [''], ['M'], ['$1000.00'], [], [], ['32.5A.88'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"], ['IMPROPER LANE USAGE'], [''], ['U'], ['$500.00'], [], [], [''], [''], ['DET FOR COMM CORR'], [''], ['U'], ['$0.00'], [], [], ['<tr>\\r\\n\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t        <td class="SearchHeader" colspan="2">']]

for string in charges:
    if string == arrestedBy:
        charges.remove(string)
    elif string.isalpha() == False:
        charges.remove(string)
    elif len(string) < 2:
        charges.remove(string)

if charges[-1] == '<tr>\\r\\n\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t\\t        <td class="SearchHeader" colspan="2">':
    charges.remove(charges[-1])

charges = filter(None, charges)

charges = str(charges)

What I get instead is:

"ECSO (ETOWAH COUNTY SHERIFF\\S OFFICE)", $1000.00, "ECSO (ETOWAH COUNTY SHERIFF\\S OFFICE)", $1000.00, "ECSO (ETOWAH COUNTY SHERIFF\\S OFFICE)", $500.00, $0.00

What I want is:

"SALVIA MISD POSS, RESISTING ARREST, IMPROPER LANE USAGE, DET FOR COMM CORR"

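If you can't limit what you scrape to just the charges, consider using a Python list comprehension rather than iterating over the list and deleting elements as you go. Removing items from a list while looping over it is inadvisable because the remaining elements shift position and the loop can skip some of them.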
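To see why deleting during iteration is risky, here is a small sketch on made-up data (the list `items` is purely illustrative and not taken from the question). The loop is meant to drop every empty string, but one survives because the list shrinks underneath the iterator:

```python
# Illustrative data only: the goal is to drop every empty string.
items = ['', '', 'A', '', 'B']

# Removing while iterating: each removal shifts the remaining elements left,
# so the loop's internal index runs past the end before visiting everything.
mutated = list(items)
for item in mutated:
    if item == '':
        mutated.remove(item)

# Building a new list instead visits every element exactly once.
filtered = [item for item in items if item != '']

print(mutated)   # ['A', '', 'B']  (one empty string was skipped)
print(filtered)  # ['A', 'B']
```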

For example, if you define a function is_charge that holds your logic for deciding what counts as a charge and returns a boolean:

charges = [i for i in charges if is_charge(i)]
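The answer leaves is_charge undefined, so the following is only one possible sketch. It assumes, based solely on the sample data in the question, that a charge is the one cell made up entirely of letters and spaces and longer than one character; real charge descriptions containing digits or punctuation would need a different rule. Note that every element of the scraped list is itself a list holding zero or one string, so the check looks at item[0]. The HTML tail is abbreviated here.

```python
# Sample data from the question (the trailing HTML fragment is shortened).
charges = [
    [], ['13A-12-214.1'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"],
    ['SALVIA MISD POSS'], [''], ['M'], ['$1000.00'], [],
    [], ['13A-10-41'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"],
    ['RESISTING ARREST'], [''], ['M'], ['$1000.00'], [],
    [], ['32.5A.88'], ["ECSO (ETOWAH COUNTY SHERIFF\\'S OFFICE)"],
    ['IMPROPER LANE USAGE'], [''], ['U'], ['$500.00'], [],
    [], [''], [''], ['DET FOR COMM CORR'], [''], ['U'], ['$0.00'], [],
    ['<tr>\\r\\n<td class="SearchHeader" colspan="2">'],
]


def is_charge(item):
    """Hypothetical rule: letters and spaces only, more than one character."""
    # Each item is a list with zero or one string; treat [] like ''.
    text = item[0].strip() if item else ''
    return len(text) > 1 and all(ch.isalpha() or ch.isspace() for ch in text)


# Keep only the charge cells, then unwrap each one-element list.
charge_names = [item[0] for item in charges if is_charge(item)]
result = ', '.join(charge_names)
print(result)
# SALVIA MISD POSS, RESISTING ARREST, IMPROPER LANE USAGE, DET FOR COMM CORR
```

Statutes fail the test because of their digits, bond amounts because of the dollar sign, the agency name because of its parentheses and backslash, and the single-letter class codes because of the length check. If the table layout is stable, picking the cell by its position in each row may be more robust than testing its contents.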


 