
How do I extract an item in a list if it has a certain string in python?

I have a list with many items in each entry, delimited by []. For example:

['1', 'pbkdf2_sha256$100000$sk3ONL23432fsdgUsHM62xa9XJHL+LkJHhK3cFGj8LYWGtOd8HC7Hs=',
'2018-09-25 19:32:41', '0', '', '', 'bob@trellis.law', 'Bob', 'Simon', 
'bob@trellis.law', '1', '0', '2016-12-30 17:43:41', 'Bob Simon', 'Bob', '0', '1', 
'', '[]', '', '0', '1', '0', '1', '', '', '1', '14', '191', '1', '0', '1', '0', '', 
'', '', '0']

I want to find entries that contain this regex, and then capture the entire row in a variable:

r = re.compile(r'\w+\+\d+@trellis\.law')

I have unsuccessfully tried:

def import_csv(csv_file):
    name_entries = []
    with open(csv_file) as csvfile:
        reader = csv.reader(csvfile)
        name_entries.append(list(reader))
    return name_entries


def exclude_regex_users(name_entries):
    pulled_names = []
    r = re.compile(r'\w+\+\d+@trellis\.law')

    reader = csv.reader(name_entries)

    for read in reader:
        n = r.match(read)
        if n:
            pulled_names.append(n.group())

    print(pulled_names)

I get a _csv.Error: iterator should return strings, not list (did you open the file in text mode?).

Argh.

First, import_csv should not wrap the list in another list; it should just return the list of rows.

import csv

def import_csv(csv_file):
    with open(csv_file, newline='') as csvfile:
        reader = csv.reader(csvfile)
        # Return the rows directly: one list of fields per row
        return list(reader)

Second, exclude_regex_users doesn't need to use csv. That was already used when the data was imported, and name_entries is already the list of rows.

Third, you should only be matching against the list elements that contain the email address (indexes 6 and 9 in the sample row), not against the whole row.
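As a quick check of what the pattern accepts, here is a minimal sketch that runs it against single fields. The address bob+1@trellis.law is a made-up example, not taken from the question's data; bob@trellis.law is the value found at indexes 6 and 9 of the sample row.

```python
import re

r = re.compile(r'\w+\+\d+@trellis\.law')

# Hypothetical address with a "+digits" tag: the pattern matches.
tagged = r.match('bob+1@trellis.law')

# Address from the sample row: there is no "+digits" part, so no match.
plain = r.match('bob@trellis.law')

# Passing a whole row (a list) instead of one field raises TypeError,
# which is why the regex has to be applied to individual elements.
try:
    r.match(['1', 'bob+1@trellis.law'])
    row_error = None
except TypeError as exc:
    row_error = exc

print(bool(tagged), bool(plain), type(row_error).__name__)
```

Note that the sample row shown in the question would not be selected, since neither of its email fields carries a "+digits" tag.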

You can use filter() to filter the list, rather than a loop.

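import re

def exclude_regex_users(name_entries):
    r = re.compile(r'\w+\+\d+@trellis\.law')
    # filter() returns a lazy iterator in Python 3, so convert it to a list
    pulled_names = list(filter(lambda row: r.match(row[6]) or r.match(row[9]), name_entries))

    print(pulled_names)
    return pulled_names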
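If the position of the email column is not known in advance, one possible variation is to scan every field of each row with a list comprehension. This is only a sketch under that assumption, not what the answer above does: the helper name rows_with_tagged_email and the two-row sample data are made up for illustration, and search() is used instead of match() so the pattern may appear anywhere inside a field.

```python
import csv
import io
import re

EMAIL_RE = re.compile(r'\w+\+\d+@trellis\.law')

def rows_with_tagged_email(rows):
    """Return every row in which at least one field contains the pattern."""
    return [row for row in rows if any(EMAIL_RE.search(field) for field in row)]

# Made-up sample data standing in for the real CSV file.
sample = io.StringIO(
    "1,bob@trellis.law,Bob\n"
    "2,amy+42@trellis.law,Amy\n"
)
rows = list(csv.reader(sample))   # a list of rows, each a list of strings
matches = rows_with_tagged_email(rows)
print(matches)
```

Checking fixed indexes, as in the filter() version, is stricter and cheaper; scanning all fields trades that for not depending on the column layout.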


 