Using the filter function in Python

Question

I am trying to use Python's built-in filter function to extract data from certain columns in a CSV. Is this a good use of the filter function? Would I have to define the data in these columns first, or would Python somehow already know which columns contain what data?

Answer 1

Since Python boasts "batteries included", for most everyday situations someone has probably already provided a solution. CSV is one of them: there is a built-in csv module.

Also, tablib is a very good third-party module, especially if you're dealing with non-ASCII data.

For the behaviour you described in the comment, this will do:

import csv
with open('some.csv', newline='') as f:  # text mode for Python 3's csv module
    reader = csv.reader(f)
    for row in reader:
        row.pop(1)  # drop the second column
        print(", ".join(row))
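As a self-contained sketch of the same idea, the snippet below reads from an in-memory string instead of some.csv and builds a new row rather than mutating it with pop. The sample data and the helper name drop_column are made up for illustration.

```python
import csv
import io

# Hypothetical sample data standing in for some.csv.
SAMPLE = "name,age,city\nAlice,30,Paris\nBob,25,Berlin\n"

def drop_column(rows, index):
    """Yield each row without the field at position `index`."""
    for row in rows:
        yield row[:index] + row[index + 1:]

reader = csv.reader(io.StringIO(SAMPLE))
result = [", ".join(row) for row in drop_column(reader, 1)]
for line in result:
    print(line)
```

Slicing leaves the original row untouched, which may matter if the same row is needed again later.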

Answer 2

The filter function is intended to select from a list (or, in general, any iterable) those elements which satisfy a certain condition. It's not really intended for index-based selection. So although you could use it to pick out specified columns of a CSV file, I wouldn't recommend it. Instead you should probably use something like this:

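import csv
with open(filename, newline='') as f:  # text mode for Python 3's csv module
    for record in csv.reader(f):
        # record is a list of strings; pick the first and third columns
        do_something_with(record[0], record[2])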

Depending on what exactly you are doing with the records, it may be better to create an iterator over the columns of interest:

with open(filename, newline='') as f:
    # lazy generator: consume it while the file is still open
    the_iterator = ((record[0], record[2]) for record in csv.reader(f))
    # do something with the iterator

or, if you need non-sequential processing, perhaps a list:

with open(filename, newline='') as f:
    the_list = [(record[0], record[2]) for record in csv.reader(f)]
    # do something with the list
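To make the point about index-based selection concrete, here is a small comparison on made-up in-memory data. The names SAMPLE and WANTED are hypothetical, and operator.itemgetter is a standard-library alternative that the answer itself does not mention.

```python
import csv
import io
from operator import itemgetter

# Hypothetical sample data and the column indexes to keep.
SAMPLE = "a,b,c\n1,2,3\n4,5,6\n"
WANTED = {0, 2}

rows = list(csv.reader(io.StringIO(SAMPLE)))

# filter only sees values, so enumerate has to supply the index,
# and the index then has to be stripped off again.
with_filter = [
    [value for _, value in filter(lambda pair: pair[0] in WANTED, enumerate(row))]
    for row in rows
]

# Direct indexing says the same thing far more plainly.
with_indexing = [(row[0], row[2]) for row in rows]

# itemgetter builds a reusable "pick these positions" function.
pick = itemgetter(0, 2)
with_itemgetter = [pick(row) for row in rows]
```

All three produce the same columns; the filter version just takes more machinery to get there.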

I'm not sure what you mean by defining the data in the columns. The data are defined by the CSV file.
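One way to read "defined by the CSV file": if the file has a header row, csv.DictReader uses that row to name the columns, so nothing needs to be declared up front. A minimal sketch, assuming a header row and made-up sample data:

```python
import csv
import io

# Hypothetical sample data; the first line is the header row.
SAMPLE = "name,age,city\nAlice,30,Paris\nBob,25,Berlin\n"

reader = csv.DictReader(io.StringIO(SAMPLE))

# Each row is a mapping from header name to field value.
pairs = [(row["name"], row["city"]) for row in reader]

print(reader.fieldnames)
print(pairs)
```

Note that the csv module does not infer types: every field, including age here, comes back as a string and has to be converted explicitly.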

By comparison, here's a case in which you would want to use filter: suppose your CSV file contains numeric data, and you need to build a list of the records in which the numbers are in strictly increasing order within the row. You could write a function to determine whether a list of numbers is in strictly increasing order:

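from itertools import pairwise  # Python 3.10+; older versions can copy the itertools recipe
def strictly_increasing(fields):
    # fields are strings from csv.reader, so convert before comparing
    return all(int(i) < int(j) for i, j in pairwise(fields))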

(see the itertools documentation for a definition of pairwise). Then you can use this as the condition in filter:

with open(filename, newline='') as f:
    # in Python 3 filter is lazy, so build the list while the file is open
    the_list = list(filter(strictly_increasing, csv.reader(f)))
    # do something with the list

Of course, the same thing could, and usually would, be implemented as a list comprehension:

with open(filename, newline='') as f:
    the_list = [record for record in csv.reader(f) if strictly_increasing(record)]
    # do something with the list

so there's little reason to use filter in practice.
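For anyone who wants to run the two versions side by side, here is a sketch on made-up in-memory data. It defines pairwise locally, following the recipe in the itertools documentation, so it does not depend on a particular Python 3 version; SAMPLE is hypothetical.

```python
import csv
import io
from itertools import tee

def pairwise(iterable):
    """s -> (s0, s1), (s1, s2), ... (the itertools recipe)."""
    a, b = tee(iterable)
    next(b, None)
    return zip(a, b)

def strictly_increasing(fields):
    # Fields arrive as strings, so convert before comparing.
    return all(int(i) < int(j) for i, j in pairwise(fields))

# Hypothetical numeric CSV data: only the first and last rows qualify.
SAMPLE = "1,2,3\n3,2,1\n1,1,2\n5,8,13\n"

via_filter = list(filter(strictly_increasing, csv.reader(io.StringIO(SAMPLE))))
via_comprehension = [
    record
    for record in csv.reader(io.StringIO(SAMPLE))
    if strictly_increasing(record)
]

print(via_filter)
```

Both approaches keep whole records that pass the test, which is the kind of job filter is meant for, as opposed to picking columns by position.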

Answer 1
7 2011-11-28 05:02:36

Answer 2
2 Accepted 2011-11-28 10:05:09
