In Python, how do I find the location of information in a CSV file?

Question

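I have three very long CSV files and need some advice/help with my code. Basically, I want the program to be general enough that I can add any constraints and it will still run correctly.

For example, if I set the code up to find where column 1 == x and column 2 == y, I want it to also work for a case such as column 1 != r combined with a condition on column 2.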

import csv
file = input('csv files: ').split(',')
filters = input('Enter the filters: ').split(',')
f = open(csv_file,'r')
p=csv.reader(f)
header_eliminator = next(p,[])

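I am having trouble with the "file" part, because if I choose to use only one file instead of the three I am using now, it does not work. The same goes for the filters. A filter might look like 4 == 10,5 >= 4

This means column 4 of the file equals 10 and column 5 of the file is greater than or equal to 4. However, I may also want the filters to look like this: 1 == 4.333, 5 == "6/1/2014 0:00:00", 6 <= 60.0, 7 != 6

So I want to be able to reuse it for other purposes! I am having a lot of trouble with this. Do you have any suggestions on how to get started? Thanks!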

Answer 1

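Pandas is great for working with CSV files. I suggest installing it: pip install pandas

Then, to read in your three CSV files and run checks on their columns, you only need to get familiar with pandas indexing. The only method you need to know for now is .iloc, since it seems you are indexing columns by their integer position.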

import pandas as pd

files = input('Enter the csv files: ').split(',')
data = []
# Keeping a list of DataFrames lets us accept any number of files.
# Each file is read into a pandas DataFrame and stored as one element of the
# list, so the length of the list is the number of files.
for name in files:
    data.append(pd.read_csv(name.strip()))

# You can then run checks like this one, which tests whether every value in
# the column at integer position 2 equals 3 in all of the files.
print(all((df.iloc[:, 2] == 3).all() for df in data))
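
To find where several column conditions hold at once, comparisons on .iloc columns can be combined into a boolean mask. The following is a minimal sketch, assuming pandas is installed; the in-memory sample data and the names `sample`, `mask`, `matching_positions`, and `matching_rows` are hypothetical and not part of the original answer.

```python
import io

import pandas as pd

# Hypothetical in-memory CSV standing in for one of the real files.
sample = io.StringIO(
    "a,b,c\n"
    "10,4,x\n"
    "10,3,y\n"
    "7,9,z\n"
)
df = pd.read_csv(sample)

# Each comparison yields a boolean Series with one entry per row.
# Combine conditions with & (and) or | (or); the parentheses are required.
mask = (df.iloc[:, 0] == 10) & (df.iloc[:, 1] >= 4)

# Row positions where both conditions hold, and the matching rows themselves.
matching_positions = [i for i, keep in enumerate(mask) if keep]
matching_rows = df[mask]

print(matching_positions)
print(matching_rows)
```

Note that .iloc positions start at 0, so column 1 in the question's numbering would be `df.iloc[:, 0]` here.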

Answer 2

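You can write a generator that takes a bunch of filenames and yields their lines one by one, and feed that to a csv.reader. The tricky part is the filter. If you let the filter be a single line of Python code, you can use eval for that part. As an example: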

import csv

#filenames = input('csv files: ').split(',')
#filters = input('Enter the filters: ').split(',')

# todo: for debug
# in this implementation, filters is a single python expression that can
# reference the 'col' variable which is a list of the current columns
filenames = 'a.csv,b.csv,c.csv'
filters = '"a" in col[0] and "2" in col[2]'

# todo: debug generate test files
for name in 'abc':
    with open('{}.csv'.format(name), 'w') as fp:
        fp.write('the header row\n')
        for row in range(3):
            fp.write(','.join('{}{}{}'.format(name, row, col) for col in range(3)) + '\n')

def header_squash(filenames):
    """Iterate multiple files line by line after squashing header line
    and any empty lines.
    """
    for filename in filenames:
        with open(filename) as fp:
            next(fp)
            for line in fp:
                if line.strip():
                    yield line

for col in csv.reader(header_squash(filenames.split(','))):
    # Passing a namespace controls which names the expression can see,
    # but eval is still not safe to run on untrusted input.
    if eval(filters, { 'col':col }):
        # passed the filter, do the work
        print(col)
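
If the filters should keep the `column operator value` form from the question, they can also be parsed without eval. The following is a rough sketch under stated assumptions, not the answerer's method: columns are numbered from 1, filter values contain no commas, and the helper names `coerce`, `parse_filters`, and `row_matches` are hypothetical.

```python
import csv
import io
import operator
import re

# Comparison operators allowed in a filter such as "4==10" or "6<=60.0".
OPS = {
    "==": operator.eq, "!=": operator.ne,
    ">=": operator.ge, "<=": operator.le,
    ">": operator.gt, "<": operator.lt,
}
# Two-character operators come first so ">=" is not read as ">".
FILTER_RE = re.compile(r"\s*(\d+)\s*(==|!=|>=|<=|>|<)\s*(.+?)\s*$")


def coerce(text):
    """Return a float when text looks numeric, else the text without quotes."""
    try:
        return float(text)
    except ValueError:
        return text.strip().strip('"')


def parse_filters(spec):
    """Turn '1==10,2>=4' into a list of (column_index, op_func, value)."""
    parsed = []
    # Assumption: filter values never contain a comma.
    for part in spec.split(","):
        match = FILTER_RE.match(part)
        if match is None:
            raise ValueError("cannot parse filter: {!r}".format(part))
        col, op, value = match.groups()
        # Assumption: columns are numbered from 1, as in the question.
        parsed.append((int(col) - 1, OPS[op], coerce(value)))
    return parsed


def row_matches(row, filters):
    """Return True when the row satisfies every filter."""
    for index, op, wanted in filters:
        try:
            if not op(coerce(row[index]), wanted):
                return False
        except TypeError:
            # For example, ordering text against a number: treat as no match.
            return False
    return True


# Hypothetical sample data standing in for a real file.
sample = io.StringIO(
    "id,score,when\n"
    "10,4.5,6/1/2014 0:00:00\n"
    "10,3.0,6/2/2014 0:00:00\n"
    "7,9.0,6/1/2014 0:00:00\n"
)
reader = csv.reader(sample)
next(reader, None)  # skip the header row
filters = parse_filters('1 == 10, 2 >= 4, 3 == "6/1/2014 0:00:00"')
# Line numbers start at 2 because line 1 is the header.
hits = [(line_no, row) for line_no, row in enumerate(reader, start=2)
        if row_matches(row, filters)]
print(hits)
```

Restricting the operators to a fixed table means a filter string can only select a column, a comparison, and a literal value, which avoids running arbitrary code from user input.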

2 solutions

Solution 1
0 2015-12-06 05:34:00

Solution 2
0 2015-12-06 06:14:49
