Python: read a CSV file and filter data

I apologize if this has been answered before, but I checked a bunch of posts and cannot understand what is wrong with my code. I'm trying to read a CSV file in Python (see below) and filter rows by the value in the second column (angle). Then I want to create a new output file with the filtered time and angle values. I only get an output file with the headers written in.

CSV file:

time,angle
0,56
1,89
2,112
3,189
4,122
5,123

Code:

import csv

#define the min and max value of angle
alpha_min = 110
alpha_max = 125

#read csv file and loop through with a filter
with open('test_csv.csv', 'r') as input_file:
    csv_reader = csv.reader(input_file)#, delimiter=',')
    #header = next(input_file).strip("\n").split(",")
    results = filter(lambda row: alpha_min<row[1]<alpha_max, csv_reader)

#create output file
with open('test_output_csv.csv', "w") as output_file:
    csv_writer = csv.writer(output_file, delimiter=',')
    csv_writer.writerow(header)
    for result in results:
        csv_writer.writerow(result)

I would suggest using the pandas library for this workflow, which is usually faster than looping through each line of your CSV file in Python, especially for large files. Something like the below:

import pandas as pd

#define the min and max value of angle
alpha_min = 110
alpha_max = 125

# read input and filter angle data
df = pd.read_csv('test_csv.csv')
df = df[(df['angle'] < alpha_max) & (df['angle'] > alpha_min)]

# write output (index=False omits the DataFrame index column)
df.to_csv('output.csv', index=False)

You can do:

import csv

#define the min and max value of angle
alpha_min = 110
alpha_max = 125

#read csv file and loop through with a filter
with open('test_csv.csv', 'r') as input_file:
    csv_reader = csv.reader(input_file)#, delimiter=',')
    lines = [i for i in csv_reader]
    header = lines[0]
    results = filter(lambda row: alpha_min<int(row[1])<alpha_max, lines[1:])

#create output file
with open('test_output_csv.csv', "w", newline='') as output_file:
    csv_writer = csv.writer(output_file, delimiter=',')
    csv_writer.writerow(header) 
    csv_writer.writerows(results)

That will save the following to the file:

time,angle
2,112
4,122
5,123

The fields of a csv row are strings, so you need int(row[1]) for the comparison to work correctly. I also recommend a list comprehension for the filtering, or pandas for speed. next(csv_reader) will read one row to capture the headers as well.
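
To make the string issue concrete, here is a minimal sketch that reads the question's sample data from an in-memory string (io.StringIO is a stand-in for the real file, and the variable names are only illustrative). It shows that next() consumes the header row, that each field arrives as a str, and that comparing such a field with an int bound raises TypeError in Python 3 unless it is converted first.

```python
import csv
import io

# In-memory stand-in for the question's CSV file (same sample data).
sample = "time,angle\n0,56\n1,89\n2,112\n3,189\n4,122\n5,123\n"

reader = csv.reader(io.StringIO(sample))
header = next(reader)  # consumes the header row
first = next(reader)   # first data row; both fields are str

# Comparing a str field with an int bound fails in Python 3.
try:
    110 < first[1] < 125
    comparison_error = None
except TypeError as exc:
    comparison_error = exc

# Converting the field first makes the range check numeric.
in_range = 110 < int(first[1]) < 125

print(header, first, type(comparison_error).__name__, in_range)
```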

Note: use newline='' with the csv module, as documented, to avoid blank lines between rows.

import csv

alpha_min = 110
alpha_max = 125

with open('test.csv','r',newline='') as input_file:
    csv_reader = csv.reader(input_file)
    header = next(csv_reader)
    results = [row for row in csv_reader if alpha_min < int(row[1]) < alpha_max]

with open('output.csv','w',newline='') as output_file:
    csv_writer = csv.writer(output_file)
    csv_writer.writerow(header)
    csv_writer.writerows(results)
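
If you would rather refer to the column by name than by position, a variant along these lines may help. It is only a sketch: filter_rows is a hypothetical helper name, not part of the csv module, and io.StringIO stands in for the real input and output files so the example runs on its own. It uses csv.DictReader and csv.DictWriter from the standard library.

```python
import csv
import io

def filter_rows(source, column, low, high):
    """Hypothetical helper: keep rows where low < int(row[column]) < high."""
    reader = csv.DictReader(source)
    rows = [row for row in reader if low < int(row[column]) < high]
    return reader.fieldnames, rows

# In-memory stand-in for the question's input file.
sample = io.StringIO("time,angle\n0,56\n1,89\n2,112\n3,189\n4,122\n5,123\n")
fieldnames, rows = filter_rows(sample, "angle", 110, 125)

# In-memory stand-in for the output file; lineterminator keeps plain "\n".
out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
result = out.getvalue()
print(result)
```

With real files, you would pass the objects returned by open(..., newline='') in place of the io.StringIO instances.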
