How to search CSV file with multiple search criteria and print row?
I have a .csv file with about 1000 rows which looks like:
id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,cbegwell0@google.ca,Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,ncheatle1@msn.com,Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,lhamlen2@dot.gov,Female,52.165.62.174,24/04/1990
4,Gawen,Gillfillan,ggillfillan3@hp.com,Male,83.249.190.232,31/10/1984
5,Syd,Gilfether,sgilfether4@china.com.cn,Male,180.153.199.106,11/07/1995
The code I have so far asks for input, then goes over each row and prints the row if it contains the input. It looks like this:
import csv
# Asks for search criteria from user
search = input("Enter search criteria:\n")
# Opens csv data file
file = csv.reader(open("MOCK_DATA.csv"))
# Go over each row and print it if it contains user input.
for row in file:
    if search in row:
        print(row)
What I want as the end result, and what I'm stuck on, is to be able to enter more than one search criterion, separated by a ",", and have it search for and print the matching rows. Kind of like a way to filter the list.
For example, if there were multiple "David"s in the file who are "Male", I could enter: David, Male
It would then print all the rows that match, but ignore those with a "David" who is "Female".
You can split the input on the comma, then check that each field from the input is present on a given line using all() and a list comprehension.
This example uses a simplistic splitting of the input, and doesn't care which field each input matches. If you want to match only specific columns, look into using csv.DictReader instead of csv.reader.
import csv
# Asks for search criteria from user; strip spaces so "David, Male" also works
search_parts = [part.strip() for part in input("Enter search criteria:\n").split(",")]
# Opens csv data file (closed automatically when the block ends)
with open("MOCK_DATA.csv", newline="") as f:
    # Go over each row and print it if it contains every search term.
    for row in csv.reader(f):
        if all([x in row for x in search_parts]):
            print(row)
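The column-specific matching mentioned above could be sketched with csv.DictReader roughly as follows. This is only a minimal sketch: the helper name filter_rows and the idea of passing the criteria as a dict of column name to required value are assumptions made for illustration, and a few rows from the question are read from an in-memory string rather than from MOCK_DATA.csv.

```python
import csv
from io import StringIO

# Sample rows copied from the question; in practice, open "MOCK_DATA.csv" instead.
SAMPLE = """id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,cbegwell0@google.ca,Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,ncheatle1@msn.com,Female,189.106.181.194,26/06/1989
4,Gawen,Gillfillan,ggillfillan3@hp.com,Male,83.249.190.232,31/10/1984
"""


def filter_rows(csv_file, criteria):
    """Return the rows (as dicts) whose columns equal every value in criteria.

    criteria is a hypothetical dict mapping a column name to the required value.
    """
    reader = csv.DictReader(csv_file)
    return [
        row
        for row in reader
        if all(row.get(column) == value for column, value in criteria.items())
    ]


# Only rows where first_name is "Gawen" AND gender is "Male" are kept.
matches = filter_rows(StringIO(SAMPLE), {"first_name": "Gawen", "gender": "Male"})
for row in matches:
    print(row)
```

Because each value is compared against a named column, a search for "Male" can no longer match a row by accident through some other field.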
If you are happy to use a 3rd party library, this is possible with pandas.
I have modified your data slightly to demonstrate a simple query.
import pandas as pd
from io import StringIO
mystr = StringIO("""id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,cbegwell0@google.ca,Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,ncheatle1@msn.com,Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,lhamlen2@dot.gov,Female,52.165.62.174,24/04/1990
4,David,Gillfillan,ggillfillan3@hp.com,Male,83.249.190.232,31/10/1984
5,David,Gilfether,sgilfether4@china.com.cn,Male,180.153.199.106,11/07/1995""")
# replace mystr with 'file.csv'
df = pd.read_csv(mystr)
# retrieve user inputs
first_name = input('Input a first name\n:')
gender = input('Input a gender, Male or Female\n:')
# calculate Boolean mask
mask = (df['first_name'] == first_name) & (df['gender'] == gender)
# apply mask to result
res = df[mask]
print(res)
# id first_name last_name email gender \
# 3 4 David Gillfillan ggillfillan3@hp.com Male
# 4 5 David Gilfether sgilfether4@china.com.cn Male
# ip_address birthday
# 3 83.249.190.232 31/10/1984
# 4 180.153.199.106 11/07/1995
While you could just check whether the strings "David" and "Male" exist in a row, that would not be very precise should you need to check column values. Instead, read in the data via csv and create a list of namedtuple objects that store each search value together with the index of the header column it applies to:
from collections import namedtuple
import csv
data = list(csv.reader(open('filename.csv')))
search = namedtuple('search', 'value,header')
searches = [search(i, data[0].index(b)) for i, b in zip(input().split(', '), ['first_name', 'gender'])]
final_results = [i for i in data if all(c.value == i[c.header] for c in searches)]
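To make the behaviour of the snippet above easier to check, here is a small self-contained variation of the same idea. It is a sketch under assumptions: the function name search_rows, the Search tuple's field names, and the three-column sample data are made up for illustration, and the query string is passed in directly instead of being read with input().

```python
import csv
from collections import namedtuple
from io import StringIO

# Made-up illustrative data, not the real MOCK_DATA.csv.
SAMPLE = """id,first_name,gender
1,David,Male
2,David,Female
3,Nataline,Female
"""

# Pairs a search value with the index of the column it must match.
Search = namedtuple("Search", "value column_index")


def search_rows(csv_file, raw_query, columns):
    """Return data rows where each query value equals the matching column."""
    data = list(csv.reader(csv_file))
    header, rows = data[0], data[1:]
    # Split on commas and strip spaces, so "David, Male" and "David,Male" both work.
    values = [part.strip() for part in raw_query.split(",")]
    searches = [
        Search(value, header.index(column))
        for value, column in zip(values, columns)
    ]
    return [
        row
        for row in rows
        if all(row[s.column_index] == s.value for s in searches)
    ]


results = search_rows(StringIO(SAMPLE), "David, Male", ["first_name", "gender"])
print(results)  # [['1', 'David', 'Male']]
```

The "David"/"Female" row is skipped because the second value is compared only against the gender column.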