
How to search CSV file with multiple search criteria and print row?

I have a .csv file with about 1000 rows that looks like this:

id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,cbegwell0@google.ca,Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,ncheatle1@msn.com,Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,lhamlen2@dot.gov,Female,52.165.62.174,24/04/1990
4,Gawen,Gillfillan,ggillfillan3@hp.com,Male,83.249.190.232,31/10/1984
5,Syd,Gilfether,sgilfether4@china.com.cn,Male,180.153.199.106,11/07/1995

The code I have so far asks for input, then goes over each row and prints the row if it contains the input. It looks like this:

import csv

# Asks for search criteria from user

search = input("Enter search criteria:\n")

# Opens csv data file

file = csv.reader(open("MOCK_DATA.csv"))

# Go over each row and print it if it contains user input.

for row in file:
    if search in row:
        print(row)

What I want as the end result, and what I'm stuck on, is to be able to enter more than one search criterion, separated by ",", and have it search for and print the matching rows. Kind of like a way to filter the list.

For example, if there were multiple people named "David" who are "Male" in the file, I could enter: David, Male

It would then print all the rows that match but ignore those with a "David" that is "Female".

You can split the input on the comma, then use all() with a list comprehension to check that every search term from the input is present in a given row.

This example uses a simplistic splitting of the input, and doesn't care which field each input matches. If you want to match only specific columns, look into using csv.DictReader instead of csv.reader.

import csv

# Ask the user for search criteria; strip spaces so "David, Male" works
search_parts = [part.strip() for part in input("Enter search criteria:\n").split(",")]

# Open the CSV data file
with open("MOCK_DATA.csv", newline="") as f:
    # Go over each row and print it if it contains every search term
    for row in csv.reader(f):
        if all(x in row for x in search_parts):
            print(row)
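
The csv.DictReader suggestion above could look roughly like the following. This is only a sketch under some assumptions: the sample data, the filter_rows helper, and the criteria dictionary are made up for illustration, and io.StringIO stands in for the real file so the snippet runs on its own.

```python
import csv
import io

# Hypothetical sample data standing in for MOCK_DATA.csv
SAMPLE = """id,first_name,last_name,email,gender
1,David,Begwell,dbegwell0@example.com,Male
2,David,Cheatle,dcheatle1@example.com,Female
3,Laverna,Hamlen,lhamlen2@example.com,Female
"""


def filter_rows(csv_file, criteria):
    """Return rows (as dicts) whose named columns equal every value in criteria."""
    reader = csv.DictReader(csv_file)
    return [row for row in reader
            if all(row[column] == value for column, value in criteria.items())]


# Only rows where first_name is "David" AND gender is "Male" are kept
matches = filter_rows(io.StringIO(SAMPLE), {"first_name": "David", "gender": "Male"})
for row in matches:
    print(row)
```

With a real file, you would pass an open file object (opened with newline="") instead of the StringIO. A column name in the criteria that is not in the header raises KeyError, which may or may not be what you want.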

If you are happy to use a 3rd party library, this is possible with pandas.

I have modified your data slightly to demonstrate a simple query.

import pandas as pd
from io import StringIO

mystr = StringIO("""id,first_name,last_name,email,gender,ip_address,birthday
1,Ced,Begwell,cbegwell0@google.ca,Male,134.107.135.233,17/10/1978
2,Nataline,Cheatle,ncheatle1@msn.com,Female,189.106.181.194,26/06/1989
3,Laverna,Hamlen,lhamlen2@dot.gov,Female,52.165.62.174,24/04/1990
4,David,Gillfillan,ggillfillan3@hp.com,Male,83.249.190.232,31/10/1984
5,David,Gilfether,sgilfether4@china.com.cn,Male,180.153.199.106,11/07/1995""")

# replace mystr with 'file.csv'
df = pd.read_csv(mystr)

# retrieve user inputs
first_name = input('Input a first name\n:')
gender = input('Input a gender, Male or Female\n:')

# calculate Boolean mask
mask = (df['first_name'] == first_name) & (df['gender'] == gender)

# apply mask to the DataFrame
res = df[mask]

print(res)

#    id first_name   last_name                     email gender  \
# 3   4      David  Gillfillan       ggillfillan3@hp.com   Male   
# 4   5      David   Gilfether  sgilfether4@china.com.cn   Male   

#         ip_address    birthday  
# 3   83.249.190.232  31/10/1984  
# 4  180.153.199.106  11/07/1995  

While you could just check whether the strings "David" and "Male" exist in a row, that would not be very precise if you need to check column values. Instead, read in the data via csv and create a list of namedtuple objects that store each search value together with the index of the column header it must match:

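from collections import namedtuple
import csv

with open('filename.csv', newline='') as f:
    data = list(csv.reader(f))
# Pair each search value with the index of the column it must match
search = namedtuple('search', 'value,header')
values = [v.strip() for v in input().split(',')]
searches = [search(v, data[0].index(name)) for v, name in zip(values, ['first_name', 'gender'])]
# Keep rows (after the header) whose columns equal every search value
final_results = [row for row in data[1:] if all(c.value == row[c.header] for c in searches)]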
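
To illustrate why the whole-row check is less precise than a column check, here is a small sketch on made-up data (the rows below are hypothetical, not from the original file). Row 3 has "David" as a last name, so it passes the whole-row check but not the column check.

```python
import csv
import io

# Hypothetical data: row 3 has "David" as a last name, not a first name
SAMPLE = """id,first_name,last_name,gender
1,David,Begwell,Male
2,David,Cheatle,Female
3,Gawen,David,Male
"""

rows = list(csv.reader(io.StringIO(SAMPLE)))
header, records = rows[0], rows[1:]
wanted = ["David", "Male"]

# Whole-row check: each value may appear in any column
any_column = [r for r in records if all(v in r for v in wanted)]

# Column check: each value must appear in its own column
columns = [header.index(name) for name in ("first_name", "gender")]
by_column = [r for r in records
             if all(r[i] == v for i, v in zip(columns, wanted))]

print([r[0] for r in any_column])  # ids of rows 1 and 3
print([r[0] for r in by_column])   # id of row 1 only
```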


 