
Creating user-input filters on a CSV file that contains large data

I have a program that opens and reads a CSV file containing a large amount of data, such as:

State      Crime type Occurrences Year 

CALIFORNIA ROBBERY    12          1999
CALIFORNIA ASSAULT    45          2003
NEW YORK   ARSON      9           1999
CALIFORNIA ARSON      21          2000
TEXAS      THEFT      30          2000
OREGON     ASSAULT    10          2001

I need to create three filters based on user input. For example:

Enter State:
Enter Crime Type:
Enter Year:

If I enter:

Enter State: CALIFORNIA
Enter Crime Type: ASSAULT
Enter Year:  2003

Crime Report
State      Crime type Occurrences Year
CALIFORNIA ASSAULT    45          2003

That is, the program should print only the rows that match all three inputs, as in the report above.

I have no idea how to tackle this problem. So far I have only been able to open the CSV file and read it into a table in Python that prints out every line. I need to add search filters that narrow the result as shown above. Is anyone familiar with this? Thank you all for your help.

The pandas library in Python lets you load and filter CSV data. The solution below imports pandas, reads the file into a DataFrame with read_csv(), and then asks for the three input values. State and Crime type are kept as strings (str), while Year is converted to an integer with int() so that it can be compared with the numeric Year column. A single query then filters the DataFrame. The query requires all three conditions to hold, and it lowercases both the column values and the input so the user can type in any case. Note that str.contains() does a substring match (and treats its argument as a regular expression by default), so an input such as "CALI" would also match CALIFORNIA.

In [125]: import pandas as pd
In [126]: df = pd.read_csv('test.csv')

In [127]: df
Out[127]:
        State Crime type  Occurrences  Year
0  CALIFORNIA    ROBBERY           12  1999
1  CALIFORNIA    ASSAULT           45  2003
2    NEW YORK      ARSON            9  1999

In [128]: state = str(input("Enter State: "))
Enter State: California

In [129]: crime_type = str(input("Enter Crime Type: "))
Enter Crime Type: robbery

In [130]: year = int(input("Enter Year: "))
Enter Year: 1999

In [131]: df.loc[lambda x: (x['State'].str.lower().str.contains(state.lower()))
     ...:        & (x['Crime type'].str.lower().str.contains(crime_type.lower()))
     ...:        & (x['Year'] == year)]
Out[131]:
        State Crime type  Occurrences  Year
0  CALIFORNIA    ROBBERY           12  1999
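
The session above can also be packaged as a small script. The following is only a sketch under stated assumptions, not the answer's own code: the helper names `filter_crimes` and `filter_crimes_chunked` are hypothetical, the sample rows are copied from the question and assumed to be comma-separated, and the match is exact (case-insensitive) rather than the substring match used above. The chunked variant is one possible way to handle a file too large to load at once, using the `chunksize` parameter of `read_csv()`.

```python
import io

import pandas as pd

# Sample rows copied from the question; a real script would pass a file path
# to pd.read_csv() instead. The comma-separated layout is an assumption.
SAMPLE_CSV = """State,Crime type,Occurrences,Year
CALIFORNIA,ROBBERY,12,1999
CALIFORNIA,ASSAULT,45,2003
NEW YORK,ARSON,9,1999
CALIFORNIA,ARSON,21,2000
TEXAS,THEFT,30,2000
OREGON,ASSAULT,10,2001
"""


def filter_crimes(df, state, crime_type, year):
    """Return the rows that satisfy all three filters.

    Hypothetical helper: strings are compared exactly but case-insensitively,
    and the year is converted to int before comparison.
    """
    mask = (
        df["State"].str.strip().str.upper().eq(state.strip().upper())
        & df["Crime type"].str.strip().str.upper().eq(crime_type.strip().upper())
        & df["Year"].eq(int(year))
    )
    return df[mask]


def filter_crimes_chunked(source, state, crime_type, year, chunksize=100_000):
    """Filter a large CSV chunk by chunk so the full file is never in memory."""
    parts = [
        filter_crimes(chunk, state, crime_type, year)
        for chunk in pd.read_csv(source, chunksize=chunksize)
    ]
    return pd.concat(parts, ignore_index=True)


# Example run on the sample data; input() calls could supply these values.
df = pd.read_csv(io.StringIO(SAMPLE_CSV))
report = filter_crimes(df, "California", "assault", 2003)
print("Crime Report")
print(report.to_string(index=False))
```

In an interactive script, the three arguments would come from `input("Enter State: ")`, `input("Enter Crime Type: ")`, and `input("Enter Year: ")`. Whether exact or substring matching is preferable depends on how forgiving the search should be.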
