How to read data from a CSV file using Python

I want to read a CSV file.

In the first column of the CSV file, I have different variables (e.g., N11, N12, N21, N22, ..., N38). In the third column, I have different values for each variable. The values in the third column are in no particular order.

I want to get the minimum and maximum value for each variable (N11, N12, etc.). For example, in the example data, N11 minimum = 1573231694 and N11 maximum = 1573231738.

In the .csv file, each variable has thousands of rows, as shown below:

[image of the sample data]

I tried the code below. Can anybody help me modify it to meet the requirement above?

import csv
with open('example.csv') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')
    for row in readCSV:
        print(row[2])

Thank you.

Working sample

N11,12,123,123,0
N21,12,133,123,0
N12,12,143,123,0
N32,12,125,123,0
N11,12,121,123,0
N12,12,121,123,0
N11,12,122,123,0
N21,12,127,123,0
N32,12,183,123,0
N14,12,193,123,0

Assumption

  • If a variable in the first column appears only once, its single value is used as both the max and the min

The code, commented with explanation

import csv

# Collect the pairs in a dict of lists where the first value is the minimum
# and the second the maximum

min_max = dict()

with open('example.csv', newline='') as csvfile:
    readCSV = csv.reader(csvfile, delimiter=',')

    for row in readCSV:
        key = row[0]
        # Convert to int so values compare numerically, not as strings
        value = int(row[2])

        # Check if the key already exists in the dict
        row_val = min_max.get(key)

        if row_val is not None:
            # Check against the current value
            row_val[0] = min(value, row_val[0])
            row_val[1] = max(value, row_val[1])
        else:
            # If it doesn't exist, add it to the dict
            min_max[key] = [value, value]

print(min_max)

Output

{'N11': [121, 123], 'N21': [127, 133], 'N12': [121, 143], 'N32': [125, 183], 'N14': [193, 193]}
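The same idea can be wrapped in a reusable function. This is only a sketch: the name min_max_by_key and the parameters key_col and value_col are hypothetical, and it assumes the file has no header row and that the value column holds integers (it converts them with int so the comparison is numeric rather than string-based).

```python
import csv
import io


def min_max_by_key(lines, key_col=0, value_col=2):
    """Return {key: (min, max)} for the integer values in value_col.

    `lines` is any iterable of CSV text lines, e.g. an open file.
    Hypothetical helper; assumes no header row and integer values.
    """
    result = {}
    for row in csv.reader(lines):
        if not row:  # skip blank lines
            continue
        key = row[key_col]
        value = int(row[value_col])
        if key in result:
            low, high = result[key]
            result[key] = (min(low, value), max(high, value))
        else:
            # First time this key is seen: its value is both min and max
            result[key] = (value, value)
    return result


# Usage with a few sample rows held in memory instead of a file on disk
sample = io.StringIO(
    "N11,12,123,123,0\n"
    "N21,12,133,123,0\n"
    "N11,12,121,123,0\n"
    "N21,12,127,123,0\n"
)
stats = min_max_by_key(sample)
print(stats["N11"])
```

Passing an open file object instead of the StringIO works the same way, since both yield text lines.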

I'd personally recommend using the pandas module.

It makes it easy to create data structures and manage datasets. It is an open source library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.

Check out the pandas documentation here: https://pandas.pydata.org/pandas-docs/stable/

And for this particular problem, you could just do:

import pandas as pd
dataframe = pd.read_csv("*File Path Here*")

This creates a pandas DataFrame bound to the name 'dataframe' (the name is entirely up to you), whose data can be manipulated easily and efficiently.
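To get from the loaded data to the per-variable minimum and maximum, one option is groupby followed by agg. A minimal sketch, assuming the file has no header row (so header=None is passed and the columns are labelled 0, 1, 2, ...) and using a few in-memory sample rows in place of a real file path:

```python
import io

import pandas as pd

# In-memory stand-in for the CSV file (a few sample rows, assumed layout)
csv_text = io.StringIO(
    "N11,12,123,123,0\n"
    "N21,12,133,123,0\n"
    "N11,12,121,123,0\n"
    "N21,12,127,123,0\n"
)

# header=None: the first line is data, so columns are labelled 0, 1, 2, ...
dataframe = pd.read_csv(csv_text, header=None)

# Group rows by the first column and take min/max of the third column
result = dataframe.groupby(0)[2].agg(["min", "max"])
print(result)
```

The result is a DataFrame indexed by variable name with one 'min' and one 'max' column, so a single value can be looked up with result.loc['N11', 'min'].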

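import pandas as pd

df = pd.DataFrame({'col1': ['N11', 'N12', 'N11', 'N14'], 'col2': [1, 3, 5, 7], 'col3': [11, 13, 15, 17]})
# Select the col3 values of the rows where col1 is 'N11', then take the max/min
print("N11 max=", df['col3'][df['col1'] == 'N11'].max())
print("N11 Min=", df['col3'][df['col1'] == 'N11'].min())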

Output:

N11 max= 15
N11 Min= 11
