
How to read a CSV file in Python?

I'm using Spyder for Python 2.7 on Windows 8. I'm trying to open and read a CSV file and see all the data stored in it, but this is what I get instead:

runfile('C:/Users/John/Documents/Python Scripts/FLInsuraneFile.py', wdir='C:/Users/John/Documents/Python Scripts')
<_io.TextIOWrapper name='FL_insurance_sample.csv' mode='r' encoding='cp1252'>

How can I open the file properly?

You can use the built-in csv library:

import csv
with open('names.csv') as csvfile:
    reader = csv.DictReader(csvfile)
    for row in reader:
        print(row['first_name'], row['last_name'])

https://docs.python.org/3.5/library/csv.html
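
The `<_io.TextIOWrapper ...>` line in the question is how Python displays a file object, which suggests the script printed the handle returned by open() rather than the data read from it (the script itself is not shown, so this is a guess). A minimal sketch of the difference, using a temporary file with made-up contents as a stand-in for FL_insurance_sample.csv:

```python
import csv
import os
import tempfile

# Hypothetical stand-in for the real file; the contents are made up.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write("first_name,last_name\nJohn,Smith\nAnna,Lee\n")

with open(path, newline="") as csvfile:
    # This is the text form of the handle, not the data in the file
    handle_repr = repr(csvfile)
    # Iterating a csv.reader yields one list of strings per row
    rows = list(csv.reader(csvfile))

print(handle_repr)
for row in rows:
    print(row)
```

Printing the rows (or iterating a DictReader, as in the answer above) shows the data; printing the handle only shows its name, mode, and encoding.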

You can use the pandas library:

import pandas as pd
csvfile = pd.read_csv('path_to_file')
print(csvfile)

If you want to supply your own column names, pass the names argument; otherwise read_csv takes the first row of the file as the header.

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.read_csv.html
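
A small sketch of how the names argument changes the result. The CSV text and the column names are made up, and io.StringIO stands in for a file path:

```python
import io

import pandas as pd

# Made-up CSV text with no header row
text = "Kaleb,15,256\nJohn,15,257\n"

# Default: the first row is consumed as the header
default_df = pd.read_csv(io.StringIO(text))

# With names= (and no header= argument), every row is kept as data
named_df = pd.read_csv(io.StringIO(text), names=["name", "age", "room"])

print(list(default_df.columns))
print(named_df)
```

If the file already has a header row that you want to replace, pass header=0 together with names so that the original header line is not read as data.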

First things first, it helps to understand how a CSV file is structured. CSV files are made up of rows and columns, like this:

| NAME  |  AGE |  ROOM |
| ---------------------|
| Kaleb |  15  |   256 |
| ---------------------|
| John  |  15  |   257 |
| ---------------------|
| Anna  |  16  |   269 |

Here the vertical elements are columns and the horizontal elements are rows. A row holds several kinds of data, such as name, age, and room. A column holds only one kind of data, such as name.
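
On disk, a table like this is usually plain text: one line per row, with a delimiter between columns. The sketch below assumes a comma delimiter and no separator lines (the dashes in the table are only for display), and uses the csv module to pull out a row and a column:

```python
import csv
import io

# The table above as it might appear in a .csv file (assumed layout)
text = "NAME,AGE,ROOM\nKaleb,15,256\nJohn,15,257\nAnna,16,269\n"

rows = list(csv.reader(io.StringIO(text)))

header = rows[0]                  # the column titles
first_row = rows[1]               # one row: name, age and room together
names = [r[0] for r in rows[1:]]  # one column: names only

print(header)
print(first_row)
print(names)
```

Note that every value comes back as a string, including the ages and room numbers.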

Moving on, here is an example function to read the CSV. Please carefully study the code.

def read_csv(csv_file):
    data = []
    with open(csv_file, 'r') as f:

        # create a list of rows in the CSV file
        rows = f.readlines()

        # strip white-space and newlines
        rows = list(map(lambda x:x.strip(), rows))

        for row in rows:

            # further split each row into columns assuming delimiter is comma 
            row = row.split(',')

            # append the row (a list of column values) to our data list
            data.append(row)

    return data

Now why would I do that? Well, this function lets you access your CSV file by row and column, which makes it easier to index. Look at this example using the above function:

csvFile = 'test.csv'

# invoke our function 
data = read_csv(csvFile)

# get row 1, column 2 of file
print(data[1][2])

# get entirety of row 2
print(data[2])

# get row 0, columns 1 & 2
print(data[0][1], data[0][2])

As you can see, we can easily access different parts of the file by using our read_csv() function and creating a nested-list object. Finally, if you want to print the entire file, you simply use a for loop after creating the data object.

data = read_csv(csvFile)

for row in data:
    print(row)
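
The nested list is organised by row, so pulling out a whole column takes one more step. A minimal sketch, assuming every row has the same number of fields; the data below is made up and shaped like the return value of read_csv():

```python
# Nested list shaped like the return value of read_csv() above
# (values are made up for illustration)
data = [
    ["NAME", "AGE", "ROOM"],
    ["Kaleb", "15", "256"],
    ["John", "15", "257"],
    ["Anna", "16", "269"],
]

# Transpose rows into columns; zip stops at the shortest row
columns = [list(col) for col in zip(*data)]

ages = columns[1][1:]  # second column, skipping the header cell
print(ages)
```

The values are still strings, so convert them with int() or float() before doing arithmetic.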

In conclusion, pandas is great for big-data science, but if you just want to read/access the CSV, this function is just fine. No need to install big packages for little tasks, unless of course you want to :).
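
One caveat with splitting on commas by hand: a field that contains a comma inside quotes gets cut in two. The built-in csv module handles that case. A short comparison on a made-up line:

```python
import csv
import io

# Made-up line where one field contains a comma inside quotes
line = 'Kaleb,"Room 256, East Wing",15\n'

# Plain split, as in the hand-written function
naive = line.strip().split(",")

# csv.reader respects the quoting
parsed = next(csv.reader(io.StringIO(line)))

print(naive)   # the quoted field is cut in two
print(parsed)  # the quoted field stays whole
```

For files without quoted fields the two approaches give the same rows.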

Good luck!

You can use Table Base.

import tablebase

My_Table = tablebase.CsvTable("path/to/your.csv")
print(My_Table.table_content)

For full documentation of Table Base see python.centillionware.com/tablebase


 