
Reading a CSV file in Python

With the code snippet and data below, I get the error shown at the end. Can you help me with this? I am a beginner in Python. Data:

"Id","Title","Body","Tags"
"Id1","Tit,le1","Body1","Ta,gs1"
"Id","Title","Body","Ta,2gs"

Code:

#!/usr/bin/python 
import csv,sys
if len(sys.argv) != 3:
    print >>sys.stderr, 'Wrong number of arguments. This tool will print first n records from a comma separated CSV file.'
    print >>sys.stderr, 'Usage:'
    print >>sys.stderr, '       python', sys.argv[0], '<file> <number-of-lines>'
    sys.exit(1)

fileName = sys.argv[1]
n = int(sys.argv[2])

i = 0 
out = csv.writer(sys.stdout, delimiter=',', quotechar='"', quoting=csv.QUOTE_NONNUMERIC)

ret = []


def read_csv(file_path, has_header = True):
    with open(file_path) as f:
        if has_header: f.readline()
        data = []
        for line in f:
            line = line.strip().split("\",\"")
            data.append([x for x in line])
    return data


ret = read_csv(fileName)
target = []
train = []
target = [x[2] for x in ret]
train = [x[1] for x in ret]

Error:

    target = [x[2] for x in ret]
IndexError: list index out of range

You are mixing file.readline() with using the file object as an iterable. Don't do that; use next() instead.
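As a small illustration of the next() advice, here is a sketch written for Python 3 (an assumption, since the code in the question is Python 2). The helper name skip_header_lines and the in-memory sample are made up for the example; a real file object opened in text mode should behave the same way.

```python
import io

# In-memory stand-in for the CSV file from the question (first two lines only).
SAMPLE = '"Id","Title","Body","Tags"\n"Id1","Tit,le1","Body1","Ta,gs1"\n'


def skip_header_lines(f):
    """Return the data lines of an open text file, skipping the first line."""
    # next() advances the same iterator that the loop below uses;
    # the None default avoids StopIteration on an empty file.
    next(f, None)
    return [line.rstrip("\n") for line in f]


lines = skip_header_lines(io.StringIO(SAMPLE))
print(lines)
```

Because the header is consumed through the iterator itself, there is only one read position to reason about.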

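You should also read the data with csv.reader() rather than splitting the lines yourself. In any case, the csv module handles quoted CSV values with delimiters embedded in them much better: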

import csv

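def read_csv(file_path, has_header=True):
    # Python 2: the csv module expects a file opened in binary mode.
    # On Python 3, open it with open(file_path, newline='') instead.
    with open(file_path, 'rb') as f:
        reader = csv.reader(f)
        if has_header:
            next(reader, None)  # skip the header row, if there is one
        return list(reader)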
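To make the difference concrete, the sketch below runs both approaches on the sample data from the question. It targets Python 3, and it appends a trailing blank line to the data; that blank line is an assumption, added because it is one plausible way for the hand-written split to produce a row too short to index. The function names split_by_hand and read_with_csv are hypothetical.

```python
import csv
import io

SAMPLE = (
    '"Id","Title","Body","Tags"\n'
    '"Id1","Tit,le1","Body1","Ta,gs1"\n'
    '"Id","Title","Body","Ta,2gs"\n'
    '\n'  # assumed trailing blank line, not part of the data in the question
)


def split_by_hand(text):
    """Mimic the question's parser: drop the header, split each line on '","'."""
    return [line.strip().split('","') for line in text.splitlines()[1:]]


def read_with_csv(text):
    """Parse the same text with csv.reader, skipping the header and empty rows."""
    reader = csv.reader(io.StringIO(text, newline=""))
    next(reader, None)  # skip the header row
    return [row for row in reader if row]  # csv.reader yields [] for blank lines


by_hand = split_by_hand(SAMPLE)
parsed = read_with_csv(SAMPLE)

print(by_hand[0])   # the outer quotes stay attached to the first and last field
print(by_hand[-1])  # the blank line becomes a one-element row, so row[2] would fail
print(parsed)
```

The hand-written split keeps the outer quotes and turns the blank line into a one-element list, while csv.reader strips the quoting and leaves the embedded commas inside their fields.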

Last but not least, you can use zip() to transpose rows and columns:

ret = read_csv(fileName)
target, train = zip(*ret)[1:3]  # just the 2nd and 3rd columns

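Here zip() stops at the shortest row, that is, at the first row that does not have enough columns, which at least avoids the exception you are seeing.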
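On Python 3, zip() returns an iterator rather than a list, so it has to be wrapped in list() before it can be sliced. A minimal sketch of the transposition under that assumption, with made-up rows standing in for the output of read_csv():

```python
# Hypothetical parsed rows; the second one is missing its last column.
rows = [
    ["Id1", "Tit,le1", "Body1", "Ta,gs1"],
    ["Id2", "Title2", "Body2"],
]

# zip(*rows) pairs up the first items, the second items, and so on,
# stopping as soon as the shortest row runs out.
columns = list(zip(*rows))
target, train = columns[1:3]  # the 2nd and 3rd columns

print(len(columns))  # 3, because the shorter row has only three fields
print(target)
print(train)

# A completely empty row leaves no columns at all.
no_columns = list(zip(*[["a", "b", "c"], []]))
print(no_columns)
```

Note the last case: if any row is empty, the transposed result is empty as well, and unpacking it into two names would fail with a ValueError instead of an IndexError.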

If some rows are missing columns, use itertools.izip_longest() instead (itertools.zip_longest() in Python 3):

from itertools import izip_longest

ret = read_csv(fileName)
# izip_longest() returns an iterator, so build a list before slicing
target, train = list(izip_longest(*ret))[1:3]  # just the 2nd and 3rd columns

By default, missing columns are filled with None; if you need a different value, pass a fillvalue argument to izip_longest():

target, train = list(izip_longest(*ret, fillvalue=0))[1:3]  # just the 2nd and 3rd columns
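For Python 3, the same idea can be sketched with itertools.zip_longest(). The rows below are invented for illustration, and the empty string used as fillvalue is an arbitrary choice:

```python
from itertools import zip_longest

# Hypothetical parsed rows of uneven length, including an empty one.
rows = [
    ["Id1", "Tit,le1", "Body1", "Ta,gs1"],
    ["Id2", "Title2"],
    [],
]

# zip_longest() keeps going until the longest row is exhausted and
# substitutes fillvalue wherever a shorter row has no item.
columns = list(zip_longest(*rows, fillvalue=""))
target, train = columns[1:3]  # the 2nd and 3rd columns

print(target)
print(train)
```

Every column now has one entry per row, so the two-name unpacking works even when some rows are short or empty.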

