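Is there a better way to read a file?

Question

Every time I read a CSV file into lists I use this long approach. Can it be simplified?

Create empty lists
Read the file line by line and append to the lists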

import csv
import sys

filename = 'mtms_excelExtraction_m_Model_Definition.csv'
Ana_Type = []
Ana_Length = []
Ana_Text = []
Ana_Space = []
with open(filename, 'rt') as f:
    reader = csv.reader(f)
    try:
        for row in reader:
            Ana_Type.append(row[0])
            Ana_Length.append(row[1])
            Ana_Text.append(row[2])
            Ana_Space.append(row[3])
    except csv.Error as e:
        sys.exit('file %s, line %d: %s' % (filename, reader.line_num, e))

Answer 1

This is a good opportunity to start using pandas and working with DataFrames.

import pandas as pd

df = pd.read_csv(path_to_csv)

That is one or two lines of code (depending on whether you count the import), and you're done!
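If separate Python lists are still needed afterwards, the columns of the DataFrame can be pulled out one by one. A minimal sketch, assuming the file has no header row and exactly four columns; the column names are borrowed from the question's variable names, and the inline sample data is made up to stand in for the real file:

```python
import io
import pandas as pd

# Hypothetical sample data standing in for the real CSV file (no header row).
sample = io.StringIO("t1,10,hello,s1\nt2,20,world,s2\n")

# header=None: the first line is data, not column names.
# names=...: labels assumed from the question's variable names.
# dtype=str: keep every value as text, as csv.reader would.
df = pd.read_csv(sample, header=None,
                 names=["Ana_Type", "Ana_Length", "Ana_Text", "Ana_Space"],
                 dtype=str)

Ana_Type = df["Ana_Type"].tolist()
Ana_Length = df["Ana_Length"].tolist()
print(Ana_Type)    # ['t1', 't2']
print(Ana_Length)  # ['10', '20']
```

If the real file does have a header row, dropping header=None and names lets pandas pick the column names up from the first line instead.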

Answer 2

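This is essentially how numpy handles csv files, without using numpy. Whether it is better than your original method is close to a matter of taste. What it has in common with the numpy or pandas approach is that it loads the whole file into memory and then transposes it into lists: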

import csv

with open(filename, 'rt') as f:
    reader = csv.reader(f)
    tmp = list(reader)  # load every row into memory
# transpose the rows into one list per column
Ana_Type, Ana_Length, Ana_Text, Ana_Space = [[tmp[i][j] for i in range(len(tmp))]
                                             for j in range(len(tmp[0]))]

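It uses less code and builds the lists with comprehensions instead of repeated appends, but it uses more memory (as numpy or pandas would).

Depending on how you process the data later, numpy or pandas could be a good option. IMHO, using them only to load a csv file into lists is not worth it.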

Answer 3

You can use DictReader

import csv

with open(filename, 'rt') as f:  
    data = list(csv.DictReader(f, fieldnames=["Type", "Length", "Text", "Space"]))

print(data)

This gives you a list of dict objects, one per row.
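If per-column lists are still wanted, they can be derived from the list of dicts. A small sketch, with made-up inline data standing in for the file and the same field names as in the snippet above:

```python
import csv
import io

# Hypothetical sample data standing in for the real file (no header row).
f = io.StringIO("t1,10,hello,s1\nt2,20,world,s2\n")
fields = ["Type", "Length", "Text", "Space"]
data = list(csv.DictReader(f, fieldnames=fields))

# Build one list per field name from the row dicts.
columns = {name: [row[name] for row in data] for name in fields}
print(columns["Type"])    # ['t1', 't2']
print(columns["Length"])  # ['10', '20']
```

Keeping the columns in a single dict avoids four separate variables, at the cost of a lookup by name.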

Answer 4

This might be useful:

import numpy as np
# read the rows with Numpy
rows = np.genfromtxt('data.csv',dtype='str',delimiter=';')
# call numpy.transpose to convert the rows to columns
cols = np.transpose(rows)

# get the stuff as lists
Ana_Type = list(cols[0])
Ana_Length = list(cols[1])
Ana_Text = list(cols[2])
Ana_Space = list(cols[3])

Edit: note that the first element will be the column name (example with test data):

['Date', '2020-03-03', '2020-03-04', '2020-03-05', '2020-03-06']
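If the header should not end up in the lists, one option is to skip it while reading. A sketch, assuming a single header line and using the skip_header parameter of np.genfromtxt; the inline data is made up for illustration:

```python
import io
import numpy as np

# Hypothetical sample data with one header line, ';'-delimited as in the answer.
sample = io.StringIO("Date;Value\n2020-03-03;1\n2020-03-04;2\n")

# skip_header=1 drops the first line before parsing.
rows = np.genfromtxt(sample, dtype=str, delimiter=';', skip_header=1)
cols = np.transpose(rows)

dates = cols[0].tolist()
print(dates)  # ['2020-03-03', '2020-03-04']
```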

Answer 5

Try this

import csv
from collections import defaultdict
d = defaultdict(list)
with open(filename, mode='r') as csv_file:
    csv_reader = csv.DictReader(csv_file)
    for row in csv_reader:
        for k,v in row.items():
            d[k].append(v)

Then

d.keys()
dict_keys(['Ana_Type', 'Ana_Length', 'Ana_Text', 'Ana_Space'])

Then

d.get('Ana_Type')
['bla','bla1','df','ccc']

Answer 6

Repeated calls to list.append can be avoided by reading the csv and using the zip built-in function to transpose the rows.

import io, csv

# Create an example file
buf = io.StringIO('type1,length1,text1,space1\ntype2,length2,text2,space2\ntype3,length3,text3,space3')

reader = csv.reader(buf)
# Uncomment the next line if there is a header row
# next(reader)

Ana_Types, Ana_Length, Ana_Text, Ana_Space = zip(*reader)

print(Ana_Types)
('type1', 'type2', 'type3')
print(Ana_Length)
('length1', 'length2', 'length3')
...

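If you need lists rather than tuples, you can convert them with a list comprehension or a generator expression (on a fresh reader, since the one above has already been consumed):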

Ana_Types, Ana_Length, Ana_Text, Ana_Space = [list(x) for x in zip(*reader)]
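Two details are worth keeping in mind with this approach: a csv.reader is an iterator, so it can only be consumed once, and zip stops at the shortest row. A sketch illustrating both, using itertools.zip_longest as one possible way to pad short rows (the sample data and the empty-string fill value are assumptions, not part of the original answer):

```python
import csv
import io
from itertools import zip_longest

# Hypothetical sample: the second row is missing its last field.
text = 'type1,length1,text1,space1\ntype2,length2,text2\n'

reader = csv.reader(io.StringIO(text))
rows = list(reader)          # materialize once; the reader is now exhausted
assert list(reader) == []    # a second pass yields nothing

# Plain zip truncates to the shortest row, dropping the fourth column here.
print(len(list(zip(*rows))))  # 3

# zip_longest pads short rows so all four columns survive.
Ana_Types, Ana_Length, Ana_Text, Ana_Space = [
    list(col) for col in zip_longest(*rows, fillvalue='')
]
print(Ana_Space)  # ['space1', '']
```

Whether padding or failing loudly is the better behavior depends on the data; with plain zip and a short row, the four-way unpacking raises ValueError, which may be exactly the warning you want.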

6 solutions

Solution 1 2 accepted 2020-07-23 14:37:02
Solution 2 2 2020-07-23 14:52:05
Solution 3 1 2020-07-23 14:32:56
Solution 4 1 2020-07-23 14:35:41
Solution 5 1 2020-07-23 15:28:18
Solution 6 1 2020-07-23 15:29:12
