How to import a CSV file with Python, keeping the full header, where the first column is non-numeric

Question

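This is an elaboration of a previous question, but as I dig deeper into Python I am getting more confused about how Python handles CSV files.

I have a CSV file, and it has to stay that way (e.g., it cannot be converted to a text file). It is the equivalent of a 5 rows by 11 columns array, matrix, or vector.

I have been trying to read the CSV with various methods I found here and elsewhere (e.g. python.org) so that it preserves the relationship between columns and rows, where the first row and the first column are non-numeric values. The rest are float values, a mix of positive and negative floats.

What I want to do is import the CSV and compile it in Python so that if I reference a column header, it returns the associated values stored in the rows. For example: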

>>> workers, constant, age
>>> workers
    w0
    w1
    w2
    w3
    constant
    7.334
    5.235
    3.225
    0
    age
    -1.406
    -4.936
    -1.478
    0

And so on...

I am looking for techniques for handling this kind of data structure. I am very new to Python.

Answer 1

For Python 3

Drop the 'rb' argument and use 'r', or pass no mode argument at all (read mode is the default).

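import csv
# newline='' is what the csv module documentation recommends when opening CSV files
with open( <path-to-file>, 'r', newline='' ) as theFile:
    reader = csv.DictReader(theFile)
    for line in reader:
        # line is { 'workers': 'w0', 'constant': '7.334', 'age': '-1.406', ... }
        # every value is a string; call float() on the numeric fields if needed
        # e.g. print( line[ 'workers' ] ) yields 'w0'
        print(line)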

For Python 2

import csv
with open( <path-to-file>, "rb" ) as theFile:
    reader = csv.DictReader( theFile )
    for line in reader:
        # line is { 'workers': 'w0', 'constant': '7.334', 'age': '-1.406', ... }
        # every value is a string; call float() on the numeric fields if needed
        # e.g. print( line[ 'workers' ] ) yields 'w0'
        print( line[ 'workers' ] )

Python has a powerful built-in CSV handler. In fact, most things are already built into the standard library.
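
Since csv.DictReader returns every field as a string, here is a minimal sketch of converting the numeric fields while reading. It assumes the first column is named workers and that every other column holds floats. The read_records helper and the in-memory SAMPLE data are made up for illustration; with a real file you would pass the open file object instead of io.StringIO.

```python
import csv
import io

# Hypothetical sample data standing in for the real file.
SAMPLE = """workers,constant,age
w0,7.334,-1.406
w1,5.235,-4.936
"""


def read_records(lines, label_field="workers"):
    """Return one dict per row, converting every field except label_field to float."""
    records = []
    for line in csv.DictReader(lines):
        record = {}
        for key, value in line.items():
            # Keep the label column as text; everything else is assumed numeric.
            record[key] = value if key == label_field else float(value)
        records.append(record)
    return records


records = read_records(io.StringIO(SAMPLE))
print(records[0]["workers"], records[0]["constant"])
```

If a non-label field is empty or not a number, float() raises ValueError, so real data may need a try/except around the conversion.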

Answer 2

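Python's csv module processes data row by row, which is the usual way to look at this kind of data. You seem to want a column-wise approach. Here is one way to do it.

Suppose your file is named myclone.csv and contains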

workers,constant,age
w0,7.334,-1.406
w1,5.235,-4.936
w2,3.2225,-1.478
w3,0,0

This code should give you an idea or two:

>>> import csv
>>> f = open('myclone.csv', 'r', newline='')  # on Python 2, use open('myclone.csv', 'rb')
>>> reader = csv.reader(f)
>>> headers = next(reader, None)
>>> headers
['workers', 'constant', 'age']
>>> column = {}
>>> for h in headers:
...    column[h] = []
...
>>> column
{'workers': [], 'constant': [], 'age': []}
>>> for row in reader:
...   for h, v in zip(headers, row):
...     column[h].append(v)
...
>>> column
{'workers': ['w0', 'w1', 'w2', 'w3'], 'constant': ['7.334', '5.235', '3.2225', '0'], 'age': ['-1.406', '-4.936', '-1.478', '0']}
>>> column['workers']
['w0', 'w1', 'w2', 'w3']
>>> column['constant']
['7.334', '5.235', '3.2225', '0']
>>> column['age']
['-1.406', '-4.936', '-1.478', '0']
>>>

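To convert your numeric values to floats, add this

converters = [str.strip] + [float] * (len(headers) - 1)

before the loop over the rows, and then do this

for h, v, conv in zip(headers, row, converters):
  column[h].append(conv(v))

for each row, in place of the two similar lines above.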
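
Putting the pieces of this answer together, a self-contained sketch might look like the following. The read_columns function name and the in-memory SAMPLE string are assumptions made for illustration; the converter list and the loop are the ones described above.

```python
import csv
import io

# Same contents as the myclone.csv example, held in memory for the sketch.
SAMPLE = """workers,constant,age
w0,7.334,-1.406
w1,5.235,-4.936
w2,3.2225,-1.478
w3,0,0
"""


def read_columns(lines):
    """Return {header: [values]}; the first column stays text, the rest become floats."""
    reader = csv.reader(lines)
    headers = next(reader)
    converters = [str.strip] + [float] * (len(headers) - 1)
    column = {h: [] for h in headers}
    for row in reader:
        for h, v, conv in zip(headers, row, converters):
            column[h].append(conv(v))
    return column


column = read_columns(io.StringIO(SAMPLE))

# Pair each worker label with its value in one column.
by_worker = dict(zip(column["workers"], column["constant"]))
print(by_worker["w1"])
```

With a file on disk, read_columns(open('myclone.csv', newline='')) should behave the same way on Python 3.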

Answer 3

You can use the pandas library and reference rows and columns like this:

import pandas as pd

# named df rather than input, which would shadow the built-in input()
df = pd.read_csv("path_to_file")

# access the i-th row (by position)
df.iloc[i]

# access the column named X
df.X

# access the i-th row of the column named X
df.iloc[i].X
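
For the data in the question, one option is to let pandas use the first column as the row index, so values can be looked up by worker label as well as by column header. This is only a sketch: it assumes pandas is installed and that the label column is named workers, and the in-memory SAMPLE string stands in for the real file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for the real file.
SAMPLE = """workers,constant,age
w0,7.334,-1.406
w1,5.235,-4.936
w2,3.2225,-1.478
w3,0,0
"""

# index_col makes the non-numeric first column the row labels.
df = pd.read_csv(io.StringIO(SAMPLE), index_col="workers")

constant = df["constant"]       # whole column, indexed by worker label
w1_row = df.loc["w1"]           # whole row for worker w1
w1_age = df.loc["w1", "age"]    # single value

print(constant)
print(w1_age)
```

Here read_csv parses the remaining columns as floats on its own, so no manual conversion is needed.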

Answer 4

I recently had to write this kind of routine for fairly large data files, and I found that a list comprehension works well

      import csv
      with open("file.csv", 'r', newline='') as f:
        reader = csv.reader(f)
        headers = next(reader)
        data = [dict(zip(headers, row)) for row in reader]
        # data is now a list of rows, each row being a dictionary
        #   in the shape {header: value}. If a row ends early (e.g. there are 12 columns
        #   but the row has only 11 values), zip() stops at the shorter input, so that
        #   row's dictionary has no entry for the missing header.
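
To make the short-row behaviour concrete, the sketch below reads a small made-up sample in which the last row is missing a value. The rows_as_dicts helper and its pad option are assumptions for illustration, not part of the answer; padding with itertools.zip_longest is one possible way to keep every header present.

```python
import csv
import io
from itertools import zip_longest

# Hypothetical sample: the last row is one value short.
SAMPLE = "a,b,c\n1,2,3\n4,5\n"


def rows_as_dicts(lines, pad=False):
    """Return a list of {header: value} dicts, optionally padding short rows with None."""
    reader = csv.reader(lines)
    headers = next(reader)
    if pad:
        # Trim rows that are too long so no None key appears, then fill gaps with None.
        return [dict(zip_longest(headers, row[:len(headers)])) for row in reader]
    return [dict(zip(headers, row)) for row in reader]


short = rows_as_dicts(io.StringIO(SAMPLE))
padded = rows_as_dicts(io.StringIO(SAMPLE), pad=True)

print(short[1])
print(padded[1])
```

Whether missing values should be dropped, padded, or treated as an error depends on the data; csv.DictReader offers similar padding through its restval parameter.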

4 answers

Answer 1
139 2010-08-06 23:49:54

Answer 2
116 accepted 2010-08-07 00:15:58

Answer 3
14 2016-10-08 21:29:35

Answer 4
0 2021-05-03 04:52:37
