Parsing a CSV file with Python when other data is present

Question

The data file looks like this:

"2015","21","2","RICK","D","w","1","1","f","8","","00","","","","","S"
"2015","56","5","RICK","E","g","1","1","k","8","","15","","","","","F"

If the last field is "S", the third field should be added to a running total. Otherwise, the row is skipped.

I tried importing csv and using the following:

for line in csv.reader(file, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
    if line[16] == "S":
        total = total + line[2]

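This gives me "IndexError: list index out of range". Maybe there is a better way. I assumed that importing csv would do most of the work for me. What is the best approach? At this point I will take anything that works.

Printing a line shows the following: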

['"2015"', '"43"', '"2"', '"ZETA"', '"W"', '"x"', '"1"', '"1"', '"d"', '"2"', '""', '"31"', '""', '""', '""', '""', '"N"']

Answer 1

pandas can do this easily:

In [52]:
import pandas as pd
# read the csv into a dataframe
df = pd.read_csv(r'c:\data\sample.txt', quotechar="\"", header=None)
df
Out[52]:
     0   1   2     3  4  5   6   7  8   9   10  11  12  13  14  15 16
0  2015  21   2  RICK  D  w   1   1  f   8 NaN   0 NaN NaN NaN NaN  S
1  2015  56   5  RICK  E  g   1   1  k   8 NaN  15 NaN NaN NaN NaN  F
In [55]:
# we can filter the values and then call count()
df.loc[df[16] == 'S',16].count()
Out[55]:
1
In [56]:
# we can also show the count for all unique values
df[16].value_counts()
Out[56]:
S    1
F    1
dtype: int64

Answer 2

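= assigns the value of the right-hand operand to the left-hand operand. To compare two values, use == instead:

if line[16] = "S":    should be    if line[16] == "S":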

hzhang@dell-work ~ $ cat sample.csv 
"2015","21","2","RICK","D","w","1","1","f","8","","00","","","","","S"
"2015","56","5","RICK","E","g","1","1","k","8","","15","","","","","F"
hzhang@dell-work ~ $ cat test.py 
import csv
with open("sample.csv", newline="") as csvfile:
    csvreader = csv.reader(csvfile, delimiter=",")
    total = 0
    for line in csvreader:
        # only rows flagged "S" in the 17th field contribute to the total
        if line[16] == "S":
            total = total + int(line[2])

    print("total is:{}".format(total))
hzhang@dell-work ~ $ python test.py 
total is:2

Based on your code:

import csv
total = 0
with open("sample.csv", newline="") as file:
    for line in csv.reader(file, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True):
        if line[16] == "S":
            # fields are read as strings, so convert before adding
            total = total + int(line[2])

print("total:{}".format(total))
hzhang@dell-work ~ $ python test.py 
total:2

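Make sure every input line has 17 fields, and convert the third column of each row to an integer before summing.

To check which lines do not have 17 fields: if len(line) != 17: print(line)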
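The field-count check described above can be sketched as a small standalone script. This is only an illustration: the helper name find_bad_rows and the in-memory sample (with a deliberately blank trailing line) are assumptions made for the example, not part of the original post.

```python
import csv
import io

EXPECTED_FIELDS = 17  # field count of a well-formed row in the question's sample


def find_bad_rows(lines, expected=EXPECTED_FIELDS):
    """Return (row_number, row) pairs whose field count differs from `expected`."""
    bad = []
    for row_num, row in enumerate(csv.reader(lines), start=1):
        if len(row) != expected:
            bad.append((row_num, row))
    return bad


# Hypothetical sample: the two rows from the question followed by a blank line.
sample = io.StringIO(
    '"2015","21","2","RICK","D","w","1","1","f","8","","00","","","","","S"\n'
    '"2015","56","5","RICK","E","g","1","1","k","8","","15","","","","","F"\n'
    '\n'
)

for row_num, row in find_bad_rows(sample):
    print(row_num, row)
```

csv.reader yields an empty list for a blank line, so a stray newline at the end of the file shows up here as a row with zero fields.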

Answer 3

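The file probably does not contain 17 columns on every line. One way this can happen is if there is an extra newline at the end of the file.

Here is a way to detect which line is causing the problem.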

total = 0
reader = csv.reader(file, quotechar='"', delimiter=',', quoting=csv.QUOTE_ALL, skipinitialspace=True)
for line_num, line in enumerate(reader, start=1):
    try:
        if line[16] == "S":
            # fields are strings, so convert before adding
            total = total + int(line[2])
    except IndexError:
        # show offending line
        print(line_num, line)
        # reraise to halt execution
        raise
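Once the offending lines are known, one possible variant is to skip short rows instead of halting. The following is a minimal sketch, assuming that blank or truncated rows can safely be ignored; the function name sum_third_field and the sample data are hypothetical and used only for illustration.

```python
import csv
import io


def sum_third_field(lines, flag="S", min_fields=17):
    """Sum the third field of every row whose 17th field equals `flag`.

    Rows with fewer than `min_fields` fields (for example blank lines,
    which csv.reader yields as empty lists) are skipped instead of raising.
    """
    total = 0
    for row in csv.reader(lines):
        if len(row) < min_fields:
            continue  # assumption: malformed rows can be ignored
        if row[16] == flag:
            total += int(row[2])
    return total


# Hypothetical sample: the question's two rows with a blank line between them.
sample = io.StringIO(
    '"2015","21","2","RICK","D","w","1","1","f","8","","00","","","","","S"\n'
    '\n'
    '"2015","56","5","RICK","E","g","1","1","k","8","","15","","","","","F"\n'
)

print(sum_third_field(sample))
```

Silently skipping rows can hide real data problems, so this is best used only after the short rows have been inspected.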

Answer 4

You might consider using a negative index to access items from the end of the list:

total = 0
for line in csv.reader(...):
    if line[-1] == "S":
        total += int(line[2])
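One caveat with negative indexing is that line[-1] still raises IndexError when the row is an empty list, which is what csv.reader produces for a blank line. A hedged sketch of a guarded version follows; the function name sum_when_last_is and the shortened sample rows are assumptions made for the example.

```python
import csv
import io


def sum_when_last_is(lines, flag="S"):
    """Add up the third field of rows whose last field equals `flag`."""
    total = 0
    for row in csv.reader(lines):
        # len(row) > 2 guards both row[-1] (empty rows) and row[2] (short rows).
        if len(row) > 2 and row[-1] == flag:
            total += int(row[2])
    return total


# Hypothetical, shortened rows: only the third and last fields matter here.
sample = io.StringIO('a,b,3,S\n\nq,r,4,F\n')
print(sum_when_last_is(sample))
```

Unlike a fixed index such as line[16], this version does not verify that a row has the expected number of fields, so it trades strictness for tolerance.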

4 solutions

Solution 1
0 2015-05-06 21:29:15

Solution 2
0 2015-05-06 21:31:35

Solution 3
0 accepted 2015-05-06 21:37:08

Solution 4
-1 2015-05-06 21:34:08
