Iterating over specific CSV rows in Python outputs a blank file

Question

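Python newb here. I'm trying to format a very rough set of CSVs I was sent so that I can load them into a nice Postgres table for querying and analysis. To do that, I first clean them with csv.writer to remove the blank lines and the double quotes wrapping each entry. Here is what my code looks like: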

import os
import csv
import glob
from itertools import islice

files = glob.glob('/Users/foo/bar/*.csv')

# Loop through all of the csv's  
for file in files:
    # Get the filename from the path
    outfile = os.path.basename(file)

    with open(file, 'rb') as inp, open('/Users/foo/baz/' + outfile, 'wb') as out:

        reader = csv.reader(inp)
        writer = csv.writer(out)
        for row in reader:
            if row:
                writer.writerow(row)
        out.close()

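It works perfectly and does exactly what I want. The output CSV looks great. Next, I try to chop off a certain number of rows containing completely unnecessary junk from the beginning and end of the newly cleaned CSV file (omitting the first 8 rows and the last 2). For reasons I can't determine, the CSV output by this part of the code (indented the same as the previous 'with' block) is completely empty: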

with open('/Users/foo/baz/' + outfile, 'rb') as inp2, open('/Users/foo/qux/' + outfile, 'wb') as out2:
    writer2 = csv.writer(out2)
    reader2 = csv.reader(inp2)
    row_count = sum(1 for row in reader2)
    last_line_index = row_count - 3 
    for row in islice(reader2, 7, last_line_index):
            writer2.writerow(row)
    out2.close()

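I know that, because I'm using 'with', the close() at the end of each block is redundant; I tried it as an approach after looking here. I also tried putting the second 'with' block in a separate file and running it after the first 'with' block, but still to no avail. Any help is much appreciated!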

Also, here is the whole file:

import os
import csv
import glob
from itertools import islice

files = glob.glob('/Users/foo/bar/*.csv')

# Loop through all of the csv's  
for file in files:
    # Get the filename from the path
    outfile = os.path.basename(file)

    with open(file, 'rb') as inp, open('/Users/foo/baz/' + outfile, 'wb') as out:

        reader = csv.reader(inp)
        writer = csv.writer(out)
        for row in reader:
            if row:
                writer.writerow(row)
        out.close() 

    with open('/Users/foo/baz/' + outfile, 'rb') as inp2, open('/Users/foo/qux/' + outfile, 'wb') as out2:
        writer2 = csv.writer(out2)
        reader2 = csv.reader(inp2)
        row_count = sum(1 for row in reader2)
        last_line_index = row_count - 3 
        for row in islice(reader2, 7, last_line_index):
                writer2.writerow(row)
        out2.close()

Thanks!

Answer 1

The guilty party is

row_count = sum(1 for row in reader2)

which reads all the data from reader2. A csv.reader is a one-shot iterator, so when you then try for row in islice(reader2, 7, last_line_index), there is no data left to get.
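To see the effect in isolation, here is a minimal sketch (written for Python 3, with a hypothetical in-memory sample standing in for the real file). It shows that a csv.reader yields nothing on a second pass, and one possible way around it: read the rows into a list once, then slice. The helper name trim_rows and the sample data are assumptions for illustration, not part of the original code.

```python
import csv
import io
from itertools import islice

# Hypothetical sample: 12 rows standing in for one cleaned CSV file.
sample = "".join("row{},x\n".format(i) for i in range(12))

# A csv.reader can only be consumed once.
reader = csv.reader(io.StringIO(sample))
row_count = sum(1 for _ in reader)        # consumes every row
leftover = list(islice(reader, 7, None))  # nothing left to yield


def trim_rows(rows, skip_head=8, skip_tail=2):
    """Return rows without the first skip_head and last skip_tail entries."""
    end = len(rows) - skip_tail
    return rows[skip_head:max(end, skip_head)]


# Read once into a list, then slice; no second pass is needed.
rows = list(csv.reader(io.StringIO(sample)))
kept = trim_rows(rows)
```

Note that islice(reader2, 7, row_count - 3) in the question would keep indices 7 through row_count - 4, which drops 7 leading and 3 trailing rows. The sketch follows the stated goal of dropping the first 8 and the last 2 instead; adjust the defaults if the other behaviour is what you want.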

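In addition, you may be reading a lot of blank rows because you open the file as binary; instead, open it in text mode with newline='':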

with open('file.csv', newline='') as inf:
    rd = csv.reader(inf)
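A slightly fuller sketch of that suggestion, assuming Python 3 (where the csv module works on text-mode files opened with newline='' rather than 'rb'/'wb'). The function name copy_nonblank_rows and the temporary paths are hypothetical and only serve the demonstration.

```python
import csv
import os
import tempfile


def copy_nonblank_rows(src_path, dst_path):
    """Copy a CSV file, dropping empty rows; returns the number of rows written."""
    written = 0
    # newline='' lets the csv module handle line endings itself.
    with open(src_path, newline='') as inf, open(dst_path, 'w', newline='') as outf:
        writer = csv.writer(outf)
        for row in csv.reader(inf):
            if row:  # csv.reader yields [] for a blank line
                writer.writerow(row)
                written += 1
    return written


# Demonstration on a throwaway file with blank lines between records.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, 'in.csv')
dst = os.path.join(tmp_dir, 'out.csv')
with open(src, 'w', newline='') as f:
    f.write('"a","b"\r\n\r\n"c","d"\r\n\r\n')

count = copy_nonblank_rows(src, dst)
with open(dst, newline='') as f:
    result = list(csv.reader(f))
```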

Answer 2

You can quickly fix your code like this (I commented on the problem; as @Hugh Bothwell said, you have already read all the data from the variable reader2):

import os
import csv
import glob
from itertools import islice

files = glob.glob('/Users/foo/bar/*.csv')

# Loop through all of the csv's  
for file in files:
    # Get the filename from the path
    outfile = os.path.basename(file)

    with open(file, 'rb') as inp, open('/Users/foo/baz/' + outfile, 'wb') as out:

        reader = csv.reader(inp)
        writer = csv.writer(out)
        for row in reader:
            if row:
                writer.writerow(row)
        out.close() 

    with open('/Users/foo/baz/' + outfile, 'rb') as inp2, open('/Users/foo/qux/' + outfile, 'wb') as out2:
            writer2 = csv.writer(out2)
            reader2 = csv.reader(inp2)
            row_count = sum(1 for row in csv.reader(inp2))  # count the rows with a separate, throwaway reader
            inp2.seek(0)  # counting consumed the underlying file, so rewind it before reader2 reads
            last_line_index = row_count - 3 
            for row in islice(reader2, 7, last_line_index):
                    writer2.writerow(row)
            out2.close()
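If the files are large, counting the rows first can be avoided altogether. The following is a sketch of an alternative technique, not something either answer proposes: skip the leading rows with islice and hold back the trailing rows in a small buffer, so the reader is consumed only once. It assumes Python 3, and drop_head_and_tail is a hypothetical helper name.

```python
import csv
import io
from collections import deque
from itertools import islice


def drop_head_and_tail(rows, skip_head=8, skip_tail=2):
    """Yield rows from an iterable, omitting the first skip_head and last skip_tail."""
    it = islice(rows, skip_head, None)  # skip the leading rows
    if skip_tail <= 0:
        yield from it
        return
    buffer = deque()
    for row in it:
        buffer.append(row)
        # Once more than skip_tail rows are buffered, the oldest one
        # cannot be among the last skip_tail, so it is safe to emit.
        if len(buffer) > skip_tail:
            yield buffer.popleft()


# Hypothetical sample data: 12 one-column rows.
sample = io.StringIO("".join("row{}\n".format(i) for i in range(12)))
kept = list(drop_head_and_tail(csv.reader(sample)))
```

Because the generator accepts any iterable, it can be fed a csv.reader directly and its output passed to writer.writerows.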


Answer 1
2 Accepted 2017-06-16 00:04:37

Answer 2
1 2017-06-16 00:08:39
