How do I read the maximum number of rows in a CSV file?

Question

I have a Python script that reads a bunch of CSV files and creates a new CSV file containing the last row of each file read. The script looks like this:

    import pandas as pd
    import glob
    import os

    path = r'Directory of the files read\*common_file_name_part.csv'
    r_path = r'Directory where the resulting file is saved.'
    if os.path.exists(r_path + 'csv'):
       os.remove(r_path + 'csv')
    if os.path.exists(r_path + 'txt'):
       os.remove(r_path + 'txt')

    files = glob.glob(path)
    column_list = [None] * 44
    for i in range(44):
        column_list[i] = str(i + 1)

    df = pd.DataFrame(columns = column_list)
    for name in files:
        df_n = pd.read_csv(name, names = column_list)
        df = df.append(df_n.iloc[-1], ignore_index=True)
        del df_n

    df.to_csv(r_path + 'csv', index=False, header=False)
    del df

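The files all share a common ending in their names, preceded by a unique beginning. The resulting file has no extension so that I can do some checks. My problem is that the number of rows and columns varies, even within the same file, and I can't read the files properly. If I don't specify column names, the program assumes the first row holds the column names, which causes some files to lose many columns. I also tried reading the files without a header by writing: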

    df = pd.read_csv(r_path, header=None)

but it doesn't seem to work. I'd like to upload some files as examples, but I don't know how. If someone knows how, I'd be happy to do so.

Answer 1

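You can preprocess your files to pad the rows that have fewer than the maximum number of columns. Reference: Python csv; get max length of all columns then lengthen all other columns to that length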
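
As a rough illustration of the padding idea, here is a minimal sketch that uses only the standard csv module. The helper name `pad_rows` and the in-memory sample data are assumptions made for this example; they do not come from the question or the linked reference.

```python
import csv
import io


def pad_rows(rows, fill=""):
    """Pad every row with `fill` so that all rows match the longest one."""
    rows = list(rows)
    # Width of the widest row; 0 if there are no rows at all
    width = max((len(r) for r in rows), default=0)
    return [r + [fill] * (width - len(r)) for r in rows]


# Hypothetical ragged data standing in for one of the real files
sample = io.StringIO("a,b,c\n1,2\n3,4,5,6\n")
padded = pad_rows(csv.reader(sample))
for row in padded:
    print(row)
```

Once every row has the same width, the padded rows can be written back with `csv.writer` and read by Pandas without losing columns.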

You can also use the sep parameter or, if that fails to read your CSV correctly, read the file as fixed width. See the answers to this question: Read CSV into a dataFrame with different row length using Pandas
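
One common way to read rows of different lengths with Pandas is to scan the file once for its widest row and then pass that many column names to `read_csv`. This is a sketch of that variant rather than the sep or fixed-width approaches mentioned above; the function name `read_ragged_csv` and the sample text are hypothetical.

```python
import csv
import io

import pandas as pd


def read_ragged_csv(text):
    """Read CSV text whose rows differ in length into a DataFrame."""
    # First pass: find the number of fields in the widest row
    width = max((len(row) for row in csv.reader(io.StringIO(text))), default=0)
    # Second pass: with explicit names, shorter rows are padded with NaN
    return pd.read_csv(io.StringIO(text), header=None, names=list(range(width)))


# Hypothetical ragged data standing in for one of the real files
sample = "1,2\n3,4,5,6\n7\n"
df = read_ragged_csv(sample)
last_row = df.iloc[-1]
print(last_row)
```

For a file on disk, the same two passes would open the path twice instead of wrapping a string in `io.StringIO`.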

Answer 2

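It looks like you actually have two problems:

1. Getting the complete list of all columns across all files
2. Reading the last row from each file and merging it into the correct columns

For this, the standard Python csv module makes more sense than Pandas.

I'll assume you have already identified the list of files you need and that it is in your files variable.

First, get all the headers: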

    import csv

    # Collect header names in first-seen order, skipping duplicates,
    # so the output columns come out in a stable order
    headers = []

    # Read the header from each file
    for file in files:
        with open(file, newline="") as f:
            reader = csv.reader(f)

            # The first line is the header; fall back to [] for an empty file
            header = next(reader, [])

            # Append only the names that have not been seen yet
            for name in header:
                if name not in headers:
                    headers.append(name)

    print("Headers:", headers)

Now read the last rows and write them to the result file.

Use DictReader and DictWriter, which give you dicts mapped to the headers.

    with open(r_path, "w", newline="") as f_out:
        # Columns missing from a row are filled with restval ("" by default).
        # extrasaction="ignore" drops keys that are not in fieldnames, such as
        # the overflow fields DictReader stores under the key None.
        writer = csv.DictWriter(f_out, fieldnames=headers, extrasaction="ignore")
        writer.writeheader()

        # Read the last line of each file
        for file in files:
            with open(file, newline="") as f_in:
                reader = csv.DictReader(f_in)

                # Iterate through every row, keeping only the last one
                row = None
                for row in reader:
                    pass

                # Write the last row into the result file,
                # skipping files that have no data rows
                if row is not None:
                    writer.writerow(row)
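
If the files turn out to have no header row at all, which the question's use of `names=` suggests but does not confirm, a header-free variant may fit better. The sketch below keeps only the last row of each file with `collections.deque` and pads the collected rows to a common width. The helper name `last_row` and the in-memory sources are assumptions for illustration.

```python
import csv
import io
from collections import deque


def last_row(f):
    """Return the last row of an open CSV file, or None if it is empty."""
    # A deque with maxlen=1 discards every row except the most recent one
    tail = deque(csv.reader(f), maxlen=1)
    return tail[0] if tail else None


# Hypothetical in-memory stand-ins for the real files (the third is empty)
sources = [io.StringIO("1,2,3\n4,5\n"), io.StringIO("a\nb,c,d,e\n"), io.StringIO("")]

# Keep the last row of each non-empty source
rows = [r for r in (last_row(f) for f in sources) if r is not None]

# Pad every collected row to the widest one before writing
width = max((len(r) for r in rows), default=0)
out = io.StringIO()
writer = csv.writer(out, lineterminator="\n")
writer.writerows(r + [""] * (width - len(r)) for r in rows)
print(out.getvalue())
```

With real files, each source would be opened with `open(name, newline="")` and `out` would be the result file.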
