How to skip certain rows in many CSV files with Python pandas and csv?

Question

I have put numerous CSV files in a folder and would like to first skip a certain row (e.g. the 10th row), and then take one row out of every five.
I can do the first step, but I have no idea how to do the second one.

Thanks.

import pandas as pd
import csv, os


# Loop through every file in the 'path' directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Now let's read the dataframe
    # total row number
    total_line = len(open('path' + csvFilename).readlines())
    # put the first and last to a list
    line_list = [total_line] + [1]
    df = pd.read_csv('path' + csvFilename, skiprows=line_list)
    new_file_name = csvFilename

    # And output
    df.to_csv('path' + new_file_name, index=False)

The correct code is as follows.

import os

import numpy as np
import pandas as pd

# Loop through every file in the 'path' directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Count the lines in the file (including the header line).
    with open(os.path.join('path', csvFilename)) as f:
        total_line = len(f.readlines())
    # Start by marking every line index for skipping.
    skip = np.arange(total_line)
    # Keep every fifth line (indices 0, 5, 10, ...); index 0 is the header.
    skip = np.delete(skip, np.arange(0, total_line, 5))
    # Also skip the particular line you want to drop, e.g. index 10
    # (counted from 0, with the header as line 0).
    skip = np.append(skip, 10)
    df = pd.read_csv(os.path.join('path', csvFilename), skiprows=skip)

    # Prefix the output name with '2' so the source file is not overwritten.
    new_file_name = '2' + csvFilename
    df.to_csv(os.path.join('path', new_file_name), index=False)

Answer 1

You can pass a function (any callable) to skiprows.
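
As a minimal sketch of how that works: pandas calls the function once per line index (counted from 0, so the header is line 0) and skips the line whenever it returns True. The in-memory data and the name `skip_line` below are made up for illustration only.

```python
import io

import pandas as pd

# Hypothetical sample data: a header line followed by the values 1..12,
# so the value n sits on line index n.
csv_text = "value\n" + "\n".join(str(i) for i in range(1, 13)) + "\n"


def skip_line(i):
    # Skip every line whose index is not a multiple of 5.
    # Line 0 (the header) is kept because 0 % 5 == 0.
    return i % 5 != 0


df = pd.read_csv(io.StringIO(csv_text), skiprows=skip_line)
print(df["value"].tolist())  # [5, 10]
```

Because the function only looks at the line index, there is no need to know the total number of lines in advance.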

I edited your code below:

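    import os

    import pandas as pd

    # Loop through every file in the 'path' directory.
    for csvFilename in os.listdir('path'):
        if not csvFilename.endswith('.csv'):
            continue
        # Count the lines in the file (including the header line).
        with open(os.path.join('path', csvFilename)) as f:
            total_line = len(f.readlines())

        # Line indices 1, 6, 11, ... are skipped (the last line is never included).
        rows_to_skip = set(range(total_line)[1:-1:5])
        # pandas calls the function with each line index and skips the line if it returns True.
        df = pd.read_csv(os.path.join('path', csvFilename), skiprows=lambda x: x in rows_to_skip)

        new_file_name = csvFilename
        # Write the result back under the same name (this overwrites the source file).
        df.to_csv(os.path.join('path', new_file_name), index=False)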
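
Note that the slice above skips one line in every five. If the goal is the opposite (keep one line in every five and also drop one particular line), a callable along the following lines might be closer to what the question asks. This is only a sketch: `read_sampled`, `step` and `drop_line` are hypothetical names, and line indices are assumed to be counted from 0 with the header as line 0, as in the asker's final code.

```python
import io

import pandas as pd


def read_sampled(source, step=5, drop_line=10):
    """Read a CSV, keeping the header and every `step`-th line except `drop_line`."""

    def should_skip(i):
        # Skip the one unwanted line, plus every line that is not a multiple of `step`.
        return i == drop_line or i % step != 0

    return pd.read_csv(source, skiprows=should_skip)


# Hypothetical sample data: a header line followed by the values 1..20,
# so the value n sits on line index n.
csv_text = "value\n" + "\n".join(str(i) for i in range(1, 21)) + "\n"

df = read_sampled(io.StringIO(csv_text))
print(df["value"].tolist())  # [5, 15, 20]
```

Inside the loop, `read_sampled(os.path.join('path', csvFilename))` would take the place of the `pd.read_csv` call, and the line count is no longer needed.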

Solution 1
1 2020-04-29 10:01:44