How to skip certain rows of numerous CSV files with Python pandas and csv?
I have put numerous CSV files in a folder. I would like to first skip a certain row (e.g. the 10th row) and then take one row out of every five lines.
I can do the first step, but I have no idea how to do the second one.
Thanks.
import pandas as pd
import csv, os

# Loop through every file in the current working directory.
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    # Now let's read the dataframe
    # total row number
    total_line = len(open('path' + csvFilename).readlines())
    # put the first and last to a list
    line_list = [total_line] + [1]
    df = pd.read_csv('path' + csvFilename, skiprows=line_list)
    new_file_name = csvFilename
    # And output
    df.to_csv('path' + new_file_name, index=False)
The correct code is shown below.
import os
import numpy as np
import pandas as pd

# Loop through every file in the target directory ('path' is a placeholder).
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    file_path = os.path.join('path', csvFilename)
    # Count the total number of lines in the file.
    with open(file_path) as f:
        total_line = len(f.readlines())
    # Start with every line index, then remove every fifth index (0, 5, 10, ...)
    # from the skip list so that those lines are kept.
    skip = np.arange(total_line)
    skip = np.delete(skip, np.arange(0, total_line, 5))
    # Also skip the specific row you do not want, e.g. 10.
    skip = np.append(skip, 10)
    df = pd.read_csv(file_path, skiprows=skip)
    new_file_name = '2' + csvFilename
    # Write the result to a new file.
    df.to_csv(os.path.join('path', new_file_name), index=False)
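To check which lines this approach actually keeps, the following is a minimal sketch that runs the same index arithmetic on a small in-memory CSV instead of files on disk. The helper name rows_to_skip, its parameters, and the sample data are assumptions made for illustration; they are not part of the original answer.

```python
import io

import numpy as np
import pandas as pd


def rows_to_skip(total_lines, step=5, extra=(10,)):
    """Return the line indices to skip: every line that is not a
    multiple of `step`, plus the indices listed in `extra`."""
    all_lines = np.arange(total_lines)
    keep = np.arange(0, total_lines, step)      # 0, 5, 10, ...
    skip = np.delete(all_lines, keep)           # everything else
    return np.union1d(skip, np.asarray(extra, dtype=int)).tolist()


# Hypothetical sample: line 0 is the header, lines 1..20 hold the values 1..20.
text = "value\n" + "\n".join(str(i) for i in range(1, 21)) + "\n"
total = len(text.splitlines())

df = pd.read_csv(io.StringIO(text), skiprows=rows_to_skip(total))
kept_values = df["value"].tolist()
print(kept_values)
```

Line 0 stays in the kept set, so the header survives. Line 10 is a multiple of five and would normally be kept, which is why it has to be added to the skip list explicitly.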
You can pass a function to skiprows.
I edited your code below:
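import os
import pandas as pd

# Loop through every file in the target directory ('path' is a placeholder).
for csvFilename in os.listdir('path'):
    if not csvFilename.endswith('.csv'):
        continue
    file_path = os.path.join('path', csvFilename)
    # Count the total number of lines in the file.
    with open(file_path) as f:
        total_line = len(f.readlines())
    # Line indices 1, 6, 11, ... (every fifth line from index 1, excluding the last line).
    rows_to_skip = set(range(total_line)[1:-1:5])
    # pandas calls the function with each line index; returning True skips that line.
    df = pd.read_csv(file_path, skiprows=lambda x: x in rows_to_skip)
    new_file_name = csvFilename
    # Write the result back under the same file name.
    df.to_csv(os.path.join('path', new_file_name), index=False)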
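A function passed to skiprows can also express the rule from the question directly (keep one line in five, but drop line 10) without counting the lines first. The sketch below is one possible way to write it; the name should_skip, its default arguments, and the in-memory sample are hypothetical and only serve as an illustration.

```python
import io

import pandas as pd


def should_skip(line_index, step=5, excluded=frozenset({10})):
    """Return True for lines pandas should skip: any line listed in
    `excluded`, and any line whose index is not a multiple of `step`."""
    return line_index in excluded or line_index % step != 0


# Hypothetical sample: line 0 is the header, lines 1..20 hold the values 1..20.
sample = "value\n" + "\n".join(str(i) for i in range(1, 21)) + "\n"

result = pd.read_csv(io.StringIO(sample), skiprows=should_skip)
remaining = result["value"].tolist()
print(remaining)
```

Because the function only looks at the line index, the file does not need to be read twice, which may matter when there are many large files.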