Python: parsing multiple csv files and skipping files without a keyword

Question

I am trying to read some .csv field data in Python for post-processing. I typically just use something like:

for flist in glob('*.csv'):
    df = pd.read_csv(flist, delimiter = ',')

However, I need to filter out the bad files, which contain "Run_Terminated" somewhere in the file, and skip each of them entirely. I'm still new to Python, so I'm not familiar with all of its functionality; any input would be appreciated. Thank you.

Answer 1

What you could do is first read the file fully into memory (using an io.StringIO file-like object) and look for the Run_Terminated string anywhere in the file (dirty, but should be OK).

Then pass the handle to read_csv (since you can pass a handle OR a filename) so you don't have to read it again from the file.

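import io
from glob import glob  # import the function; the bare module is not callable

import pandas as pd

for flist in glob('*.csv'):
    # copy the whole file into an in-memory text buffer
    with open(flist) as f:
        data = io.StringIO()
        data.write(f.read())
    # only parse files that do not contain the marker string
    if "Run_Terminated" not in data.getvalue():
        data.seek(0)  # rewind or it won't read anything
        df = pd.read_csv(data, delimiter = ',')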
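If the files are large, a variation on the same idea is to scan each file line by line for the marker and only hand the clean file names to pandas afterwards. The following is a minimal sketch, not part of the original answer: the helper names contains_keyword and good_csv_files are made up for illustration, and it assumes the marker never spans a line break and that the files are readable as UTF-8 text.

```python
from glob import glob


def contains_keyword(path, keyword="Run_Terminated"):
    """Return True if any line of the file at `path` contains `keyword`."""
    # errors="replace" keeps the scan going if a file has stray bytes
    with open(path, encoding="utf-8", errors="replace") as f:
        # any() stops reading at the first matching line
        return any(keyword in line for line in f)


def good_csv_files(pattern="*.csv", keyword="Run_Terminated"):
    """Return the sorted paths matching `pattern` that do not contain `keyword`."""
    return [p for p in sorted(glob(pattern)) if not contains_keyword(p, keyword)]
```

Each path returned by good_csv_files() can then be passed to pd.read_csv as in the loop above. The trade-off is that a clean file is read twice (once for the scan, once by pandas), in exchange for never holding a bad file fully in memory.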

2 Accepted 2017-02-06 13:56:51
