Delete CSV files if they are missing a specific column using Python

Question

Currently my code looks at the CSV files in a folder and replaces strings if a file has the column 'PROD_NAME' in its data. If a file doesn't have the column 'PROD_NAME', I'm trying to delete it from the folder. With a little debugging I can get my code to print which CSV files do not have the column, but I can't figure out how to actually delete them from the folder they are in. I have tried an if statement that calls os.remove() and still nothing happens. No errors or anything; the script just finishes with all the files still in the folder. Here is my code. Any help is appreciated. Thanks!

import glob
import os
import time
from pathlib import Path

import pandas as pd

def worker():
    filenames = glob.glob(dest_dir + '\\*.csv')
    print("Finding all files with column PROD_NAME")
    time.sleep(3)
    print("Changing names of products in these tables...")
    for filename in filenames:
        
        my_file = Path(os.path.join(dest_dir, filename))
        
        try:
            with open(filename):
            # read data
                df1 = pd.read_csv(filename, skiprows=1, encoding='ISO-8859-1') # read column header only - to get the list of columns
                dtypes = {}
                for col in df1.columns:# make all columns text, to avoid formatting errors
                    dtypes[col] = 'str'
                df1 = pd.read_csv(filename, dtype=dtypes, skiprows=1, encoding='ISO-8859-1')

                if 'PROD_NAME' not in df1.columns:
                    os.remove(filename)
                    
                #Replaces text in files
                if 'PROD_NAME' in df1.columns: 
                    df1 = df1.replace("NABVCI", "CLEAR_BV")
                    df1 = df1.replace("NAMVCI", "CLEAR_MV")
                    df1 = df1.replace("NA_NRF", "FA_GUAR")
                    df1 = df1.replace("N_FPFA", "FA_FLEX")
                    df1 = df1.replace("NAMRFT", "FA_SECURE_MVA")
                    df1 = df1.replace("NA_RFT", "FA_SECURE")
                    df1 = df1.replace("NSPFA7", "FA_PREFERRED")
                    df1 = df1.replace("N_ENHA", "FA_ENHANCE")
                    df1 = df1.replace("N_FPRA", "FA_FLEX_RETIRE")
                    df1 = df1.replace("N_SELF", "FA_SELECT")
                    df1 = df1.replace("N_SFAA", "FA_ADVANTAGE")
                    df1 = df1.replace("N_SPD1", "FA_SPD1")
                    df1 = df1.replace("N_SPD2", "FA_SPD2")
                    df1 = df1.replace("N_SPFA", "FA_LIFESTAGES")
                    df1 = df1.replace("N_SPPF", "FA_PLUS")
                    df1 = df1.replace("N__CFA", "FA_CHOICE")
                    df1 = df1.replace("N__OFA", "FA_OPTIMAL")
                    df1 = df1.replace("N_SCNI", "FA_SCNI")
                    df1 = df1.replace("NASCI_", "FA_SCI")
                    df1 = df1.replace("NASSCA", "FA_SSC")
                    df1.to_csv(filename, index=False, quotechar="'")            
                
        except:
            if 'PROD_NAME' in df1.columns:
                print("Could not find string to replace in this file: " + filename)
                    
worker()

Answer 1

The block of code below reads the raw CSV data. It extracts the first row (containing the column names) and looks for the column name PROD_NAME. If it finds it, it sets found to True; otherwise, it sets found to False. To avoid trying to delete a file while it is still open, the removal is done outside of the open() block.


import os

filename = "test.csv"

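with open(filename) as f:
    # Read only the header line; strip the trailing newline so that
    # PROD_NAME is still matched when it is the last column
    header = f.readline().strip().split(",")
    if "PROD_NAME" in header:
        print("found")
        found = True
    else:
        print("not found")
        found = False

# The file is closed at this point, so it can be deleted safely
if not found:
    os.remove(filename)
else:
    pass  # Carry out replacements here/load it in pandas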
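
The same idea can be extended from a single file to a whole folder, which is what the question asks for. The following is only a minimal sketch under some assumptions: the helper names has_column and remove_files_missing_column and the header_row parameter are hypothetical (they are not part of the original answer), the standard csv module is used here instead of pandas to inspect the header, and a file too short to contain a header row is treated as missing the column. A likely reason the original script appeared to do nothing is that os.remove() was called while the file was still open and the bare except hid the resulting error, so the sketch checks each file first and deletes it only after it has been closed.

```python
import csv
import glob
import os


def has_column(path, column, header_row=0, encoding="ISO-8859-1"):
    """Return True if `column` is listed in the header row of the CSV at `path`."""
    with open(path, newline="", encoding=encoding) as f:
        for index, row in enumerate(csv.reader(f)):
            if index == header_row:
                return column in row
    # Assumption: a file with no header row counts as missing the column.
    return False


def remove_files_missing_column(folder, column, header_row=0):
    """Delete every *.csv in `folder` whose header lacks `column`.

    Returns the list of paths that were deleted.
    """
    removed = []
    for path in glob.glob(os.path.join(folder, "*.csv")):
        # has_column() has closed the file by the time it returns,
        # so the delete below is not blocked by an open handle.
        if not has_column(path, column, header_row):
            os.remove(path)
            removed.append(path)
    return removed
```

For files like those in the question, where the column names sit on the second line (the original code passes skiprows=1), the call would presumably be remove_files_missing_column(dest_dir, "PROD_NAME", header_row=1). The files that remain can then be loaded into pandas for the replacements.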

Answer 1: accepted, 2022-09-03 01:02:05
