
Python / Pandas - Looping through a folder and appending files to a data frame if they are in a second data frame

I am having a hard time doing something that seems fairly simple:

Part I: Creating an empty data frame to store my data.

Part II: I am using Python to iterate through a folder and look for macro-enabled Excel files.

Part III: This is where I had a hard time. Ideally, I want to check whether 'i' is in the "File_Name" column of the data frame "file_df" and, if it is, append it to the FileList data frame. Note: this column in the file_df data frame is just a list of the files I actually want to use from the folder.

   import pandas as pd
   import glob 
   import os

   #Part I
   FileList = pd.DataFrame(index=file_df.index, columns=['File_Name'])

   # Part II
   os.chdir(path)
   for i in glob.glob('*.xlsm'): # gives list of files from the folder

       # Part III
       if file_df[file_df['File_Name'].str.contains(i)]:
           FileList.append(i)

I would probably do something like this:

import pandas as pd
import os

str1 = "i"  # substring to look for in each file name (placeholder value)
fileList = []
for subdir, dirs, files in os.walk(path):  # walk the folder and its subfolders
    for file in files:
        if file.endswith(".xlsm"):  # keep only macro-enabled Excel files
            a = os.path.join(subdir, file)  # full path to the matching file

            filename = os.path.basename(a)  # file name with extension
            if str1 in filename:
                fileList.append(filename)
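
The loop above filters on a fixed substring. If the goal is instead to keep only the files that are listed in file_df['File_Name'], as the question describes, an exact-name membership test may be a better fit than a substring or str.contains check. The following is a minimal sketch under that assumption, not a definitive implementation; the helper name matching_files and its parameters folder, wanted_names and pattern are hypothetical and do not come from the original post.

```python
import glob
import os


def matching_files(folder, wanted_names, pattern="*.xlsm"):
    """Return the names of files in `folder` that also appear in `wanted_names`.

    `matching_files`, `folder`, `wanted_names` and `pattern` are illustrative
    names, not part of the original question or answer.
    """
    # File names (without directories) of everything matching the glob pattern.
    found = [os.path.basename(p) for p in glob.glob(os.path.join(folder, pattern))]
    # A set gives exact-name lookups; a substring or regex match could also
    # accept names that merely contain the wanted text.
    wanted = set(wanted_names)
    # Sort for a stable order, since glob does not guarantee one.
    return [name for name in sorted(found) if name in wanted]
```

Assuming file_df and path are defined as in the question, the result could be turned into the FileList data frame in one step with pd.DataFrame({'File_Name': matching_files(path, file_df['File_Name'])}). Building the frame once from a list avoids appending to a data frame inside the loop.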
