
Add a file modification timestamp column while using glob

I have multiple files in a folder that get modified by users at different times. Every week I consolidate them all into one master file, but I need to keep track of when each file was last modified. This is a manual process which I am trying to automate.

I wrote the glob code, but I can't seem to add a column that carries the modification time of each file into the master file:

all_data = pd.DataFrame()
for f in glob.glob("..\Python_Practice\Book*.xlsx"):
    df = pd.read_excel(f)
    all_data = all_data.append(df, ignore_index=True)
all_data.head()


all_data[time] = time.strftime('%m%d%H%M', os.path.gmtime('file')

It doesn't really work, and I can't find anything on the forums that does something similar.

You are close, but you need to loop through your files and collect the os.path.getmtime value of each one in a list. You can then pass these to the index.

The following will:

  • Find all .xlsx files
  • Read them into one list of dataframes
  • Get the last-modified Unix time of each file
  • Convert the Unix time into a datetime
  • Concat the dataframes into a single one and pass the datetime into the index.

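    import glob
    import os
    from datetime import datetime

    import pandas as pd

    allFiles = glob.glob('*.xlsx')
    dfs = [pd.read_excel(f) for f in allFiles]
    # Last-modified time of each file (local time), formatted as text
    keys = [datetime.fromtimestamp(os.path.getmtime(f)).strftime('%Y-%m-%d %H:%M:%S') for f in allFiles]
    # Each key becomes the outer index level for the rows of its file
    frame = pd.concat(dfs, keys=keys)
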
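
Since the question asks for a column rather than an index level, the keys can be moved out of the index after the concat. The following is only a sketch: the small in-memory dataframes and the hard-coded timestamps are stand-ins for the `dfs` and `keys` lists built from real files, and the level names `modified` and `row` are made up for illustration.

```python
import pandas as pd

# Stand-ins for the per-file dataframes and their formatted modification times.
dfs = [
    pd.DataFrame({"value": [1, 2]}),
    pd.DataFrame({"value": [3]}),
]
keys = ["2024-01-05 09:30:00", "2024-01-12 17:45:00"]

# names= labels the two index levels: the key and the original row number.
frame = pd.concat(dfs, keys=keys, names=["modified", "row"])

# Turn the outer level into an ordinary column, then discard the per-file row numbers.
master = frame.reset_index(level="modified").reset_index(drop=True)
```

Every row of `master` then carries the modification time of the file it came from in the `modified` column.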
I would try to get the timestamp at the moment each file is processed. Your code could become:

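    import glob
    import os
    import time

    import pandas as pd

    frames = []
    for f in glob.glob(r"..\Python_Practice\Book*.xlsx"):
        df = pd.read_excel(f)
        # Modification time of this file (UTC), formatted as MMDDHHMM
        df['time'] = time.strftime('%m%d%H%M', time.gmtime(os.path.getmtime(f)))
        frames.append(df)
    # DataFrame.append was removed in pandas 2.0, so concatenate once at the end
    all_data = pd.concat(frames, ignore_index=True)
    all_data.head()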
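
One detail worth checking is the time zone: `datetime.fromtimestamp` returns local time by default, while a `gmtime`-based conversion returns UTC, so the two approaches can produce different strings for the same file. Below is a minimal sketch of a helper that makes the choice explicit. The name `mtime_as_text` and its parameters are hypothetical, and the throwaway file with a hand-set modification time exists only to make the example reproducible.

```python
import os
import tempfile
from datetime import datetime, timezone


def mtime_as_text(path, fmt="%Y-%m-%d %H:%M:%S", utc=False):
    """Return the last-modified time of `path` as a formatted string."""
    seconds = os.path.getmtime(path)  # seconds since the Unix epoch
    tz = timezone.utc if utc else None  # None means the machine's local time zone
    return datetime.fromtimestamp(seconds, tz).strftime(fmt)


# Demonstration on an empty placeholder file whose modification time is set by hand.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "Book1.xlsx")
    open(path, "w").close()
    os.utime(path, (1_000_000_000, 1_000_000_000))  # 2001-09-09 01:46:40 UTC
    stamp_utc = mtime_as_text(path, utc=True)
    stamp_short = mtime_as_text(path, "%m%d%H%M", utc=True)
    stamp_local = mtime_as_text(path)  # depends on the machine's time zone
```

Inside the loop, `df['time'] = mtime_as_text(f)` would then replace the inline `strftime` call.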
    
