简体   繁体   中英

Fastest way to generate the absolute paths of all the files in a folder recursively in Python?

My requirement is to generate a list of the absolute path of all the files in a Windows folder recursively.

I have tried glob.glob and glob.iglob but both take around 1.5 hours to generate the list.

Now it wouldn't be a problem if it was a one-off situation. This program would run daily - and this is where the problem starts.

After the first time run, I would only need to get the list of the latest modified files (after the last run, using the timestamp of that run). This is so that even if the first run takes 1.5 hours, the daily run should take a few minutes at max. Now there is no way I can get the timestamp of each individual file without going through each of the files which means I would need to go through the entire folder regardless and check for the timestamp on each of those.

Can I optimize this in any way? Sure ideally, the data would be arranged by date but it's not the case.

I'm not sure if there are pure Python approaches that will be faster. You basically want to have a service running that keeps track of changes in real-time. In principle you could put the whole folder under version control (Git), and check the pending changes once per day.

The technical post webpages of this site follow the CC BY-SA 4.0 protocol. If you need to reprint, please indicate the site URL or the original address.Any question please contact:yoyou2525@163.com.

 
粤ICP备18138465号  © 2020-2024 STACKOOM.COM