Create unique identifier for files using Python

Question

I am looking for a robust way to define a unique identifier for measurement data files. I collect the data from different sources, mainly from network storage. The data files might be renamed and copied more than once to different locations. The method only needs to run on Windows. So far I do the following: create an ID from the last modification time and the size of the file. I assume that a file is created only once during the measurement process and is never modified afterwards. This is my current implementation:

import pathlib
import datetime

def file_uid(file):
    """Build an ID from the file's last-modification time and its size in bytes."""
    # Call stat() once so both values come from the same snapshot of the file.
    stat = pathlib.Path(file).stat()
    mod_time = datetime.datetime.fromtimestamp(stat.st_mtime).strftime("%d.%m.%Y %H:%M:%S")
    return f"{mod_time}_{stat.st_size}"

Can this idea work, or did I miss something in general? What would be the best practice for a robust solution to this problem? Or should I use a checksum algorithm, and if so, which one would you recommend?

Answer 1

I would recommend assigning each file a short UUID. You can use something such as shortuuid:

pip install shortuuid

and then just call

import shortuuid

shortuuid.ShortUUID().random(length=22)
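A random ID like this is not derived from the file, so it has to be recorded somewhere alongside it. If the ID should instead be recoverable from a renamed or copied file on its own, the checksum route mentioned in the question is one option. Below is a minimal sketch using SHA-256 from the standard library's hashlib; the helper name content_uid and the chunk size are assumptions made for illustration, not part of any library.

```python
import hashlib


def content_uid(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of the file's bytes (hypothetical helper)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so large measurement files need not fit in memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

The digest depends only on the file's bytes, so renaming or copying the file leaves it unchanged. The flip side is that two files with identical contents get the same ID, and the whole file has to be read to compute it.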
