Fastest way to store and save coordinate data in a loop

Question

I am getting face landmarks for each frame in a video. There are 477 landmarks, and each one is a (3,) vector.

I have a 10 minute video at 30 fps. This means that I have 18000 arrays of shape (477,3). I want to store all this info in a pandas dataframe where each row is a frame and has 477 columns, one for each (3,) array.

Currently, I am doing this:

import numpy as np
import pandas as pd

frame_lms = []
for frame in video:
    landmark_dict = {}
    lm_count = 0
    for landmark in frame:
        x = landmark.x
        y = landmark.y
        xy = np.array([x,y])
        # key on lm_count (the original used an undefined name, count)
        landmark_dict[f"lm_{lm_count}"] = xy
        lm_count+=1
    frame_lms.append(landmark_dict)
df = pd.DataFrame.from_dict(frame_lms)
df.to_csv('save.csv')

I got the idea to build one dict per frame, append each dict to a list, and save at the end from research showing that from_dict is the fastest way to create a pandas df. However, this process is still slow because I have to hold frame_lms in memory, and it gets huge as I append (477,3) arrays to it.

What is the most computationally efficient way to solve a problem like this?

Answer 1

It is better to avoid creating so many objects in a loop. You get the same results (headers omitted) using:

import numpy as np

# load video here

storage = np.empty((len(video), 477, 2))

for frame, s_line in zip(video, storage):
    for landmark, l_buf in zip(frame, s_line):
        l_buf[0] = landmark.x
        l_buf[1] = landmark.y

Now storage holds all the data you need in a single preallocated array of shape (len(video), 477, 2). Note that the code can be improved.
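If you still want the pandas dataframe from the question, one possible follow-up is to flatten the array once, after the loop, instead of building a dict per frame. The sketch below is only an illustration: the random array stands in for the filled storage, the small frame count is arbitrary, and the lm_{i}_x / lm_{i}_y column names are an assumption (a CSV cell cannot hold an array cleanly, so each coordinate gets its own column).

```python
import numpy as np
import pandas as pd

# Stand-in for the filled `storage` array; sizes here are illustrative only.
n_frames, n_landmarks = 4, 477
storage = np.random.rand(n_frames, n_landmarks, 2)

# Flatten each frame's (477, 2) block into one row of 954 values.
# C-order reshape keeps the order x0, y0, x1, y1, ...
flat = storage.reshape(n_frames, -1)

# Hypothetical column names, one per coordinate: lm_0_x, lm_0_y, lm_1_x, ...
columns = [f"lm_{i}_{axis}" for i in range(n_landmarks) for axis in ("x", "y")]

# A single DataFrame construction from one 2-D array, done once at the end.
df = pd.DataFrame(flat, columns=columns)
```

From here df.to_csv('save.csv') works as in the question. If the CSV is not strictly required, np.save on storage itself writes the raw array in binary form and skips text formatting entirely.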

Answer 1 (accepted), 2021-12-09 17:00:05