
3D point projection using multiple cameras

I am using Python with OpenCV 3.4.

I have a system of two cameras that I want to use to track an object and get its trajectory, and then its speed.

I am currently able to calibrate each of my cameras intrinsically and extrinsically. I can track my object through the video and get its 2D coordinates in the image plane.

My problem now is that I would like to project the points from both 2D image planes into 3D points. I've tried functions such as triangulatePoints, but it does not seem to work properly. Here is my current function to get the 3D coordinates. It returns coordinates that seem a little off compared to the actual coordinates.

def get_3d_coord(left_two_d_coords, right_two_d_coords):

    pt1 = left_two_d_coords.reshape((len(left_two_d_coords), 1, 2))
    pt2 = right_two_d_coords.reshape((len(right_two_d_coords), 1, 2))

    extrinsic_left_camera_matrix, left_distortion_coeffs, extrinsic_left_rotation_vector, \
        extrinsic_left_translation_vector = trajectory_utils.get_extrinsic_parameters(
            1)

    extrinsic_right_camera_matrix, right_distortion_coeffs, extrinsic_right_rotation_vector, \
        extrinsic_right_translation_vector = trajectory_utils.get_extrinsic_parameters(
            2)

    #returns arrays of the same size
    (pt1, pt2) = correspondingPoints(pt1, pt2)



    projection1 = computeProjMat(extrinsic_left_camera_matrix,
                                    extrinsic_left_rotation_vector, extrinsic_left_translation_vector)
    projection2 = computeProjMat(extrinsic_right_camera_matrix,
                                    extrinsic_right_rotation_vector, extrinsic_right_translation_vector)

    out = cv2.triangulatePoints(projection1, projection2, pt1, pt2)

    # triangulatePoints returns a 4xN array of homogeneous coordinates;
    # divide X, Y, Z by W to get one Euclidean [x, y, z] point per column.
    point3D = (out[:3] / out[3]).T.tolist()

    return point3D
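
The helper computeProjMat is not shown in the question. A common construction is P = K [R | t], where R is the rotation matrix obtained from the rotation vector (OpenCV provides cv2.Rodrigues for that conversion). The following is a minimal NumPy-only sketch of that construction together with a linear (DLT) triangulation, useful as an independent check of the matrices passed to triangulatePoints. The function names are illustrative, lens distortion is ignored (pixel coordinates are assumed to be already undistorted), and the intrinsics and extrinsics are the rounded values posted in the question.

```python
import numpy as np

def rodrigues(rvec):
    """Convert a rotation vector (axis * angle) to a 3x3 rotation matrix."""
    rvec = np.asarray(rvec, dtype=float).ravel()
    theta = np.linalg.norm(rvec)
    if theta < 1e-12:
        return np.eye(3)
    k = rvec / theta
    # Skew-symmetric cross-product matrix of the unit axis
    K = np.array([[0.0, -k[2], k[1]],
                  [k[2], 0.0, -k[0]],
                  [-k[1], k[0], 0.0]])
    return np.eye(3) + np.sin(theta) * K + (1.0 - np.cos(theta)) * (K @ K)

def projection_matrix(camera_matrix, rvec, tvec):
    """Build P = K [R | t] for a world-to-camera pose (rvec, tvec)."""
    R = rodrigues(rvec)
    t = np.asarray(tvec, dtype=float).reshape(3, 1)
    return camera_matrix @ np.hstack([R, t])

def project(P, point3d):
    """Project one 3D world point to pixel coordinates (no distortion)."""
    x = P @ np.append(np.asarray(point3d, dtype=float), 1.0)
    return x[:2] / x[2]

def triangulate_dlt(P1, P2, uv1, uv2):
    """Linear triangulation of one point from two views via SVD."""
    A = np.array([
        uv1[0] * P1[2] - P1[0],
        uv1[1] * P1[2] - P1[1],
        uv2[0] * P2[2] - P2[0],
        uv2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]                 # homogeneous solution (X, Y, Z, W)
    return X[:3] / X[3]        # divide by W to get Euclidean coordinates

# Rounded calibration values from the question
K1 = np.array([[546.2, 0.0, 638.2], [0.0, 519.6, 372.2], [0.0, 0.0, 1.0]])
K2 = np.array([[430.2, 0.0, 644.3], [0.0, 406.4, 373.1], [0.0, 0.0, 1.0]])
P1 = projection_matrix(K1, [1.6437, -0.0963, 0.1020], [-0.5004, 0.9359, 2.3177])
P2 = projection_matrix(K2, [1.6985, -0.0473, 0.0587], [-4.2258, 0.9520, 2.4192])

# Round trip: project a known world point into both views, then triangulate it back
world_point = np.array([1.0, 1.5, 0.0])
uv1 = project(P1, world_point)
uv2 = project(P2, world_point)
recovered = triangulate_dlt(P1, P2, uv1, uv2)
```

If the round trip recovers the known point but real tracked points do not triangulate sensibly, the projection-matrix construction is probably not the issue, and the calibration itself or the point correspondences would be the next thing to examine.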

Here are some screenshots of the 2D trajectory that I get from each camera: 2D trajectory from the first camera, 2D trajectory from the second camera

Here are some screenshots of the 3D trajectory that we get for the same cameras: 3D trajectory from the first camera, 3D trajectory from the second camera

As you can see, the 2D trajectory doesn't look like the 3D one, and I am not able to get an accurate distance between two points. I would just like to get real-world coordinates, that is, to know the (almost) exact distance walked by a person, even along a curved road.
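
Once the 3D points are reliable, the distance along a curved path can be approximated by summing the straight-line lengths between consecutive points, and an average speed follows from the frame rate. The sketch below assumes one 3D point per video frame; the function names, the sample path, and the fps value are hypothetical, and distances come out in the same units as the calibration object points.

```python
import numpy as np

def path_length(points3d):
    """Sum of Euclidean distances between consecutive 3D points."""
    pts = np.asarray(points3d, dtype=float)
    segments = np.diff(pts, axis=0)          # vectors between neighbours
    return float(np.linalg.norm(segments, axis=1).sum())

def average_speed(points3d, fps):
    """Path length divided by elapsed time, assuming one point per frame."""
    pts = np.asarray(points3d, dtype=float)
    duration = (len(pts) - 1) / fps          # seconds between first and last point
    return path_length(pts) / duration

# Hypothetical L-shaped path: 3 units along x, then 4 units along y
path = [[0, 0, 0], [3, 0, 0], [3, 4, 0]]
total = path_length(path)
speed = average_speed(path, fps=2.0)
```

Because every segment is a chord of the true curve, this estimate undershoots on sharp turns when points are sparse, and noisy points inflate it, so some smoothing of the trajectory may be worth considering before measuring.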

EDIT to add reference data and examples

Here are an example and some input data to reproduce the problem. First, the 2D points for camera1:

546,357 
646,351 
767,357 
879,353 
986,360 
1079,365
1152,364

Corresponding 2D points for camera2

236,305
313,302
414,308
532,308
647,314
752,320
851,323

3D points that we get from triangulatePoints

"[0.15245444, 0.30141047, 0.5444277]"
"[0.33479974, 0.6477136, 0.25396818]"
"[0.6559921, 1.0416716, -0.2717265]"
"[1.1381898, 1.5703914, -0.87318224]"
"[1.7568599, 1.9649554, -1.5008119]"
"[2.406788, 2.302272, -2.0778883]"
"[3.078426, 2.6655817, -2.6113863]"

In the following images, we can see the 2D trajectory (top line) and the 3D points reprojected into 2D (bottom line). Colors alternate to show which 3D point corresponds to which 2D point.

camera1 camera2
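
The visual comparison above can also be expressed as a number: the per-point reprojection error in pixels. The following is a minimal sketch, assuming a 3x4 projection matrix P for the camera and ignoring lens distortion. The function name and the sample camera and points are hypothetical.

```python
import numpy as np

def reprojection_errors(P, points3d, observed_uv):
    """Pixel distance between projected 3D points and observed 2D points."""
    pts = np.asarray(points3d, dtype=float)
    homog = np.hstack([pts, np.ones((len(pts), 1))])   # N x 4 homogeneous points
    proj = (P @ homog.T).T                             # N x 3 homogeneous pixels
    uv = proj[:, :2] / proj[:, 2:3]                    # divide by the third coordinate
    return np.linalg.norm(uv - np.asarray(observed_uv, dtype=float), axis=1)

# Hypothetical camera at the world origin looking down +Z
K = np.array([[500.0, 0.0, 640.0], [0.0, 500.0, 360.0], [0.0, 0.0, 1.0]])
P = K @ np.hstack([np.eye(3), np.zeros((3, 1))])

points = [[1.0, 0.0, 2.0], [1.0, 0.0, 2.0]]
observed = [[890.0, 360.0], [893.0, 364.0]]   # exact hit, then a 3-4-5 pixel offset
errors = reprojection_errors(P, points, observed)
```

A large error in only one of the two cameras would point at that camera's calibration rather than at the triangulation step.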

And finally, here is some data to reproduce the problem.

camera 1 : camera matrix

5.462001610064596662e+02 0.000000000000000000e+00 6.382260289544193483e+02
0.000000000000000000e+00 5.195528638702176067e+02 3.722480290221320161e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

camera 2 : camera matrix

4.302353276501239066e+02 0.000000000000000000e+00 6.442674231451971991e+02
0.000000000000000000e+00 4.064124751062329324e+02 3.730721752718034736e+02
0.000000000000000000e+00 0.000000000000000000e+00 1.000000000000000000e+00

camera 1 : distortion vector

-1.039009381799949928e-02 -6.875769941694849507e-02 5.573643708806085006e-02 -7.298826373638074051e-04 2.195279856716004369e-02

camera 2 : distortion vector

-8.089289768586239993e-02 6.376634681503455396e-04 2.803641672679824115e-02 7.852965318823987989e-03 1.390248981867302919e-03

camera 1 : rotation vector

1.643658457134109296e+00
-9.626823326237364531e-02
1.019865700311696488e-01

camera 2 : rotation vector

1.698451227150894471e+00
-4.734769748661146055e-02
5.868343803315514279e-02

camera 1 : translation vector

-5.004031689969588026e-01
9.358682517577661120e-01
2.317689087311113116e+00

camera 2 : translation vector

-4.225788801112133619e+00
9.519952012307866251e-01
2.419197507326224184e+00

camera 1 : object points

0 0 0   
0 3 0   
0.5 0 0 
0.5 3 0 
1 0 0   
1 3 0   
1.5 0 0 
1.5 3 0 
2 0 0   
2 3 0  

camera 2 : object points

4 0 0   
4 3 0   
4.5 0 0 
4.5 3 0 
5 0 0   
5 3 0   
5.5 0 0 
5.5 3 0 
6 0 0   
6 3 0  

camera 1 : image points

5.180000000000000000e+02 5.920000000000000000e+02
5.480000000000000000e+02 4.410000000000000000e+02
6.360000000000000000e+02 5.910000000000000000e+02
6.020000000000000000e+02 4.420000000000000000e+02
7.520000000000000000e+02 5.860000000000000000e+02
6.500000000000000000e+02 4.430000000000000000e+02
8.620000000000000000e+02 5.770000000000000000e+02
7.000000000000000000e+02 4.430000000000000000e+02
9.600000000000000000e+02 5.670000000000000000e+02
7.460000000000000000e+02 4.430000000000000000e+02

camera 2 : image points

6.080000000000000000e+02 5.210000000000000000e+02
6.080000000000000000e+02 4.130000000000000000e+02
7.020000000000000000e+02 5.250000000000000000e+02
6.560000000000000000e+02 4.140000000000000000e+02
7.650000000000000000e+02 5.210000000000000000e+02
6.840000000000000000e+02 4.150000000000000000e+02
8.400000000000000000e+02 5.190000000000000000e+02
7.260000000000000000e+02 4.160000000000000000e+02
9.120000000000000000e+02 5.140000000000000000e+02
7.600000000000000000e+02 4.170000000000000000e+02

Assuming both of your cameras have a resolution of 1280x720, I calculated the left camera's rotation and translation.

import cv2
import numpy as np

left_obj = np.array([[
        [0, 0, 0],   
        [0, 3, 0],   
        [0.5, 0, 0], 
        [0.5, 3, 0], 
        [1, 0, 0],  
        [1 ,3, 0], 
        [1.5, 0, 0], 
        [1.5, 3, 0], 
        [2, 0, 0],   
        [2, 3, 0] 
    ]], dtype=np.float32)

left_img = np.array([[
        [5.180000000000000000e+02, 5.920000000000000000e+02],
        [5.480000000000000000e+02, 4.410000000000000000e+02],
        [6.360000000000000000e+02, 5.910000000000000000e+02],
        [6.020000000000000000e+02, 4.420000000000000000e+02],
        [7.520000000000000000e+02, 5.860000000000000000e+02],
        [6.500000000000000000e+02, 4.430000000000000000e+02],
        [8.620000000000000000e+02, 5.770000000000000000e+02],
        [7.000000000000000000e+02, 4.430000000000000000e+02],
        [9.600000000000000000e+02, 5.670000000000000000e+02],
        [7.460000000000000000e+02, 4.430000000000000000e+02]
    ]], dtype=np.float32)

left_camera_matrix = np.array([
        [4.777926320579549042e+02, 0.000000000000000000e+00, 5.609694925007885331e+02],
        [0.000000000000000000e+00, 2.687583555325996372e+02, 5.712247987054799978e+02],
        [0.000000000000000000e+00, 0.000000000000000000e+00, 1.000000000000000000e+00]
    ])


left_distortion_coeffs = np.array([
        -8.332059138465927606e-02,
        -1.402986394998156472e+00,
        2.843132503678651168e-02, 
        7.633417606366312003e-02, 
        1.191317644548635979e+00
    ])

ret, left_camera_matrix, left_distortion_coeffs, rot, trans = cv2.calibrateCamera(left_obj, left_img, (1280, 720),
            left_camera_matrix, left_distortion_coeffs, None, None, cv2.CALIB_USE_INTRINSIC_GUESS)
print(rot[0])
print(trans[0])

I got different results:

[[ 2.7262137 ] [-0.19060341] [-0.30345874]]

[[-0.48068581] [ 0.75257108] [ 1.80413094]]

The same for right camera:

[[ 2.1952522 ] [ 0.20281459] [-0.46649734]]

[[-2.96484428] [-0.0906817 ] [ 3.84203022]]

You can check the rotations approximately this way: calculate the relative rotation between the computed results and compare it against the relative rotation between the real camera positions. For the translations, calculate the normalized relative translation vector between the computed results and compare it against the normalized relative translation between the real camera positions. The coordinate system that OpenCV uses is depicted here.
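
The check described above can be sketched as follows, assuming each camera pose is given as a world-to-camera rotation matrix R and translation t (a rotation vector can be converted to a matrix with cv2.Rodrigues). The function names and the sample poses are hypothetical.

```python
import numpy as np

def rotation_angle_deg(R):
    """Angle of a rotation matrix in degrees, from its trace."""
    cos_theta = (np.trace(R) - 1.0) / 2.0
    return float(np.degrees(np.arccos(np.clip(cos_theta, -1.0, 1.0))))

def relative_pose(R1, t1, R2, t2):
    """Pose of camera 2 relative to camera 1.

    With x1 = R1 X + t1 and x2 = R2 X + t2, eliminating X gives
    x2 = R_rel x1 + t_rel, where R_rel = R2 R1^T and t_rel = t2 - R_rel t1.
    Returns R_rel and the unit direction of t_rel.
    """
    t1 = np.asarray(t1, dtype=float).reshape(3)
    t2 = np.asarray(t2, dtype=float).reshape(3)
    R_rel = R2 @ R1.T
    t_rel = t2 - R_rel @ t1
    norm = np.linalg.norm(t_rel)
    direction = t_rel / norm if norm > 0 else t_rel
    return R_rel, direction

# Hypothetical poses: camera 1 at the world origin, camera 2 rotated 90 degrees about Z
R1 = np.eye(3)
t1 = np.zeros(3)
R2 = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
t2 = np.array([2.0, 0.0, 0.0])

R_rel, direction = relative_pose(R1, t1, R2, t2)
angle = rotation_angle_deg(R_rel)
```

Comparing the angle and the unit direction with what was physically measured between the two cameras gives a quick sanity check on the calibration, without needing the absolute scale.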
