The MannequinChallenge data can be downloaded here: MannequinChallenge.tar (40MB)
The data consists of a set of .txt files, one for each video clip, specifying timestamps and poses for frames in that clip.
This data is licensed by Google LLC under a Creative Commons Attribution 4.0 International License.
The data is split into train, validation and test subdirectories, each containing one .txt file per video clip. The format of each .txt file is as follows:
where each frame line has the following 19 columns:
The camera intrinsics can be organized into a 3x3 matrix K and the camera pose parameters into a 3x4 matrix P = [ R | t ], such that the matrix KP maps a (homogeneous) 3D point p in a world coordinate frame to a (homogeneous) 2D point in the image.
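The projection described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the helper names (`intrinsics_matrix`, `matmul`, `project`) and the numeric values are made up for the example and are not part of the dataset or any accompanying tooling.

```python
def intrinsics_matrix(focal_length_x, focal_length_y,
                      principal_point_x, principal_point_y):
    """Arrange the four intrinsic parameters into a 3x3 matrix K."""
    return [[focal_length_x, 0.0, principal_point_x],
            [0.0, focal_length_y, principal_point_y],
            [0.0, 0.0, 1.0]]


def matmul(a, b):
    """Multiply two matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def project(K, P, point_world):
    """Map a 3D world point to a 2D image point using the matrix KP."""
    # Homogeneous 3D point as a 4x1 column vector.
    p = [[point_world[0]], [point_world[1]], [point_world[2]], [1.0]]
    KP = matmul(K, P)                      # 3x4 projection matrix
    x, y, w = (row[0] for row in matmul(KP, p))
    return (x / w, y / w)                  # divide out the homogeneous scale


# Hypothetical values chosen for illustration only.
K = intrinsics_matrix(0.5, 0.5, 0.5, 0.5)
P = [[1.0, 0.0, 0.0, 0.0],                 # P = [ R | t ] with R = I, t = 0
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0]]
uv = project(K, P, (0.0, 0.0, 2.0))
```

With an identity rotation and zero translation, a point on the camera's z axis lands on the principal point, which is a quick sanity check for the matrix layout.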
The camera intrinsics are expressed in resolution-independent normalized image coordinates, where the top left corner of the image is (0,0), and the bottom right corner of the image is (1,1). This allows for the intrinsic parameters to be applied to frames at whatever resolution they are represented on disk (or resized to prior to training), by scaling them according to the image size in pixels. For an image of resolution width x height pixels, the intrinsics matrix at the actual scale of the image is
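$$
\begin{bmatrix}
\text{width} \cdot \text{focal\_length\_x} & 0 & \text{width} \cdot \text{principal\_point\_x} \\
0 & \text{height} \cdot \text{focal\_length\_y} & \text{height} \cdot \text{principal\_point\_y} \\
0 & 0 & 1
\end{bmatrix}
$$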
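As a minimal sketch of this scaling, the function below builds the pixel-scale intrinsics matrix from the normalized parameters and an image size. The function name `pixel_intrinsics` and the example values are assumptions made for illustration; they do not come from the dataset.

```python
def pixel_intrinsics(focal_length_x, focal_length_y,
                     principal_point_x, principal_point_y,
                     width, height):
    """Scale normalized intrinsics to an image of width x height pixels."""
    return [[width * focal_length_x, 0.0, width * principal_point_x],
            [0.0, height * focal_length_y, height * principal_point_y],
            [0.0, 0.0, 1.0]]


# Hypothetical normalized intrinsics applied to a 640x480 frame.
K_px = pixel_intrinsics(0.5, 0.75, 0.5, 0.5, width=640, height=480)
```

Because the stored parameters are resolution independent, the same call with a different `width` and `height` gives the matrix for a resized copy of the frame.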