
Download (v1.0)

The MannequinChallenge data can be downloaded here: MannequinChallenge.tar (40MB)

The data consists of a set of .txt files, one for each video clip, specifying timestamps and poses for frames in that clip.

This data is licensed by Google LLC under a Creative Commons Attribution 4.0 International License.

Dataset Design

The data is split into train, validation and test subdirectories, each with a set of .txt files, one .txt file for each video clip. The format of each .txt file is as follows:

<Video URL>

followed by one line per frame, where each frame line has the following 19 columns:

   1. timestamp
      int: microseconds since start of video

 2-7. camera intrinsics
      float: focal_length_x, focal_length_y,
      principal_point_x, principal_point_y,
      two radial distortion coefficients (always 0.0)

8-19. camera pose
      floats forming 3x4 matrix in row-major order
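
As an illustration of the column layout above, here is a minimal parsing sketch. It assumes the columns are separated by whitespace and that the first non-empty line of the file is the video URL; the names `Frame`, `parse_frame_line` and `parse_clip` are hypothetical and not part of the dataset.

```python
from typing import List, NamedTuple, Tuple


class Frame(NamedTuple):
    timestamp_us: int          # column 1: microseconds since start of video
    intrinsics: List[float]    # columns 2-7: fx, fy, cx, cy, two distortion terms
    pose: List[List[float]]    # columns 8-19: 3x4 matrix, row-major


def parse_frame_line(line: str) -> Frame:
    # Assumption: columns are whitespace-separated.
    cols = line.split()
    if len(cols) != 19:
        raise ValueError(f"expected 19 columns, got {len(cols)}")
    timestamp_us = int(cols[0])
    intrinsics = [float(c) for c in cols[1:7]]
    flat = [float(c) for c in cols[7:19]]
    # Rebuild the 3x4 pose matrix from its row-major flattening.
    pose = [flat[r * 4:(r + 1) * 4] for r in range(3)]
    return Frame(timestamp_us, intrinsics, pose)


def parse_clip(text: str) -> Tuple[str, List[Frame]]:
    # Assumption: first non-empty line is the video URL, the rest are frames.
    lines = [ln for ln in text.splitlines() if ln.strip()]
    url = lines[0].strip()
    frames = [parse_frame_line(ln) for ln in lines[1:]]
    return url, frames
```

Calling `parse_clip(open(path).read())` on one of the .txt files would then give the URL and a list of frames, provided the whitespace assumption holds for the files as distributed.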

The camera intrinsics can be organized into a 3x3 matrix K and the camera pose parameters into a 3x4 matrix P = [ R | t ], such that the matrix KP maps a (homogeneous) 3D point p in a world coordinate frame to a (homogeneous) 2D point in the image.
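
The mapping KP can be sketched in a few lines of plain Python. This is only an illustration under the definitions above: `project` is a hypothetical helper, the distortion coefficients are ignored (they are always 0.0), and the result is in the same normalized image coordinates as the intrinsics.

```python
from typing import List, Sequence, Tuple


def project(intrinsics: Sequence[float],
            pose: List[List[float]],
            point: Sequence[float]) -> Tuple[float, float]:
    """Map a 3D world point to normalized image coordinates via K P."""
    fx, fy, cx, cy = intrinsics[:4]
    X, Y, Z = point
    # P = [R | t] applied to the homogeneous point (X, Y, Z, 1)
    # gives the point in the camera coordinate frame.
    xc, yc, zc = (r[0] * X + r[1] * Y + r[2] * Z + r[3] for r in pose)
    if zc == 0:
        raise ValueError("point lies in the camera plane; projection undefined")
    # K applied to (xc, yc, zc), then division by the third coordinate
    # to go from homogeneous to 2D image coordinates.
    u = fx * xc / zc + cx
    v = fy * yc / zc + cy
    return u, v
```

With an identity rotation and zero translation, a point on the optical axis such as (0, 0, 2) lands on the principal point.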

The camera intrinsics are expressed in resolution-independent normalized image coordinates, where the top left corner of the image is (0,0), and the bottom right corner of the image is (1,1). This allows for the intrinsic parameters to be applied to frames at whatever resolution they are represented on disk (or resized to prior to training), by scaling them according to the image size in pixels. For an image of resolution width x height pixels, the intrinsics matrix at the actual scale of the image is

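$$
K = \begin{bmatrix}
\text{width} \cdot \text{focal\_length\_x} & 0 & \text{width} \cdot \text{principal\_point\_x} \\
0 & \text{height} \cdot \text{focal\_length\_y} & \text{height} \cdot \text{principal\_point\_y} \\
0 & 0 & 1
\end{bmatrix}
$$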
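
The scaling described above might be written as follows. This is a minimal sketch; `pixel_intrinsics` is a hypothetical helper name, and the matrix is returned as nested lists so that no third-party library is needed.

```python
from typing import List


def pixel_intrinsics(fx: float, fy: float, cx: float, cy: float,
                     width: int, height: int) -> List[List[float]]:
    """Scale normalized intrinsics to a width x height pixel image."""
    return [
        [width * fx, 0.0, width * cx],    # x terms scale with image width
        [0.0, height * fy, height * cy],  # y terms scale with image height
        [0.0, 0.0, 1.0],
    ]


# Example: normalized values applied to a 640 x 360 frame.
K = pixel_intrinsics(0.75, 1.5, 0.5, 0.5, width=640, height=360)
```

Because the stored values are resolution-independent, the same frame line can be reused after resizing the image by calling the helper again with the new width and height.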