CT images of 80 C57BL/6N mice were acquired using a Siemens Inveon microPET/CT scanner (Knoxville, TN). Each scan used a total rotation of 220 degrees in 1-degree steps, following 20 dark/light calibrations. The transaxial and axial fields of view were 58.44 mm and 92.04 mm, respectively. Exposure time was 800 ms with a binning factor of 2; the effective pixel size was 45.65 μm. The voltage and current settings were 80 kV and 500 μA, respectively. Total scan time per animal was estimated at 1010 seconds. CT images were reconstructed with the common cone-beam method, including Hounsfield unit calibration, bilinear interpolation, and a Hamming reconstruction filter. Reconstructed CT images were converted to DICOM using VivoQuant software.
Videos of Diversity Outbred mice spanning a range of weights (approximately 20 g to 60 g), sexes (female or male), ages (1 to 3 years), and coat colors (albino, black, agouti) are captured from a single camera (Vium) at 24 frames per second. From this diverse collection of videos, we manually select 455 video clips in which the animals perform one of the following behaviors: standing, drinking, eating, grooming, sleeping, walking, or running on the wheel. Each clip is 0.5 seconds long and sampled at 24 Hz. Researchers manually label the activity in each clip by watching the clip and its surrounding context. The keypoints of the mouse in each of the 12 frames of each clip are annotated by trained animal technicians.
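
The clip length and frame rate above determine the number of frames per clip. The short sketch below works through that arithmetic. The total frame count is derived here, not quoted from the dataset, and assumes every frame of every clip carries keypoint annotations; the variable names are illustrative.

```python
# Values taken from the description above.
CLIP_SECONDS = 0.5   # duration of each clip
FPS = 24             # sampling rate in Hz
NUM_CLIPS = 455      # manually selected clips

# Frames per clip follow from duration times frame rate.
frames_per_clip = round(CLIP_SECONDS * FPS)

# Assumption: all frames in all clips are annotated with keypoints.
total_annotated_frames = NUM_CLIPS * frames_per_clip

print(frames_per_clip, total_annotated_frames)
```

Under these assumptions, the 455 clips yield 12 frames each, for 5460 annotated frames in total.
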
Keypoint Name |
---|
NOSE |
NECK |
LEFT EAR |
RIGHT EAR |
LEFT SHOULDER |
RIGHT SHOULDER |
LEFT FORE_PAW |
RIGHT FORE_PAW |
LEFT HIP |
RIGHT HIP |
LEFT HIND_PAW |
RIGHT HIND_PAW |
ROOT TAIL |
MID TAIL |
TIP TAIL |
SPINE MID |
SPINE LOWER |
LEFT KNEE |
RIGHT KNEE |
LEFT ELBOW |
RIGHT ELBOW |
SPINE UPPER |
Joint | Label | Parent Label |
---|---|---|
SPINE UPPER | 0 | 0 |
SPINE MID | 1 | 0 |
SPINE LOWER | 2 | 1 |
LEFT HIP | 3 | 2 |
LEFT KNEE | 4 | 3 |
LEFT ANKLE | 5 | 4 |
RIGHT HIP | 6 | 2 |
RIGHT KNEE | 7 | 6 |
RIGHT ANKLE | 8 | 7 |
ROOT TAIL | 9 | 2 |
NECK | 10 | 0 |
NOSE | 11 | 10 |
LEFT SHOULDER | 12 | 0 |
LEFT ELBOW | 13 | 12 |
LEFT WRIST | 14 | 13 |
RIGHT SHOULDER | 15 | 0 |
RIGHT ELBOW | 16 | 15 |
RIGHT WRIST | 17 | 16 |
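
The joint table above can be read as a kinematic tree in which each joint points to its parent. The following is a minimal sketch of one way to load it, assuming that a joint listed as its own parent (SPINE UPPER) is the root. The names `SKELETON`, `BONES`, and `chain_to_root` are hypothetical and not part of the dataset.

```python
# (label, joint name, parent label) rows copied from the joint table.
SKELETON = [
    (0, "SPINE UPPER", 0),
    (1, "SPINE MID", 0),
    (2, "SPINE LOWER", 1),
    (3, "LEFT HIP", 2),
    (4, "LEFT KNEE", 3),
    (5, "LEFT ANKLE", 4),
    (6, "RIGHT HIP", 2),
    (7, "RIGHT KNEE", 6),
    (8, "RIGHT ANKLE", 7),
    (9, "ROOT TAIL", 2),
    (10, "NECK", 0),
    (11, "NOSE", 10),
    (12, "LEFT SHOULDER", 0),
    (13, "LEFT ELBOW", 12),
    (14, "LEFT WRIST", 13),
    (15, "RIGHT SHOULDER", 0),
    (16, "RIGHT ELBOW", 15),
    (17, "RIGHT WRIST", 16),
]

NAMES = {label: name for label, name, _ in SKELETON}
PARENT = {label: parent for label, _, parent in SKELETON}
LABEL = {name: label for label, name, _ in SKELETON}

# Assumption: a joint whose parent is itself is the root of the tree.
ROOTS = [joint for joint, parent in PARENT.items() if joint == parent]

# Every non-root joint contributes one bone (parent, child).
BONES = [(parent, joint) for joint, parent in PARENT.items() if joint != parent]


def chain_to_root(name):
    """Return the joint names from the given joint up to the root."""
    label = LABEL[name]
    chain = [NAMES[label]]
    while PARENT[label] != label:
        label = PARENT[label]
        chain.append(NAMES[label])
    return chain


print(" -> ".join(chain_to_root("LEFT ANKLE")))
print(len(BONES), "bones")
```

Walking up from LEFT ANKLE passes through LEFT KNEE, LEFT HIP, SPINE LOWER, and SPINE MID before reaching SPINE UPPER, and the 18 joints give 17 bones.
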
The multi-view video data consists of 35 consecutive frames of a single C57BL/6N mouse recorded in a custom capture rig, which comprises a top-down RGB+Depth camera (Kinect) and two side RGB cameras with synchronized timing. The cameras are calibrated and have overlapping fields of view. We label the keypoints in synchronized frames from each view and triangulate each keypoint's 3D location by minimizing the reprojection error.
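
To illustrate the triangulation step, the sketch below recovers a 3D point from its labeled 2D positions in three calibrated views. It is not the dataset's pipeline: it uses linear (DLT-style) triangulation, which minimizes an algebraic error and is commonly used to initialize a nonlinear reprojection-error minimization. The camera matrices, the test point, and all function names are hypothetical, and the code is kept dependency-free for clarity.

```python
import math


def project(P, X):
    """Project 3D point X with a 3x4 camera matrix P to image coordinates."""
    Xh = (*X, 1.0)
    x, y, w = (sum(P[r][c] * Xh[c] for c in range(4)) for r in range(3))
    return (x / w, y / w)


def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def triangulate(cameras, pixels):
    """Linear least-squares triangulation from two or more views."""
    rows, rhs = [], []
    for P, (u, v) in zip(cameras, pixels):
        # Each view gives two equations: u*P[2] - P[0] and v*P[2] - P[1].
        for coord, r in ((u, 0), (v, 1)):
            row = [coord * P[2][c] - P[r][c] for c in range(4)]
            rows.append(row[:3])
            rhs.append(-row[3])
    # Normal equations (A^T A) X = A^T b, solved with Cramer's rule.
    AtA = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    Atb = [sum(r[i] * b for r, b in zip(rows, rhs)) for i in range(3)]
    d = det3(AtA)
    point = []
    for k in range(3):
        m = [[Atb[i] if j == k else AtA[i][j] for j in range(3)] for i in range(3)]
        point.append(det3(m) / d)
    return tuple(point)


def mean_reprojection_error(cameras, pixels, X):
    """Mean distance between observed and reprojected image points."""
    errors = [math.dist(project(P, X), uv) for P, uv in zip(cameras, pixels)]
    return sum(errors) / len(errors)


# Hypothetical normalized cameras: one reference view, one shifted along x,
# and one rotated 90 degrees about the y axis.
CAMERAS = [
    [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[1, 0, 0, -1], [0, 1, 0, 0], [0, 0, 1, 0]],
    [[0, 0, -1, 0], [0, 1, 0, 0], [1, 0, 0, 3]],
]
TRUE_POINT = (0.1, 0.2, 2.0)  # made-up keypoint location
PIXELS = [project(P, TRUE_POINT) for P in CAMERAS]

estimate = triangulate(CAMERAS, PIXELS)
error = mean_reprojection_error(CAMERAS, PIXELS, estimate)
print(estimate, error)
```

With noise-free observations the linear estimate reproduces the input point and the reprojection error is essentially zero. With real, noisy labels the linear estimate would typically be refined by an iterative solver that minimizes the reprojection error directly.
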
If you are interested in accessing the dataset, please fill out the following form.