# Copyright 2026 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy)  
[**[API docs]**](https://google.github.io/mediapy/)  
[**[PyPI package]**](https://pypi.org/project/mediapy/)  
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.6'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
from typing import Any, TypeVar, Union, cast
import urllib.request
import warnings

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

if typing.TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = npt.NDArray[Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = TypeVar('_ArrayLike')
  _DTypeLike = TypeVar('_DTypeLike')
  _NDArray = TypeVar('_NDArray')
  _DType = TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 10**10  # Unlimited seems to be OK now.
_T = TypeVar('_T')
_Path = Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int | None, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired by https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    headers = {'User-Agent': 'Chrome'}
    request = urllib.request.Request(path_or_url, headers=headers)
    with urllib.request.urlopen(request) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  The file format is explicitly specified by `fmt` and is not inferred from
  `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


 814def to_rgb(
 815    array: _ArrayLike,
 816    *,
 817    vmin: float | None = None,
 818    vmax: float | None = None,
 819    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
 820) -> _NDArray:
 821  """Maps scalar values to RGB using value bounds and a color map.
 822
 823  Args:
 824    array: Scalar values, with arbitrary shape.
 825    vmin: Explicit min value for remapping; if None, it is obtained as the
 826      minimum finite value of `array`.
 827    vmax: Explicit max value for remapping; if None, it is obtained as the
 828      maximum finite value of `array`.
 829    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
 830      color.
 831
 832  Returns:
 833    A new array in which each element is affinely mapped from [vmin, vmax]
 834    to [0.0, 1.0] and then color-mapped.
 835  """
 836  a = _as_valid_media_array(array)
 837  del array
 838  # For future numpy version 1.7.0:
 839  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
 840  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
 841  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
 842  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
 843  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
 844  if isinstance(cmap, str):
 845    if hasattr(matplotlib, 'colormaps'):
 846      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
 847    else:
 848      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
 849  else:
 850    rgb_from_scalar = cmap
 851  a = cast(_NDArray, rgb_from_scalar(a))
 852  # If there is a fully opaque alpha channel, remove it.
 853  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3]) == 1.0):
 854    a = a[..., :3]
 855  return a
 856
 857
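The remapping at the heart of `to_rgb` — exclude non-finite values when inferring bounds, then affinely map `[vmin, vmax]` to `[0.0, 1.0]` — can be sketched standalone in plain numpy (`normalize_to_unit_range` is a hypothetical helper name, not part of this module):

```python
import numpy as np

def normalize_to_unit_range(a, vmin=None, vmax=None):
  """Affinely maps values from [vmin, vmax] to [0.0, 1.0], ignoring
  non-finite values when inferring the bounds, as `to_rgb` does."""
  a = np.asarray(a, dtype=float)
  if vmin is None:
    vmin = np.amin(np.where(np.isfinite(a), a, np.inf))
  if vmax is None:
    vmax = np.amax(np.where(np.isfinite(a), a, -np.inf))
  # The eps term avoids division by zero when vmin == vmax.
  return (a - vmin) / (vmax - vmin + np.finfo(float).eps)

values = np.array([0.0, 5.0, 10.0, np.nan])
unit = normalize_to_unit_range(values)
# unit[:3] is approximately [0.0, 0.5, 1.0]; the NaN entry stays NaN.
```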
 858def compress_image(
 859    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
 860) -> bytes:
 861  """Returns a buffer containing a compressed image.
 862
 863  Args:
 864    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
 865    fmt: Desired compression encoding, e.g. 'png'.
 866    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
 867      compression.
 868  """
 869  image = _as_valid_media_array(image)
 870  with io.BytesIO() as output:
 871    _pil_image(image).save(output, format=fmt, **kwargs)
 872    return output.getvalue()
 873
 874
 875def decompress_image(
 876    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
 877) -> _NDArray:
 878  """Returns an image from a compressed data buffer.
 879
 880  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
 881  or 4 channels and `uint16` images with a single channel.
 882
 883  Args:
 884    data: Buffer containing compressed image.
 885    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
 886      is inferred automatically.
 887    apply_exif_transpose: If True, rotate image according to EXIF orientation.
 888  """
 889  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
 890  if apply_exif_transpose:
 891    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
 892    assert tmp_image
 893    pil_image = tmp_image
 894  if dtype is None:
 895    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
 896  return np.array(pil_image, dtype=dtype)
 897
 898
 899def html_from_compressed_image(
 900    data: bytes,
 901    width: int,
 902    height: int,
 903    *,
 904    title: str | None = None,
 905    border: bool | str = False,
 906    pixelated: bool = True,
 907    fmt: str = 'png',
 908) -> str:
 909  """Returns an HTML string with an image tag containing encoded data.
 910
 911  Args:
 912    data: Compressed image bytes.
 913    width: Width of HTML image in pixels.
 914    height: Height of HTML image in pixels.
 915    title: Optional text shown centered above image.
 916    border: If `bool`, whether to place a black boundary around the image, or if
 917      `str`, the boundary CSS style.
 918    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
 919    fmt: Compression encoding.
 920  """
 921  b64 = base64.b64encode(data).decode('utf-8')
 922  if isinstance(border, str):
 923    border = f'{border}; '
 924  elif border:
 925    border = 'border:1px solid black; '
 926  else:
 927    border = ''
 928  s_pixelated = 'pixelated' if pixelated else 'auto'
 929  s = (
 930      f'<img width="{width}" height="{height}"'
 931      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
 932      f' src="data:image/{fmt};base64,{b64}"/>'
 933  )
 934  if title is not None:
 935    s = f"""<div style="display:flex; align-items:left;">
 936      <div style="display:flex; flex-direction:column; align-items:center;">
 937      <div>{title}</div><div>{s}</div></div></div>"""
 938  return s
 939
 940
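The core of `html_from_compressed_image` is embedding the compressed bytes in a base64 data URL inside an `<img>` tag; a minimal standalone sketch using only the standard library (`img_tag_from_bytes` is a hypothetical name):

```python
import base64

def img_tag_from_bytes(data: bytes, width: int, height: int,
                       fmt: str = 'png') -> str:
  """Builds an HTML <img> tag whose src embeds the compressed bytes
  as a base64 data URL."""
  b64 = base64.b64encode(data).decode('utf-8')
  return (f'<img width="{width}" height="{height}"'
          f' src="data:image/{fmt};base64,{b64}"/>')

tag = img_tag_from_bytes(b'\x89PNG', 32, 32)
# tag starts with '<img width="32"' and contains 'data:image/png;base64,'.
```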
 941def _get_width_height(
 942    width: int | None, height: int | None, shape: tuple[int, int]
 943) -> tuple[int, int]:
 944  """Returns (width, height) given optional parameters and image shape."""
 945  assert len(shape) == 2, shape
 946  if width and height:
 947    return width, height
 948  if width and not height:
 949    return width, int(width * (shape[0] / shape[1]) + 0.5)
 950  if height and not width:
 951    return int(height * (shape[1] / shape[0]) + 0.5), height
 952  return shape[::-1]
 953
 954
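The aspect-ratio logic above can be exercised standalone; this sketch mirrors `_get_width_height` under a hypothetical name so it runs without the module:

```python
def fit_width_height(width, height, shape):
  """Completes missing display dimensions from an image shape of
  (height, width), preserving the aspect ratio."""
  if width and height:
    return width, height
  if width:
    return width, int(width * (shape[0] / shape[1]) + 0.5)
  if height:
    return int(height * (shape[1] / shape[0]) + 0.5), height
  return shape[1], shape[0]

# A 480x640 (height, width) image displayed at width=320 keeps a 4:3 ratio:
print(fit_width_height(320, None, (480, 640)))  # -> (320, 240)
```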
 955def _ensure_mapped_to_rgb(
 956    image: _ArrayLike,
 957    *,
 958    vmin: float | None = None,
 959    vmax: float | None = None,
 960    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
 961) -> _NDArray:
 962  """Ensures that the image is mapped to RGB."""
 963  image = _as_valid_media_array(image)
 964  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
 965    raise ValueError(
 966        f'Image with shape {image.shape} is neither a 2D array'
 967        ' nor a 3D array with 1, 3, or 4 channels.'
 968    )
 969  if image.ndim == 3 and image.shape[2] == 1:
 970    image = image[:, :, 0]
 971  if image.ndim == 2:
 972    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
 973  return image
 974
 975
 976def show_image(
 977    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
 978) -> str | None:
 979  """Displays an image in the notebook and optionally saves it to a file.
 980
 981  See `show_images`.
 982
 983  >>> show_image(np.random.rand(100, 100))
 984  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
 985  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
 986  >>> show_image(read_image('/tmp/image.png'))
 987  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
 988  >>> show_image(read_image(url))
 989
 990  Args:
 991    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
 992    title: Optional text shown centered above the image.
 993    **kwargs: See `show_images`.
 994
 995  Returns:
 996    HTML string if `return_html` is `True`.
 997  """
 998  return show_images([np.asarray(image)], [title], **kwargs)
 999
1000
1001def show_images(
1002    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
1003    titles: Iterable[str | None] | None = None,
1004    *,
1005    width: int | None = None,
1006    height: int | None = None,
1007    downsample: bool = True,
1008    columns: int | None = None,
1009    vmin: float | None = None,
1010    vmax: float | None = None,
1011    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1012    border: bool | str = False,
1013    ylabel: str = '',
1014    html_class: str = 'show_images',
1015    pixelated: bool | None = None,
1016    return_html: bool = False,
1017) -> str | None:
1018  """Displays a row of images in the IPython/Jupyter notebook.
1019
1020  If a directory has been specified using `set_show_save_dir`, also saves each
1021  titled image to a file in that directory based on its title.
1022
1023  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1024  >>> show_images([image1, image2])
1025  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
1026  >>> show_images([image1, image2] * 5, columns=4, border=True)
1027
1028  Args:
1029    images: Iterable of images, or dictionary of `{title: image}`.  Each image
1030      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
1031    titles: Optional strings shown above the corresponding images.
1032    width: Optional, overrides displayed width (in pixels).
1033    height: Optional, overrides displayed height (in pixels).
1034    downsample: If True, each image whose width or height is greater than the
1035      specified `width` or `height` is resampled to the display resolution. This
1036      improves antialiasing and reduces the size of the notebook.
1037    columns: Optional, maximum number of images per row.
1038    vmin: For single-channel image, explicit min value for display.
1039    vmax: For single-channel image, explicit max value for display.
1040    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1041      3D color.
1042    border: If `bool`, whether to place a black boundary around the image, or if
1043      `str`, the boundary CSS style.
1044    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
1045    html_class: CSS class name used in definition of HTML element.
1046    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
1047      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
1048      only on images for which `width` or `height` introduces magnification.
 1049    return_html: If `True`, return the raw HTML `str` instead of displaying it.
1050
1051  Returns:
1052    HTML string if `return_html` is `True`.
1053  """
1054  if isinstance(images, Mapping):
1055    if titles is not None:
1056      raise ValueError('Cannot pass both an images dictionary and titles.')
1057    list_titles, list_images = list(images.keys()), list(images.values())
1058  else:
1059    list_images = list(images)
1060    list_titles = [None] * len(list_images) if titles is None else list(titles)
1061    if len(list_images) != len(list_titles):
1062      raise ValueError(
1063          'Number of images does not match number of titles'
1064          f' ({len(list_images)} vs {len(list_titles)}).'
1065      )
1066
1067  list_images = [
1068      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1069      for image in list_images
1070  ]
1071
1072  def maybe_downsample(image: _NDArray) -> _NDArray:
1073    shape = image.shape[0], image.shape[1]
1074    w, h = _get_width_height(width, height, shape)
1075    if w < shape[1] or h < shape[0]:
1076      image = resize_image(image, (h, w))
1077    return image
1078
1079  if downsample:
1080    list_images = [maybe_downsample(image) for image in list_images]
1081  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1082
1083  for title, png_data in zip(list_titles, png_datas):
1084    if title is not None and _config.show_save_dir:
1085      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
1086      with _open(path, mode='wb') as f:
1087        f.write(png_data)
1088
1089  def html_from_compressed_images() -> str:
1090    html_strings = []
1091    for image, title, png_data in zip(list_images, list_titles, png_datas):
1092      w, h = _get_width_height(width, height, image.shape[:2])
1093      magnified = h > image.shape[0] or w > image.shape[1]
1094      pixelated2 = pixelated if pixelated is not None else magnified
1095      html_strings.append(
1096          html_from_compressed_image(
1097              png_data, w, h, title=title, border=border, pixelated=pixelated2
1098          )
1099      )
1100    # Create single-row tables each with no more than 'columns' elements.
1101    table_strings = []
1102    for row_html_strings in _chunked(html_strings, columns):
1103      td = '<td style="padding:1px;">'
1104      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
1105      if ylabel:
1106        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
1107        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
1108      table_strings.append(
1109          f'<table class="{html_class}"'
1110          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
1111      )
1112    return ''.join(table_strings)
1113
1114  s = html_from_compressed_images()
1115  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
1116    warnings.warn('mediapy: subsampling images to reduce HTML size')
1117    list_images = [image[::2, ::2] for image in list_images]
1118    png_datas = [compress_image(to_uint8(image)) for image in list_images]
1119    s = html_from_compressed_images()
1120  if return_html:
1121    return s
1122  _display_html(s)
1123  return None
1124
1125
1126def compare_images(
1127    images: Iterable[_ArrayLike],
1128    *,
1129    vmin: float | None = None,
1130    vmax: float | None = None,
1131    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1132) -> None:
1133  """Compares two images using an interactive slider.
1134
1135  Displays an HTML slider component to interactively swipe between two images.
1136  The slider functionality requires that the web browser have Internet access.
1137  See additional info in `https://github.com/sneas/img-comparison-slider`.
1138
1139  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1140  >>> compare_images([image1, image2])
1141
1142  Args:
1143    images: Iterable of images.  Each image must be either a 2D array or a 3D
1144      array with 1, 3, or 4 channels.  There must be exactly two images.
1145    vmin: For single-channel image, explicit min value for display.
1146    vmax: For single-channel image, explicit max value for display.
1147    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1148      3D color.
1149  """
1150  list_images = [
1151      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1152      for image in images
1153  ]
1154  if len(list_images) != 2:
1155    raise ValueError('The number of images must be 2.')
1156  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1157  b64_1, b64_2 = [
1158      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
1159  ]
1160  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
1161  _display_html(s)
1162
1163
1164# ** Video I/O.
1165
1166
1167def _filename_suffix_from_codec(codec: str) -> str:
1168  if codec == 'gif':
1169    return '.gif'
1170  if codec == 'vp9':
1171    return '.webm'
1172
1173  return '.mp4'
1174
1175
1176def _get_ffmpeg_path() -> str:
1177  path = _search_for_ffmpeg_path()
1178  if not path:
1179    raise RuntimeError(
1180        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
1181        " perhaps install ffmpeg using 'apt install ffmpeg'."
1182    )
1183  return path
1184
1185
1186@typing.overload
1187def _run_ffmpeg(
1188    ffmpeg_args: Sequence[str],
1189    stdin: int | None = None,
1190    stdout: int | None = None,
1191    stderr: int | None = None,
1192    encoding: None = None,  # No encoding -> bytes
1193    allowed_input_files: Sequence[str] | None = None,
1194    allowed_output_files: Sequence[str] | None = None,
1195    sandbox_max_run_time_secs: int | None = None,
1196) -> subprocess.Popen[bytes]:
1197  ...
1198
1199
1200@typing.overload
1201def _run_ffmpeg(
1202    ffmpeg_args: Sequence[str],
1203    stdin: int | None = None,
1204    stdout: int | None = None,
1205    stderr: int | None = None,
1206    encoding: str = ...,  # Encoding -> str
1207    allowed_input_files: Sequence[str] | None = None,
1208    allowed_output_files: Sequence[str] | None = None,
1209    sandbox_max_run_time_secs: int | None = None,
1210) -> subprocess.Popen[str]:
1211  ...
1212
1213
1214def _run_ffmpeg(
1215    ffmpeg_args: Sequence[str],
1216    stdin: int | None = None,
1217    stdout: int | None = None,
1218    stderr: int | None = None,
1219    encoding: str | None = None,
1220    allowed_input_files: Sequence[str] | None = None,
1221    allowed_output_files: Sequence[str] | None = None,
1222    sandbox_max_run_time_secs: int | None = None,
1223) -> subprocess.Popen[bytes] | subprocess.Popen[str]:
1224  """Runs ffmpeg with the given args.
1225
1226  Args:
1227    ffmpeg_args: The args to pass to ffmpeg.
1228    stdin: Same as in `subprocess.Popen`.
1229    stdout: Same as in `subprocess.Popen`.
1230    stderr: Same as in `subprocess.Popen`.
1231    encoding: Same as in `subprocess.Popen`.
1232    allowed_input_files: The input files to allow for ffmpeg.
1233    allowed_output_files: The output files to allow for ffmpeg.
1234    sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox.
1235      If None, the default limit is 30 minutes.
1236
1237  Returns:
1238    The subprocess.Popen object with running ffmpeg process.
1239  """
1240  argv = []
1241  # In open source, keep env=None to preserve default behavior.
1242  # Context: https://github.com/google/mediapy/pull/62
1243  env: Any = None  # pylint: disable=unused-variable
1244  ffmpeg_path = _get_ffmpeg_path()
1245
 1246  # Sandbox max run time and allowed input/output files are not supported
 1247  # in open source.
1248  del allowed_input_files
1249  del allowed_output_files
1250  del sandbox_max_run_time_secs
1251
1252  argv.append(ffmpeg_path)
1253  argv.extend(ffmpeg_args)
1254
1255  return subprocess.Popen(
1256      argv,
1257      stdin=stdin,
1258      stdout=stdout,
1259      stderr=stderr,
1260      encoding=encoding,
1261      env=env,
1262  )
1263
1264
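`_run_ffmpeg` is essentially a thin wrapper over `subprocess.Popen` that prepends the ffmpeg path to the argument list. The same pattern, shown here with the Python interpreter as the child process so the sketch runs without ffmpeg installed:

```python
import subprocess
import sys

# Launch a child process with a piped, text-mode stdout, mirroring how
# the wrapper forwards stdin/stdout/stderr and encoding to Popen.
with subprocess.Popen(
    [sys.executable, '-c', 'print("frame= 42")'],
    stdout=subprocess.PIPE,
    encoding='utf-8',
) as proc:
  out, _ = proc.communicate()

print(out.strip())  # -> frame= 42
```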
1265def video_is_available() -> bool:
1266  """Returns True if the program `ffmpeg` is found.
1267
1268  See also `set_ffmpeg`.
1269  """
1270  return _search_for_ffmpeg_path() is not None
1271
1272
1273class VideoMetadata(NamedTuple):
1274  """Represents the data stored in a video container header.
1275
1276  Attributes:
 1277    num_images: Number of frames expected from the video stream.  This is
 1278      estimated from the framerate and the duration stored in the video
 1279      header, so it might be inexact.  The value is set to -1 if the number
 1280      of frames is not found in the header.
1281    shape: The dimensions (height, width) of each video frame.
1282    fps: The framerate in frames per second.
1283    bps: The estimated bitrate of the video stream in bits per second, retrieved
1284      from the video header.
1285  """
1286
1287  num_images: int
1288  shape: tuple[int, int]
1289  fps: float
1290  bps: int | None
1291
1292
1293def _get_video_metadata(path: _Path) -> VideoMetadata:
1294  """Returns attributes of video stored in the specified local file."""
1295  if not pathlib.Path(path).is_file():
1296    raise RuntimeError(f"Video file '{path}' is not found.")
1297
1298  command = [
1299      '-nostdin',
1300      '-i',
1301      str(path),
1302      '-acodec',
1303      'copy',
1304      # Necessary to get "frame= *(\d+)" using newer ffmpeg versions.
 1305      # (Previously, this was `'-vcodec', 'copy'`.)
1306      '-vf',
1307      'select=1',
1308      '-vsync',
1309      '0',
1310      '-f',
1311      'null',
1312      '-',
1313  ]
1314  with _run_ffmpeg(
1315      command,
1316      allowed_input_files=[str(path)],
1317      stderr=subprocess.PIPE,
1318      encoding='utf-8',
1319  ) as proc:
1320    _, err = proc.communicate()
1321  bps = fps = num_images = width = height = rotation = None
1322  before_output_info = True
1323  for line in err.split('\n'):
1324    if line.startswith('Output '):
1325      before_output_info = False
1326    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
1327      bps = int(match.group(1)) * 1000
1328    if matches := re.findall(r'frame= *(\d+) ', line):
1329      num_images = int(matches[-1])
1330    if 'Stream #0:' in line and ': Video:' in line and before_output_info:
1331      if not (match := re.search(r', (\d+)x(\d+)', line)):
1332        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
1333      width, height = int(match.group(1)), int(match.group(2))
1334      if match := re.search(r', ([\d.]+) fps', line):
1335        fps = float(match.group(1))
1336      elif str(path).endswith('.gif'):
1337        # Some GIF files lack a framerate attribute; use a reasonable default.
1338        fps = 10
1339      else:
1340        raise RuntimeError(f'Unable to parse video framerate in line {line}')
1341    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
1342      rotation = int(match.group(1))
1343    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
1344      rotation = int(match.group(1))
1345  if not num_images:
1346    num_images = -1
1347  if not width:
1348    raise RuntimeError(f'Unable to parse video header: {err}')
1349  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
1350  if rotation in (90, 270, -90, -270):
1351    width, height = height, width
1352  assert height is not None and width is not None
1353  shape = height, width
1354  assert fps is not None
1355  return VideoMetadata(num_images, shape, fps, bps)
1356
1357
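The header parsing above is regex matching over ffmpeg's stderr text. On a synthetic sample line (illustrative only, not captured from a real ffmpeg run), the same patterns extract the frame dimensions and framerate:

```python
import re

# Synthetic line in the style of ffmpeg's stream-info output.
line = '  Stream #0:0: Video: h264, yuv420p, 640x480, 30 fps, ...'

width = height = fps = None
if match := re.search(r', (\d+)x(\d+)', line):
  width, height = int(match.group(1)), int(match.group(2))
if match := re.search(r', ([\d.]+) fps', line):
  fps = float(match.group(1))

print(width, height, fps)  # -> 640 480 30.0
```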
1358class _VideoIO:
1359  """Base class for `VideoReader` and `VideoWriter`."""
1360
1361  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
1362    """Returns ffmpeg pix_fmt given data type and image format."""
1363    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
1364    return {
1365        np.uint8: {
1366            'rgb': 'rgb24',
1367            'yuv': 'yuv444p',
1368            'gray': 'gray',
1369        },
1370        np.uint16: {
1371            'rgb': 'rgb48' + native_endian_suffix,
1372            'yuv': 'yuv444p16' + native_endian_suffix,
1373            'gray': 'gray16' + native_endian_suffix,
1374        },
1375    }[dtype.type][image_format]
1376
1377
1378class VideoReader(_VideoIO):
1379  """Context to read a compressed video as an iterable over its images.
1380
1381  >>> with VideoReader('/tmp/river.mp4') as reader:
1382  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
1383  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
1384  ...   for image in reader:
1385  ...     print(image.shape)
1386
1387  >>> with VideoReader('/tmp/river.mp4') as reader:
1388  ...   video = np.array(tuple(reader))
1389
1390  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1391  >>> with VideoReader(url) as reader:
1392  ...   show_video(reader)
1393
1394  Attributes:
1395    path_or_url: Location of input video.
1396    output_format: Format of output images (default 'rgb').  If 'rgb', each
1397      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
1398      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
1399      image has shape=(height, width).
1400    dtype: Data type for output images.  The default is `np.uint8`.  Use of
1401      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
1402    metadata: Object storing the information retrieved from the video header.
1403      Its attributes are copied as attributes in this class.
 1404    num_images: Number of frames expected from the video stream.  This is
 1405      estimated from the framerate and the duration stored in the video
 1406      header, so it might be inexact.
1407    shape: The dimensions (height, width) of each video frame.
1408    fps: The framerate in frames per second.
1409    bps: The estimated bitrate of the video stream in bits per second, retrieved
1410      from the video header.
1411    stream_index: The stream index to read from. The default is 0.
1412    sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox.
1413      If None, the default limit is 30 minutes. Unused in open source.
1414  """
1415
1416  path_or_url: _Path
1417  output_format: str
1418  dtype: _DType
1419  metadata: VideoMetadata
1420  num_images: int
1421  shape: tuple[int, int]
1422  fps: float
1423  bps: int | None
1424  stream_index: int
1425  _num_bytes_per_image: int
1426
1427  def __init__(
1428      self,
1429      path_or_url: _Path,
1430      *,
1431      stream_index: int = 0,
1432      output_format: str = 'rgb',
1433      dtype: _DTypeLike = np.uint8,
1434      sandbox_max_run_time_secs: int | None = None,
1435  ):
1436    if output_format not in {'rgb', 'yuv', 'gray'}:
1437      raise ValueError(
1438          f'Output format {output_format} is not rgb, yuv, or gray.'
1439      )
1440    self.path_or_url = path_or_url
1441    self.output_format = output_format
1442    self.stream_index = stream_index
1443    self.dtype = np.dtype(dtype)
1444    if self.dtype.type not in (np.uint8, np.uint16):
1445      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1446    self.sandbox_max_run_time_secs = sandbox_max_run_time_secs
1447    self._read_via_local_file: Any = None
1448    self._popen: subprocess.Popen[bytes] | None = None
1449    self._proc: subprocess.Popen[bytes] | None = None
1450
1451  def __enter__(self) -> 'VideoReader':
1452    try:
1453      self._read_via_local_file = _read_via_local_file(self.path_or_url)
1454      # pylint: disable-next=no-member
1455      tmp_name = self._read_via_local_file.__enter__()
1456
1457      self.metadata = _get_video_metadata(tmp_name)
1458      self.num_images, self.shape, self.fps, self.bps = self.metadata
1459      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
1460      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
1461      bytes_per_channel = self.dtype.itemsize
1462      self._num_bytes_per_image = (
1463          math.prod(self.shape) * num_channels * bytes_per_channel
1464      )
1465
1466      command = [
1467          '-v',
1468          'panic',
1469          '-nostdin',
1470          '-i',
1471          tmp_name,
1472          '-vcodec',
1473          'rawvideo',
1474          '-f',
1475          'image2pipe',
1476          '-map',
1477          f'0:v:{self.stream_index}',
1478          '-pix_fmt',
1479          pix_fmt,
1480          '-vsync',
1481          'vfr',
1482          '-',
1483      ]
1484      self._popen = _run_ffmpeg(
1485          command,
1486          stdout=subprocess.PIPE,
1487          stderr=subprocess.PIPE,
1488          allowed_input_files=[tmp_name],
1489          sandbox_max_run_time_secs=self.sandbox_max_run_time_secs,
1490      )
1491      self._proc = self._popen.__enter__()
1492    except Exception:
1493      self.__exit__(None, None, None)
1494      raise
1495    return self
1496
1497  def __exit__(self, *_: Any) -> None:
1498    self.close()
1499
1500  def read(self) -> _NDArray | None:
1501    """Reads a video image frame (or None if at end of file).
1502
1503    Returns:
1504      A numpy array in the format specified by `output_format`, i.e., a 3D
1505      array with 3 color channels, except for format 'gray' which is 2D.
1506
1507    Raises:
1508      RuntimeError: If there is an error reading from the output file.
1509    """
1510    assert self._proc, 'Error: reading from an already closed context.'
1511    stdout = self._proc.stdout
1512    assert stdout is not None
1513    data = stdout.read(self._num_bytes_per_image)
1514    if not data:  # Due to either end-of-file or subprocess error.
1515      self.close()  # Raises exception if subprocess had error.
1516      return None  # To indicate end-of-file.
1517    if len(data) != self._num_bytes_per_image:
1518      self._proc.wait()
1519      stderr = self._proc.stderr
1520      stderr_output = ''
1521      if stderr is not None:
1522        stderr_output = stderr.read().decode('utf-8', errors='replace').strip()
1523      raise RuntimeError(
1524          f'ffmpeg exited with code {self._proc.returncode}.\nIncomplete'
1525          f' frame read: expected {self._num_bytes_per_image} bytes, but got'
1526          f' {len(data)}.\nffmpeg stderr:\n{stderr_output}'
1527      )
1528    image = np.frombuffer(data, dtype=self.dtype)
1529    if self.output_format == 'rgb':
1530      image = image.reshape(*self.shape, 3)
1531    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
1532      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
1533    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
1534      image = image.reshape(*self.shape)
1535    else:
1536      raise AssertionError
1537    return image
1538
1539  def __iter__(self) -> Iterator[_NDArray]:
1540    while True:
1541      image = self.read()
1542      if image is None:
1543        return
1544      yield image
1545
1546  def close(self) -> None:
1547    """Terminates video reader.  (Called automatically at end of context.)"""
1548    if self._popen:
1549      self._popen.__exit__(None, None, None)
1550      self._popen = None
1551      self._proc = None
1552    if self._read_via_local_file:
1553      # pylint: disable-next=no-member
1554      self._read_via_local_file.__exit__(None, None, None)
1555      self._read_via_local_file = None
1556
1557
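The 'yuv' branch of `VideoReader.read` converts planar data (three consecutive height-by-width planes) to per-pixel channels with `np.moveaxis`; a standalone sketch of that reshape on tiny synthetic data:

```python
import numpy as np

height, width = 2, 3
# Flat buffer holding three consecutive planes: Y, then U, then V.
flat = np.arange(3 * height * width, dtype=np.uint8)
# Reshape to (planes, height, width), then move the plane axis last.
image = np.moveaxis(flat.reshape(3, height, width), 0, 2)
print(image.shape)  # -> (2, 3, 3)
# Pixel (0, 0) holds (Y, U, V) = (flat[0], flat[6], flat[12]).
print(image[0, 0])  # -> [ 0  6 12]
```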
1558class VideoWriter(_VideoIO):
1559  """Context to write a compressed video.
1560
1561  >>> shape = 480, 640
1562  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
1563  ...   for image in moving_circle(shape, num_images=60):
1564  ...     writer.add_image(image)
1565  >>> show_video(read_video('/tmp/v.mp4'))
1566
1567
1568  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
1569  If none are specified, `qp` is set to a default value.
1570  See https://slhck.info/video/2017/03/01/rate-control.html
1571
1572  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
1573  ignored.
1574
1575  Attributes:
1576    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
1577      format.  The suffix must be '.gif' if the codec is 'gif'.
1578    shape: 2D spatial dimensions (height, width) of video image frames.  The
1579      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
1580      'yuv420p' or 'yuv420p10le').
1581    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
1582      'hevc', 'vp9', or 'gif').
1583    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
1584      used if not specified as explicit parameters.
1585    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
1586    bps: Requested average bits-per-second bitrate (default None).
1587    qp: Quantization parameter for video compression quality (default None).
1588    crf: Constant rate factor for video compression quality (default None).
1589    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
1590      introduce I-frames, or '-bf 0' to omit B-frames.
1591    input_format: Format of input images (default 'rgb').  If 'rgb', each image
1592      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
1593      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
1594      shape=(height, width).
1595    dtype: Expected data type for input images (any float input images are
1596      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
1597      necessary when encoding >8 bits/channel.
1598    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
1599      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
1600      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
1601      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
1602    sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox.
1603      If None, the default limit is 30 minutes. Unused in open source.
1604  """
1605
1606  def __init__(
1607      self,
1608      path: _Path,
1609      shape: tuple[int, int],
1610      *,
1611      codec: str = 'h264',
1612      metadata: VideoMetadata | None = None,
1613      fps: float | None = None,
1614      bps: int | None = None,
1615      qp: int | None = None,
1616      crf: float | None = None,
1617      ffmpeg_args: str | Sequence[str] = '',
1618      input_format: str = 'rgb',
1619      dtype: _DTypeLike = np.uint8,
1620      encoded_format: str | None = None,
1621      sandbox_max_run_time_secs: int | None = None,
1622  ) -> None:
1623    _check_2d_shape(shape)
1624    if fps is None and metadata:
1625      fps = metadata.fps
1626    if fps is None:
1627      fps = 25.0 if codec == 'gif' else 60.0
1628    if fps <= 0.0:
 1629      raise ValueError(f'Frames-per-second value {fps} is invalid.')
1630    if bps is None and metadata:
1631      bps = metadata.bps
1632    bps = int(bps) if bps is not None else None
1633    if bps is not None and bps <= 0:
1634      raise ValueError(f'Bitrate value {bps} is invalid.')
1635    if qp is not None and (not isinstance(qp, int) or qp < 0):
1636      raise ValueError(
 1637          f'Quantization parameter {qp} is invalid; it must be a'
 1638          ' non-negative integer.'
1639      )
1640    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
1641    if num_rate_specifications > 1:
1642      raise ValueError(
1643          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
1644      )
1645    ffmpeg_args = (
1646        shlex.split(ffmpeg_args)
1647        if isinstance(ffmpeg_args, str)
1648        else list(ffmpeg_args)
1649    )
1650    if input_format not in {'rgb', 'yuv', 'gray'}:
1651      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
1652    dtype = np.dtype(dtype)
1653    if dtype.type not in (np.uint8, np.uint16):
1654      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1655    self.path = pathlib.Path(path)
1656    self.shape = shape
1657    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
1658    if encoded_format is None:
1659      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
1660    if not all_dimensions_are_even and encoded_format.startswith(
1661        ('yuv42', 'yuvj42')
1662    ):
1663      raise ValueError(
1664          f'With encoded_format {encoded_format}, video dimensions must be'
1665          f' even, but shape is {shape}.'
1666      )
1667    self.fps = fps
1668    self.codec = codec
1669    self.bps = bps
1670    self.qp = qp
1671    self.crf = crf
1672    self.ffmpeg_args = ffmpeg_args
1673    self.input_format = input_format
1674    self.dtype = dtype
1675    self.encoded_format = encoded_format
1676    self.sandbox_max_run_time_secs = sandbox_max_run_time_secs
1677    if num_rate_specifications == 0 and not ffmpeg_args:
1678      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
1679    self._bitrate_args = (
1680        (['-vb', f'{bps}'] if bps is not None else [])
1681        + (['-qp', f'{qp}'] if qp is not None else [])
1682        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
1683    )
1684    if self.codec == 'gif':
1685      if self.path.suffix != '.gif':
1686        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
1687      self.encoded_format = 'pal8'
1688      self._bitrate_args = []
1689      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
1690      # Less common (and likely less useful) is a per-frame color palette:
1691      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
1692      #                 '[s1][p]paletteuse=new=1')
1693      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
1694    self._write_via_local_file: Any = None
1695    self._popen: subprocess.Popen[bytes] | None = None
1696    self._proc: subprocess.Popen[bytes] | None = None
1697
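When none of `bps`, `qp`, or `crf` is specified (and no extra `ffmpeg_args`), the constructor above falls back to a resolution-dependent quantization parameter. A standalone sketch of that default rule:

```python
import math

def default_qp(shape: tuple[int, int]) -> int:
    # Mirrors VideoWriter.__init__: qp 20 for frames up to VGA size, else 28.
    return 20 if math.prod(shape) <= 640 * 480 else 28

assert default_qp((480, 640)) == 20    # VGA: higher quality (lower qp).
assert default_qp((1080, 1920)) == 28  # HD: stronger compression.
```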
1698  def __enter__(self) -> 'VideoWriter':
1699    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
1700    try:
1701      self._write_via_local_file = _write_via_local_file(self.path)
1702      # pylint: disable-next=no-member
1703      tmp_name = self._write_via_local_file.__enter__()
1704
1705      # Writing to stdout using ('-f', 'mp4', '-') would require
1706      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
1707      height, width = self.shape
1708      command = (
1709          [
1710              '-v',
1711              'error',
1712              '-f',
1713              'rawvideo',
1714              '-vcodec',
1715              'rawvideo',
1716              '-pix_fmt',
1717              input_pix_fmt,
1718              '-s',
1719              f'{width}x{height}',
1720              '-r',
1721              f'{self.fps}',
1722              '-i',
1723              '-',
1724              '-an',
1725              '-vcodec',
1726              self.codec,
1727              '-pix_fmt',
1728              self.encoded_format,
1729          ]
1730          + self._bitrate_args
1731          + self.ffmpeg_args
1732          + ['-y', tmp_name]
1733      )
1734      self._popen = _run_ffmpeg(
1735          command,
1736          stdin=subprocess.PIPE,
1737          stderr=subprocess.PIPE,
1738          allowed_output_files=[tmp_name],
1739          sandbox_max_run_time_secs=self.sandbox_max_run_time_secs,
1740      )
1741      self._proc = self._popen.__enter__()
1742    except Exception:
1743      self.__exit__(None, None, None)
1744      raise
1745    return self
1746
1747  def __exit__(self, *_: Any) -> None:
1748    self.close()
1749
1750  def add_image(self, image: _NDArray) -> None:
1751    """Writes a video frame.
1752
1753    Args:
1754      image: Array whose dtype and first two dimensions must match the `dtype`
1755        and `shape` specified in `VideoWriter` initialization.  If
1756        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
1757        input_format, the image may be either 2D (interpreted as grayscale) or
1758        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
1759        must be 3D with three (Y, U, V) channels.
1760
1761    Raises:
1762      RuntimeError: If there is an error writing to the output file.
1763    """
1764    assert self._proc, 'Error: writing to an already closed context.'
1765    if issubclass(image.dtype.type, (np.floating, np.bool_)):
1766      image = to_type(image, self.dtype)
1767    if image.dtype != self.dtype:
1768      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
1769    if self.input_format == 'gray':
1770      if image.ndim != 2:
1771        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
1772    else:
1773      if image.ndim == 2 and self.input_format == 'rgb':
1774        image = np.dstack((image, image, image))
1775      if not (image.ndim == 3 and image.shape[2] == 3):
1776        raise ValueError(f'Image dimensions {image.shape} are invalid.')
1777    if image.shape[:2] != self.shape:
1778      raise ValueError(
1779          f'Image dimensions {image.shape[:2]} do not match'
1780          f' those of the initialized video {self.shape}.'
1781      )
1782    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1783      image = np.moveaxis(image, 2, 0)
1784    data = image.tobytes()
1785    stdin = self._proc.stdin
1786    assert stdin is not None
1787    if stdin.write(data) != len(data):
1788      self._proc.wait()
1789      stderr = self._proc.stderr
1790      assert stderr is not None
1791      s = stderr.read().decode('utf-8')
1792      raise RuntimeError(f"Error writing '{self.path}': {s}")
1793
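For the 'yuv' input format, `add_image` converts interleaved (height, width, 3) pixels to planar (3, height, width) before streaming the raw bytes to ffmpeg. The `np.moveaxis` step in isolation:

```python
import numpy as np

image = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)  # Interleaved YUV.
planar = np.moveaxis(image, 2, 0)  # Channel axis first: Y, U, V planes.
assert planar.shape == (3, 2, 2)
assert planar[0, 0, 1] == image[0, 1, 0]  # The Y plane is channel 0 of each pixel.
```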
1794  def close(self) -> None:
1795    """Finishes writing the video.  (Called automatically at end of context.)"""
1796    if self._popen:
1797      assert self._proc, 'Error: closing an already closed context.'
1798      stdin = self._proc.stdin
1799      assert stdin is not None
1800      stdin.close()
1801      if self._proc.wait():
1802        stderr = self._proc.stderr
1803        assert stderr is not None
1804        s = stderr.read().decode('utf-8')
1805        raise RuntimeError(f"Error writing '{self.path}': {s}")
1806      self._popen.__exit__(None, None, None)
1807      self._popen = None
1808      self._proc = None
1809    if self._write_via_local_file:
1810      # pylint: disable-next=no-member
1811      self._write_via_local_file.__exit__(None, None, None)
1812      self._write_via_local_file = None
1813
1814
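`VideoWriter` defaults `encoded_format` based on the frame dimensions, because 2x2 chroma subsampling ('yuv420p') requires even width and height. The selection rule from `__init__`, extracted as a small sketch:

```python
def default_encoded_format(shape: tuple[int, int]) -> str:
    # Mirrors VideoWriter.__init__: subsampled chroma only when both dims are even.
    return 'yuv420p' if all(dim % 2 == 0 for dim in shape) else 'yuv444p'

assert default_encoded_format((480, 640)) == 'yuv420p'
assert default_encoded_format((481, 640)) == 'yuv444p'
```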
1815class _VideoArray(npt.NDArray[Any]):
1816  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""
1817
1818  metadata: VideoMetadata | None
1819
1820  def __new__(
1821      cls: Type['_VideoArray'],
1822      input_array: _NDArray,
1823      metadata: VideoMetadata | None = None,
1824  ) -> '_VideoArray':
1825    obj: _VideoArray = np.asarray(input_array).view(cls)
1826    obj.metadata = metadata
1827    return obj
1828
1829  def __array_finalize__(self, obj: Any) -> None:
1830    if obj is None:
1831      return
1832    self.metadata = getattr(obj, 'metadata', None)
1833
1834
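The `_VideoArray` pattern above, attaching an attribute in `__new__` and propagating it in `__array_finalize__`, can be exercised standalone:

```python
import numpy as np

class MetaArray(np.ndarray):
    """Minimal ndarray subclass with a `metadata` attribute (same shape as _VideoArray)."""

    def __new__(cls, input_array, metadata=None):
        obj = np.asarray(input_array).view(cls)
        obj.metadata = metadata
        return obj

    def __array_finalize__(self, obj):
        if obj is None:
            return
        self.metadata = getattr(obj, 'metadata', None)

video = MetaArray(np.zeros((4, 2, 2, 3)), metadata={'fps': 30.0})
assert video.metadata == {'fps': 30.0}
assert video[0].metadata == {'fps': 30.0}  # Views keep the attribute...
assert np.array(video).__class__ is np.ndarray  # ...plain copies drop it.
```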
1835def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
1836  """Returns an array containing all images read from a compressed video file.
1837
1838  >>> video = read_video('/tmp/river.mp4')
1839  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
1840  >>> show_video(video)
1841
1842  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1843  >>> show_video(read_video(url))
1844
1845  Args:
1846    path_or_url: Input video file.
1847    **kwargs: Additional parameters for `VideoReader`.
1848
1849  Returns:
1850    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
1851    array if `output_format` is specified as 'gray'.  The returned array has an
1852    attribute `metadata` containing `VideoMetadata` information.  This enables
1853    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
1854    metadata attribute is lost in most subsequent `numpy` operations.
1855  """
1856  with VideoReader(path_or_url, **kwargs) as reader:
1857    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
1858
1859
1860def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
1861  """Writes images to a compressed video file.
1862
1863  >>> video = moving_circle((480, 640), num_images=60)
1864  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
1865  >>> show_video(read_video('/tmp/v.mp4'))
1866
1867  Args:
1868    path: Output video file.
1869    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
1870      arrays.
1871    **kwargs: Additional parameters for `VideoWriter`.
1872  """
1873  first_image, images = _peek_first(images)
1874  shape = first_image.shape[0], first_image.shape[1]
1875  dtype = first_image.dtype
1876  if dtype == bool:
1877    dtype = np.dtype(np.uint8)
1878  elif np.issubdtype(dtype, np.floating):
1879    dtype = np.dtype(np.uint16)
1880  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
1881  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
1882    for image in images:
1883      writer.add_image(image)
1884
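`write_video` normalizes the frame dtype before constructing the `VideoWriter`: boolean frames become `uint8`, and floating-point frames become `uint16` to preserve precision for >8-bit encoding. That mapping in isolation:

```python
import numpy as np

def writer_dtype(dtype) -> np.dtype:
    # Mirrors the dtype selection in write_video.
    dt = np.dtype(dtype)
    if dt == bool:
        return np.dtype(np.uint8)
    if np.issubdtype(dt, np.floating):
        return np.dtype(np.uint16)
    return dt

assert writer_dtype(bool) == np.uint8
assert writer_dtype(np.float32) == np.uint16
assert writer_dtype(np.uint8) == np.uint8
```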
1885
1886def compress_video(
1887    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
1888) -> bytes:
1889  """Returns a buffer containing a compressed video.
1890
1891  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
1892  and mp4 otherwise.
1893
1894  >>> video = read_video('/tmp/river.mp4')
1895  >>> data = compress_video(video, bps=10_000_000)
1896  >>> print(len(data))
1897
1898  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
1899
1900  Args:
1901    images: Iterable over video frames.
1902    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
1903      'hevc', 'vp9', or 'gif').
1904    **kwargs: Additional parameters for `VideoWriter`.
1905
1906  Returns:
1907    A bytes buffer containing the compressed video.
1908  """
1909  suffix = _filename_suffix_from_codec(codec)
1910  with tempfile.TemporaryDirectory() as directory_name:
1911    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
1912    write_video(tmp_path, images, codec=codec, **kwargs)
1913    return tmp_path.read_bytes()
1914
1915
1916def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
1917  """Returns video images from an MP4-compressed data buffer."""
1918  with tempfile.TemporaryDirectory() as directory_name:
1919    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
1920    tmp_path.write_bytes(data)
1921    return read_video(tmp_path, **kwargs)
1922
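`compress_video` and `decompress_video` both route through a temporary file, since container formats such as mp4 are awkward to stream. The temp-file round-trip pattern, with a stand-in writer in place of the actual ffmpeg encode:

```python
import pathlib
import tempfile

def bytes_via_tempfile(write_fn, suffix: str) -> bytes:
    # Same pattern as compress_video: write to a temp path, then read the bytes back.
    with tempfile.TemporaryDirectory() as directory_name:
        tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
        write_fn(tmp_path)
        return tmp_path.read_bytes()

data = bytes_via_tempfile(lambda path: path.write_bytes(b'\x00\x42'), suffix='.mp4')
assert data == b'\x00\x42'
```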
1923
1924def html_from_compressed_video(
1925    data: bytes,
1926    width: int,
1927    height: int,
1928    *,
1929    title: str | None = None,
1930    border: bool | str = False,
1931    loop: bool = True,
1932    autoplay: bool = True,
1933) -> str:
1934  """Returns an HTML string with a video tag containing H264-encoded data.
1935
1936  Args:
1937    data: MP4-compressed video bytes.
1938    width: Width of HTML video in pixels.
1939    height: Height of HTML video in pixels.
1940    title: Optional text shown centered above the video.
1941    border: If `bool`, whether to place a black boundary around the image, or if
1942      `str`, the boundary CSS style.
1943    loop: If True, the playback repeats forever.
1944    autoplay: If True, video playback starts without having to click.
1945  """
1946  b64 = base64.b64encode(data).decode('utf-8')
1947  if isinstance(border, str):
1948    border = f'{border}; '
1949  elif border:
1950    border = 'border:1px solid black; '
1951  else:
1952    border = ''
1953  options = (
1954      f'controls width="{width}" height="{height}"'
1955      f' style="{border}object-fit:cover;"'
1956      f'{" loop" if loop else ""}'
1957      f'{" autoplay muted" if autoplay else ""}'
1958  )
1959  s = f"""<video {options}>
1960      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
1961      This browser does not support the video tag.
1962      </video>"""
1963  if title is not None:
1964    s = f"""<div style="display:flex; align-items:left;">
1965      <div style="display:flex; flex-direction:column; align-items:center;">
1966      <div>{title}</div><div>{s}</div></div></div>"""
1967  return s
1968
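The `<video>` tag produced above embeds the compressed bytes directly as a base64 data URI, so the notebook HTML is self-contained. The encoding step alone:

```python
import base64

def video_data_uri(data: bytes) -> str:
    # The same embedding used in html_from_compressed_video's <source> element.
    b64 = base64.b64encode(data).decode('utf-8')
    return f'data:video/mp4;base64,{b64}'

assert video_data_uri(b'abc') == 'data:video/mp4;base64,YWJj'
```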
1969
1970def show_video(
1971    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
1972) -> str | None:
1973  """Displays a video in the IPython notebook and optionally saves it to a file.
1974
1975  See `show_videos`.
1976
1977  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
1978  >>> show_video(video, title='River video')
1979
1980  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
1981
1982  >>> show_video(read_video('/tmp/river.mp4'))
1983
1984  Args:
1985    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
1986      arrays).
1987    title: Optional text shown centered above the video.
1988    **kwargs: See `show_videos`.
1989
1990  Returns:
1991    html string if `return_html` is `True`.
1992  """
1993  return show_videos([images], [title], **kwargs)
1994
1995
1996def show_videos(
1997    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
1998    titles: Iterable[str | None] | None = None,
1999    *,
2000    width: int | None = None,
2001    height: int | None = None,
2002    downsample: bool = True,
2003    columns: int | None = None,
2004    fps: float | None = None,
2005    bps: int | None = None,
2006    qp: int | None = None,
2007    codec: str = 'h264',
2008    ylabel: str = '',
2009    html_class: str = 'show_videos',
2010    return_html: bool = False,
2011    **kwargs: Any,
2012) -> str | None:
2013  """Displays a row of videos in the IPython notebook.
2014
2015  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
2016  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
2017  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
2018  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
2019  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.
2020
2021  If a directory has been specified using `set_show_save_dir`, also saves each
2022  titled video to a file in that directory based on its title.
2023
2024  Args:
2025    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
2026      must be an iterable of images.  If a video object has a `metadata`
2027      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
2028    titles: Optional strings shown above the corresponding videos.
2029    width: Optional, overrides displayed width (in pixels).
2030    height: Optional, overrides displayed height (in pixels).
2031    downsample: If True, each video whose width or height is greater than the
2032      specified `width` or `height` is resampled to the display resolution. This
2033      improves antialiasing and reduces the size of the notebook.
2034    columns: Optional, maximum number of videos per row.
2035    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
2036    bps: Bits-per-second bitrate (default None).
2037    qp: Quantization parameter for video compression quality (default None).
2038    codec: Compression algorithm; must be either 'h264' or 'gif'.
2039    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
2040    html_class: CSS class name used in definition of HTML element.
2041    return_html: If `True` return the raw HTML `str` instead of displaying.
2042    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
2043      `html_from_compressed_video`.
2044
2045  Returns:
2046    html string if `return_html` is `True`.
2047  """
2048  if isinstance(videos, Mapping):
2049    if titles is not None:
2050      raise ValueError(
2051          'Cannot have both a video dictionary and a titles parameter.'
2052      )
2053    list_titles = list(videos.keys())
2054    list_videos = list(videos.values())
2055  else:
2056    list_videos = list(cast('Iterable[_NDArray]', videos))
2057    list_titles = [None] * len(list_videos) if titles is None else list(titles)
2058    if len(list_videos) != len(list_titles):
2059      raise ValueError(
2060          'Number of videos does not match number of titles'
2061          f' ({len(list_videos)} vs {len(list_titles)}).'
2062      )
2063  if codec not in {'h264', 'gif'}:
2064    raise ValueError(f'Codec {codec} is neither h264 nor gif.')
2065
2066  html_strings = []
2067  for video, title in zip(list_videos, list_titles):
2068    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
2069    first_image, video = _peek_first(video)
2070    w, h = _get_width_height(width, height, first_image.shape[:2])
2071    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
2072      # Not resize_video() because each image may have different depth and type.
2073      video = [resize_image(image, (h, w)) for image in video]
2074      first_image = video[0]
2075    data = compress_video(
2076        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
2077    )
2078    if title is not None and _config.show_save_dir:
2079      suffix = _filename_suffix_from_codec(codec)
2080      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
2081      with _open(path, mode='wb') as f:
2082        f.write(data)
2083    if codec == 'gif':
2084      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
2085      html_string = html_from_compressed_image(
2086          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
2087      )
2088    else:
2089      html_string = html_from_compressed_video(
2090          data, w, h, title=title, **kwargs
2091      )
2092    html_strings.append(html_string)
2093
2094  # Create single-row tables each with no more than 'columns' elements.
2095  table_strings = []
2096  for row_html_strings in _chunked(html_strings, columns):
2097    td = '<td style="padding:1px;">'
2098    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
2099    if ylabel:
2100      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
2101      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
2102    table_strings.append(
2103        f'<table class="{html_class}"'
2104        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
2105    )
2106  s = ''.join(table_strings)
2107  if return_html:
2108    return s
2109  _display_html(s)
2110  return None
2111
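The row layout at the end of `show_videos` relies on `_chunked` to split the HTML fragments into groups of at most `columns` items (with `columns=None` meaning a single row). A plausible standalone equivalent, assuming that behavior:

```python
import itertools

def chunked(items, n=None):
    # Assumed behavior of _chunked: yield successive lists of at most n items;
    # n=None yields everything as one chunk.
    items = list(items)
    size = len(items) if n is None else n
    it = iter(items)
    while True:
        chunk = list(itertools.islice(it, size))
        if not chunk:
            return
        yield chunk

assert list(chunked(['a', 'b', 'c', 'd', 'e'], 2)) == [['a', 'b'], ['c', 'd'], ['e']]
assert list(chunked(['a', 'b'], None)) == [['a', 'b']]
```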
2112
2113# Local Variables:
2114# fill-column: 80
2115# End:
977def show_image(
978    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
979) -> str | None:
980  """Displays an image in the notebook and optionally saves it to a file.
981
982  See `show_images`.
983
984  >>> show_image(np.random.rand(100, 100))
985  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
986  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
987  >>> show_image(read_image('/tmp/image.png'))
988  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
989  >>> show_image(read_image(url))
990
991  Args:
992    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
993    title: Optional text shown centered above the image.
994    **kwargs: See `show_images`.
995
996  Returns:
997    html string if `return_html` is `True`.
998  """
999  return show_images([np.asarray(image)], [title], **kwargs)

1002def show_images(
1003    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
1004    titles: Iterable[str | None] | None = None,
1005    *,
1006    width: int | None = None,
1007    height: int | None = None,
1008    downsample: bool = True,
1009    columns: int | None = None,
1010    vmin: float | None = None,
1011    vmax: float | None = None,
1012    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1013    border: bool | str = False,
1014    ylabel: str = '',
1015    html_class: str = 'show_images',
1016    pixelated: bool | None = None,
1017    return_html: bool = False,
1018) -> str | None:
1019  """Displays a row of images in the IPython/Jupyter notebook.
1020
1021  If a directory has been specified using `set_show_save_dir`, also saves each
1022  titled image to a file in that directory based on its title.
1023
1024  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1025  >>> show_images([image1, image2])
1026  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
1027  >>> show_images([image1, image2] * 5, columns=4, border=True)
1028
1029  Args:
1030    images: Iterable of images, or dictionary of `{title: image}`.  Each image
1031      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
1032    titles: Optional strings shown above the corresponding images.
1033    width: Optional, overrides displayed width (in pixels).
1034    height: Optional, overrides displayed height (in pixels).
1035    downsample: If True, each image whose width or height is greater than the
1036      specified `width` or `height` is resampled to the display resolution. This
1037      improves antialiasing and reduces the size of the notebook.
1038    columns: Optional, maximum number of images per row.
1039    vmin: For single-channel image, explicit min value for display.
1040    vmax: For single-channel image, explicit max value for display.
1041    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1042      3D color.
1043    border: If `bool`, whether to place a black boundary around the image, or if
1044      `str`, the boundary CSS style.
1045    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
1046    html_class: CSS class name used in definition of HTML element.
1047    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
1048      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
1049      only on images for which `width` or `height` introduces magnification.
1050    return_html: If `True` return the raw HTML `str` instead of displaying.
1051
1052  Returns:
1053    html string if `return_html` is `True`.
1054  """
1055  if isinstance(images, Mapping):
1056    if titles is not None:
1057      raise ValueError('Cannot have both an images dictionary and a titles parameter.')
1058    list_titles, list_images = list(images.keys()), list(images.values())
1059  else:
1060    list_images = list(images)
1061    list_titles = [None] * len(list_images) if titles is None else list(titles)
1062    if len(list_images) != len(list_titles):
1063      raise ValueError(
1064          'Number of images does not match number of titles'
1065          f' ({len(list_images)} vs {len(list_titles)}).'
1066      )
1067
1068  list_images = [
1069      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1070      for image in list_images
1071  ]
1072
1073  def maybe_downsample(image: _NDArray) -> _NDArray:
1074    shape = image.shape[0], image.shape[1]
1075    w, h = _get_width_height(width, height, shape)
1076    if w < shape[1] or h < shape[0]:
1077      image = resize_image(image, (h, w))
1078    return image
1079
1080  if downsample:
1081    list_images = [maybe_downsample(image) for image in list_images]
1082  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1083
1084  for title, png_data in zip(list_titles, png_datas):
1085    if title is not None and _config.show_save_dir:
1086      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
1087      with _open(path, mode='wb') as f:
1088        f.write(png_data)
1089
1090  def html_from_compressed_images() -> str:
1091    html_strings = []
1092    for image, title, png_data in zip(list_images, list_titles, png_datas):
1093      w, h = _get_width_height(width, height, image.shape[:2])
1094      magnified = h > image.shape[0] or w > image.shape[1]
1095      pixelated2 = pixelated if pixelated is not None else magnified
1096      html_strings.append(
1097          html_from_compressed_image(
1098              png_data, w, h, title=title, border=border, pixelated=pixelated2
1099          )
1100      )
1101    # Create single-row tables each with no more than 'columns' elements.
1102    table_strings = []
1103    for row_html_strings in _chunked(html_strings, columns):
1104      td = '<td style="padding:1px;">'
1105      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
1106      if ylabel:
1107        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
1108        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
1109      table_strings.append(
1110          f'<table class="{html_class}"'
1111          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
1112      )
1113    return ''.join(table_strings)
1114
1115  s = html_from_compressed_images()
1116  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
1117    warnings.warn('mediapy: subsampling images to reduce HTML size')
1118    list_images = [image[::2, ::2] for image in list_images]
1119    png_datas = [compress_image(to_uint8(image)) for image in list_images]
1120    s = html_from_compressed_images()
1121  if return_html:
1122    return s
1123  _display_html(s)
1124  return None
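When the generated HTML would exceed the IPython size limit, `show_images` repeatedly halves each image's resolution until it fits. The same loop, with a hypothetical size function standing in for the regenerated HTML length:

```python
import numpy as np

def subsample_until(list_images, encoded_size, limit):
    # Mirrors show_images' fallback; `encoded_size` is a hypothetical stand-in
    # for the length of the regenerated HTML string.
    while encoded_size(list_images) > limit:
        list_images = [image[::2, ::2] for image in list_images]
    return list_images

images = [np.zeros((64, 64))]
out = subsample_until(images, lambda ims: ims[0].size, 300)
assert out[0].shape == (16, 16)  # 4096 -> 1024 -> 256 pixels, now under the limit.
```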

1127def compare_images(
1128    images: Iterable[_ArrayLike],
1129    *,
1130    vmin: float | None = None,
1131    vmax: float | None = None,
1132    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1133) -> None:
1134  """Compare two images using an interactive slider.
1135
1136  Displays an HTML slider component to interactively swipe between two images.
1137  The slider functionality requires that the web browser have Internet access.
1138  See additional info in `https://github.com/sneas/img-comparison-slider`.
1139
1140  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1141  >>> compare_images([image1, image2])
1142
1143  Args:
1144    images: Iterable of images.  Each image must be either a 2D array or a 3D
1145      array with 1, 3, or 4 channels.  There must be exactly two images.
1146    vmin: For single-channel image, explicit min value for display.
1147    vmax: For single-channel image, explicit max value for display.
1148    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1149      3D color.
1150  """
1151  list_images = [
1152      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1153      for image in images
1154  ]
1155  if len(list_images) != 2:
1156    raise ValueError('The number of images must be 2.')
1157  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1158  b64_1, b64_2 = [
1159      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
1160  ]
1161  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
1162  _display_html(s)

Compare two images using an interactive slider.

Displays an HTML slider component to interactively swipe between two images. The slider functionality requires that the web browser have Internet access. See additional info in https://github.com/sneas/img-comparison-slider.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
  • images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
def show_video( images: Iterable[np.ndarray], *, title: str | None = None, **kwargs: Any) -> str | None:
def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    HTML string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)

def show_videos( videos: Iterable[Iterable[np.ndarray]] | Mapping[str, Iterable[np.ndarray]], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, codec: str = 'h264', ylabel: str = '', html_class: str = 'show_videos', return_html: bool = False, **kwargs: Any) -> str | None:
def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
      must be an iterable of images.  If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True`, return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    HTML string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 nor gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None
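The GIF frame-skipping caveat in the docstring can be checked numerically: a framerate is safe exactly when its frame period is a whole number of 10 ms GIF delay units. A small sketch (the helper name is hypothetical):

```python
def gif_frame_skip(fps: float) -> bool:
  """Returns True if the frame period at `fps` is not a whole multiple of the
  10 ms GIF frame-delay unit, in which case GIF playback skips frames."""
  period_ms = 1000.0 / fps
  return abs(period_ms / 10.0 - round(period_ms / 10.0)) > 1e-9

safe_rates = [fps for fps in (20.0, 25.0, 30.0, 50.0, 60.0)
              if not gif_frame_skip(fps)]  # 30 and 60 fps are excluded.
```

This is why 20.0, 25.0, and 50.0 fps encode cleanly (50, 40, and 20 ms periods), while common video rates like 30.0 and 60.0 fps (33.3 and 16.7 ms periods) do not.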

def read_image( path_or_url: str | os.PathLike[str], *, apply_exif_transpose: bool = True, dtype: DTypeLike = None) -> np.ndarray:
def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path or URL of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)

def write_image( path: str | os.PathLike[str], image: ArrayLike, fmt: str = 'png', **kwargs: Any) -> None:
def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred from `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)
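For float inputs, the docstring describes conversion via `to_uint8` with clamping to [0.0, 1.0]. A sketch of that conversion in plain numpy (only the clamping and scaling are documented above; the round-half-up step is an assumption of this sketch):

```python
import numpy as np

def to_uint8_like(image):
  """Sketch of the documented float -> uint8 conversion: clamp to [0.0, 1.0],
  then scale to [0, 255].  Rounding mode here is assumed, not documented."""
  return (np.clip(image, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)

converted = to_uint8_like(np.array([-1.0, 0.0, 0.5, 1.0, 2.0]))
```

Note that out-of-range values (here -1.0 and 2.0) are clamped rather than wrapped, so overexposed float images saturate to 255 instead of overflowing.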

def read_video( path_or_url: str | os.PathLike[str], **kwargs: Any) -> mediapy._VideoArray:
def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
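The caveat that the `metadata` attribute is lost in most subsequent `numpy` operations can be seen with a minimal ndarray subclass; `_ArrayWithMetadata` below is a hypothetical stand-in for mediapy's `_VideoArray`, not its actual implementation:

```python
import numpy as np

class _ArrayWithMetadata(np.ndarray):
  """Hypothetical stand-in for _VideoArray: an ndarray carrying a `metadata`
  attribute.  Without an __array_finalize__ override, the attribute does not
  propagate through numpy operations."""

  def __new__(cls, array, metadata=None):
    obj = np.asarray(array).view(cls)
    obj.metadata = metadata
    return obj

video = _ArrayWithMetadata(np.zeros((4, 2, 2, 3)), metadata={'fps': 30.0})
fps = video.metadata['fps']  # Available right after construction.
darkened = video * 0.5  # A numpy operation; the result lacks `metadata`.
surviving_metadata = getattr(darkened, 'metadata', None)
```

This is why `show_video(video)` can pick up `metadata.fps` directly after `read_video`, but not after an intervening operation such as `video * 0.5`.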

def write_video( path: str | os.PathLike[str], images: Iterable[np.ndarray], **kwargs: Any) -> None:
def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape = first_image.shape[0], first_image.shape[1]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)
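The dtype promotion performed above (bool frames written as uint8, float frames as uint16, integer frames kept as-is) can be factored into a small helper; the function name is hypothetical, but the logic mirrors the body of `write_video`:

```python
import numpy as np

def writer_dtype(frame_dtype) -> np.dtype:
  """Mirrors write_video's dtype selection for VideoWriter: bool frames are
  encoded as uint8, float frames as uint16, and integer frames unchanged."""
  dtype = np.dtype(frame_dtype)
  if dtype == bool:
    return np.dtype(np.uint8)
  if np.issubdtype(dtype, np.floating):
    return np.dtype(np.uint16)
  return dtype
```

Promoting floats to uint16 (rather than uint8) preserves precision when the encoder is configured for more than 8 bits per channel.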

class VideoReader(_VideoIO):
class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames expected from the video stream.  This is
      estimated from the framerate and the duration stored in the video header,
      so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from. The default is 0.
    sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox.
      If None, the default limit is 30 minutes. Unused in open source.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      sandbox_max_run_time_secs: int | None = None,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.sandbox_max_run_time_secs = sandbox_max_run_time_secs
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
          sandbox_max_run_time_secs=self.sandbox_max_run_time_secs,
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.

    Raises:
      RuntimeError: If there is an error reading from the output file.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    if len(data) != self._num_bytes_per_image:
      self._proc.wait()
      stderr = self._proc.stderr
      stderr_output = ''
      if stderr is not None:
        stderr_output = stderr.read().decode('utf-8', errors='replace').strip()
      raise RuntimeError(
          f'ffmpeg exited with code {self._proc.returncode}.\nIncomplete'
          f' frame read: expected {self._num_bytes_per_image} bytes, but got'
          f' {len(data)}.\nffmpeg stderr:\n{stderr_output}'
      )
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None
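The number of bytes that `read` pulls from the ffmpeg pipe per frame follows directly from the frame shape, the channel count of the output format, and the dtype, as computed in `__enter__`. A small sketch of that arithmetic (the helper name is hypothetical):

```python
import math

def num_bytes_per_image(shape: tuple[int, int],
                        output_format: str,
                        bytes_per_channel: int) -> int:
  """Raw frame size read from the ffmpeg rawvideo pipe, mirroring the
  computation of _num_bytes_per_image in VideoReader.__enter__."""
  num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[output_format]
  return math.prod(shape) * num_channels * bytes_per_channel

size_rgb8 = num_bytes_per_image((480, 640), 'rgb', 1)   # uint8 RGB frame.
size_gray16 = num_bytes_per_image((480, 640), 'gray', 2)  # uint16 gray frame.
```

Each `read` call consumes exactly this many bytes, so a short read signals either end-of-file or an ffmpeg failure, which is why the incomplete-read branch inspects the subprocess return code and stderr.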

class VideoWriter(_VideoIO):
1559class VideoWriter(_VideoIO):
1560  """Context to write a compressed video.
1561
1562  >>> shape = 480, 640
1563  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
1564  ...   for image in moving_circle(shape, num_images=60):
1565  ...     writer.add_image(image)
1566  >>> show_video(read_video('/tmp/v.mp4'))
1567
1568
1569  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
1570  If none are specified, `qp` is set to a default value.
1571  See https://slhck.info/video/2017/03/01/rate-control.html
1572
1573  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
1574  ignored.
1575
1576  Attributes:
1577    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
1578      format.  The suffix must be '.gif' if the codec is 'gif'.
1579    shape: 2D spatial dimensions (height, width) of video image frames.  The
1580      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
1581      'yuv420p' or 'yuv420p10le').
1582    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
1583      'hevc', 'vp9', or 'gif').
1584    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
1585      used if not specified as explicit parameters.
1586    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
1587    bps: Requested average bits-per-second bitrate (default None).
1588    qp: Quantization parameter for video compression quality (default None).
1589    crf: Constant rate factor for video compression quality (default None).
1590    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
1591      introduce I-frames, or '-bf 0' to omit B-frames.
1592    input_format: Format of input images (default 'rgb').  If 'rgb', each image
1593      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
1594      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
1595      shape=(height, width).
1596    dtype: Expected data type for input images (any float input images are
1597      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
1598      necessary when encoding >8 bits/channel.
1599    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
1600      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
1601      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
1602      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
1603    sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox.
1604      If None, the default limit is 30 minutes. Unused in open source.
1605  """
1606
1607  def __init__(
1608      self,
1609      path: _Path,
1610      shape: tuple[int, int],
1611      *,
1612      codec: str = 'h264',
1613      metadata: VideoMetadata | None = None,
1614      fps: float | None = None,
1615      bps: int | None = None,
1616      qp: int | None = None,
1617      crf: float | None = None,
1618      ffmpeg_args: str | Sequence[str] = '',
1619      input_format: str = 'rgb',
1620      dtype: _DTypeLike = np.uint8,
1621      encoded_format: str | None = None,
1622      sandbox_max_run_time_secs: int | None = None,
1623  ) -> None:
1624    _check_2d_shape(shape)
1625    if fps is None and metadata:
1626      fps = metadata.fps
1627    if fps is None:
1628      fps = 25.0 if codec == 'gif' else 60.0
1629    if fps <= 0.0:
1630      raise ValueError(f'Frame-per-second value {fps} is invalid.')
1631    if bps is None and metadata:
1632      bps = metadata.bps
1633    bps = int(bps) if bps is not None else None
1634    if bps is not None and bps <= 0:
1635      raise ValueError(f'Bitrate value {bps} is invalid.')
1636    if qp is not None and (not isinstance(qp, int) or qp < 0):
1637      raise ValueError(
1638          f'Quantization parameter {qp} is invalid. It must be a'
1639          ' non-negative integer.'
1640      )
1641    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
1642    if num_rate_specifications > 1:
1643      raise ValueError(
1644          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
1645      )
1646    ffmpeg_args = (
1647        shlex.split(ffmpeg_args)
1648        if isinstance(ffmpeg_args, str)
1649        else list(ffmpeg_args)
1650    )
1651    if input_format not in {'rgb', 'yuv', 'gray'}:
1652      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
1653    dtype = np.dtype(dtype)
1654    if dtype.type not in (np.uint8, np.uint16):
1655      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1656    self.path = pathlib.Path(path)
1657    self.shape = shape
1658    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
1659    if encoded_format is None:
1660      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
1661    if not all_dimensions_are_even and encoded_format.startswith(
1662        ('yuv42', 'yuvj42')
1663    ):
1664      raise ValueError(
1665          f'With encoded_format {encoded_format}, video dimensions must be'
1666          f' even, but shape is {shape}.'
1667      )
1668    self.fps = fps
1669    self.codec = codec
1670    self.bps = bps
1671    self.qp = qp
1672    self.crf = crf
1673    self.ffmpeg_args = ffmpeg_args
1674    self.input_format = input_format
1675    self.dtype = dtype
1676    self.encoded_format = encoded_format
1677    self.sandbox_max_run_time_secs = sandbox_max_run_time_secs
1678    if num_rate_specifications == 0 and not ffmpeg_args:
1679      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
1680    self._bitrate_args = (
1681        (['-vb', f'{bps}'] if bps is not None else [])
1682        + (['-qp', f'{qp}'] if qp is not None else [])
1683        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
1684    )
1685    if self.codec == 'gif':
1686      if self.path.suffix != '.gif':
1687        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
1688      self.encoded_format = 'pal8'
1689      self._bitrate_args = []
1690      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
1691      # Less common (and likely less useful) is a per-frame color palette:
1692      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
1693      #                 '[s1][p]paletteuse=new=1')
1694      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
1695    self._write_via_local_file: Any = None
1696    self._popen: subprocess.Popen[bytes] | None = None
1697    self._proc: subprocess.Popen[bytes] | None = None
1698
1699  def __enter__(self) -> 'VideoWriter':
1700    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
1701    try:
1702      self._write_via_local_file = _write_via_local_file(self.path)
1703      # pylint: disable-next=no-member
1704      tmp_name = self._write_via_local_file.__enter__()
1705
1706      # Writing to stdout using ('-f', 'mp4', '-') would require
1707      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
1708      height, width = self.shape
1709      command = (
1710          [
1711              '-v',
1712              'error',
1713              '-f',
1714              'rawvideo',
1715              '-vcodec',
1716              'rawvideo',
1717              '-pix_fmt',
1718              input_pix_fmt,
1719              '-s',
1720              f'{width}x{height}',
1721              '-r',
1722              f'{self.fps}',
1723              '-i',
1724              '-',
1725              '-an',
1726              '-vcodec',
1727              self.codec,
1728              '-pix_fmt',
1729              self.encoded_format,
1730          ]
1731          + self._bitrate_args
1732          + self.ffmpeg_args
1733          + ['-y', tmp_name]
1734      )
1735      self._popen = _run_ffmpeg(
1736          command,
1737          stdin=subprocess.PIPE,
1738          stderr=subprocess.PIPE,
1739          allowed_output_files=[tmp_name],
1740          sandbox_max_run_time_secs=self.sandbox_max_run_time_secs,
1741      )
1742      self._proc = self._popen.__enter__()
1743    except Exception:
1744      self.__exit__(None, None, None)
1745      raise
1746    return self
1747
1748  def __exit__(self, *_: Any) -> None:
1749    self.close()
1750
1751  def add_image(self, image: _NDArray) -> None:
1752    """Writes a video frame.
1753
1754    Args:
1755      image: Array whose dtype and first two dimensions must match the `dtype`
1756        and `shape` specified in `VideoWriter` initialization.  If
1757        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
1758        input_format, the image may be either 2D (interpreted as grayscale) or
1759        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
1760        must be 3D with three (Y, U, V) channels.
1761
1762    Raises:
1763      RuntimeError: If there is an error writing to the output file.
1764    """
1765    assert self._proc, 'Error: writing to an already closed context.'
1766    if issubclass(image.dtype.type, (np.floating, np.bool_)):
1767      image = to_type(image, self.dtype)
1768    if image.dtype != self.dtype:
1769      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
1770    if self.input_format == 'gray':
1771      if image.ndim != 2:
1772        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
1773    else:
1774      if image.ndim == 2 and self.input_format == 'rgb':
1775        image = np.dstack((image, image, image))
1776      if not (image.ndim == 3 and image.shape[2] == 3):
1777        raise ValueError(f'Image dimensions {image.shape} are invalid.')
1778    if image.shape[:2] != self.shape:
1779      raise ValueError(
1780          f'Image dimensions {image.shape[:2]} do not match'
1781          f' those of the initialized video {self.shape}.'
1782      )
1783    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1784      image = np.moveaxis(image, 2, 0)
1785    data = image.tobytes()
1786    stdin = self._proc.stdin
1787    assert stdin is not None
1788    if stdin.write(data) != len(data):
1789      self._proc.wait()
1790      stderr = self._proc.stderr
1791      assert stderr is not None
1792      s = stderr.read().decode('utf-8')
1793      raise RuntimeError(f"Error writing '{self.path}': {s}")
1794
1795  def close(self) -> None:
1796    """Finishes writing the video.  (Called automatically at end of context.)"""
1797    if self._popen:
1798      assert self._proc, 'Error: closing an already closed context.'
1799      stdin = self._proc.stdin
1800      assert stdin is not None
1801      stdin.close()
1802      if self._proc.wait():
1803        stderr = self._proc.stderr
1804        assert stderr is not None
1805        s = stderr.read().decode('utf-8')
1806        raise RuntimeError(f"Error writing '{self.path}': {s}")
1807      self._popen.__exit__(None, None, None)
1808      self._popen = None
1809      self._proc = None
1810    if self._write_via_local_file:
1811      # pylint: disable-next=no-member
1812      self._write_via_local_file.__exit__(None, None, None)
1813      self._write_via_local_file = None

Context to write a compressed video.

>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))

Bitrate control may be specified using at most one of bps, qp, or crf. If none is specified, qp is set to a resolution-dependent default (20 for frames up to 640×480 pixels, else 28). See https://slhck.info/video/2017/03/01/rate-control.html for background on these rate-control modes.

If codec is 'gif', the args bps, qp, crf, and encoded_format are ignored.
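
The default-qp rule can be sketched as a small helper (the name `default_qp` is hypothetical; the threshold mirrors the one in `VideoWriter.__init__`):

```python
import math


def default_qp(shape: tuple[int, int]) -> int:
  """Returns the default quantization parameter used when no rate control
  (bps, qp, crf) and no extra ffmpeg_args are specified."""
  # Smaller frames (up to 640x480 pixels) get a finer quantizer (higher quality).
  return 20 if math.prod(shape) <= 640 * 480 else 28
```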

Attributes:
  • path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
  • shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
  • codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
  • bps: Requested average bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • crf: Constant rate factor for video compression quality (default None).
  • ffmpeg_args: Additional arguments for ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
  • input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Expected data type for input images (any float input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
  • encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  • sandbox_max_run_time_secs: The maximum time in seconds to run the sandbox. If None, the default limit is 30 minutes. Unused in open source.
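
For the 'yuv' input_format, frames are handed to ffmpeg in planar layout; the per-pixel-to-planar step reduces to a single axis move. A numpy sketch of the conversion done inside `add_image`, on a hypothetical 2×2 frame:

```python
import numpy as np

# A tiny 2x2 frame with interleaved (Y, U, V) values per pixel.
frame = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

# Move the channel axis to the front: shape (3, height, width), i.e. the
# Y plane followed by the U and V planes, as planar rawvideo expects.
planar = np.moveaxis(frame, 2, 0)
```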
VideoWriter( path: str | os.PathLike[str], shape: tuple[int, int], *, codec: str = 'h264', metadata: VideoMetadata | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, crf: float | None = None, ffmpeg_args: str | Sequence[str] = '', input_format: str = 'rgb', dtype: DTypeLike = <class 'numpy.uint8'>, encoded_format: str | None = None, sandbox_max_run_time_secs: int | None = None)
def add_image(self, image: np.ndarray) -> None:

Writes a video frame.

Arguments:
  • image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
Raises:
  • RuntimeError: If there is an error writing to the output file.
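
The 2D-to-RGB promotion described above (for the 'rgb' input_format) is simple channel replication; a numpy sketch:

```python
import numpy as np

# A hypothetical 4x6 grayscale frame.
gray = np.random.default_rng(0).integers(0, 256, (4, 6), dtype=np.uint8)

# A 2D image is interpreted as grayscale: replicate it into three channels
# to obtain the (height, width, 3) shape that the encoder pipeline expects.
rgb = np.dstack((gray, gray, gray))
```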
def close(self) -> None:

Finishes writing the video. (Called automatically at end of context.)

class VideoMetadata(typing.NamedTuple):
1274class VideoMetadata(NamedTuple):
1275  """Represents the data stored in a video container header.
1276
1277  Attributes:
1278    num_images: Number of frames expected from the video stream.  This is
1279      estimated from the framerate and the duration stored in the video
1280      header, so it might be inexact.  We set the value to -1 if the number
1281      of frames is not found in the header.
1282    shape: The dimensions (height, width) of each video frame.
1283    fps: The framerate in frames per second.
1284    bps: The estimated bitrate of the video stream in bits per second, retrieved
1285      from the video header.
1286  """
1287
1288  num_images: int
1289  shape: tuple[int, int]
1290  fps: float
1291  bps: int | None

Represents the data stored in a video container header.

Attributes:
  • num_images: Number of frames expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. We set the value to -1 if the number of frames is not found in the header.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
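
Since `VideoMetadata` is a plain `NamedTuple`, it can be constructed directly. The sketch below redeclares the fields from the source and shows the kind of frame-count estimate the docstring mentions (framerate times duration, hence possibly inexact); the `duration_secs` value is hypothetical:

```python
import typing


class VideoMetadata(typing.NamedTuple):
  # Mirrors the fields declared in the mediapy source.
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: typing.Optional[int]


duration_secs = 2.5  # Hypothetical duration read from a container header.
fps = 60.0
metadata = VideoMetadata(
    num_images=round(fps * duration_secs),  # Estimate; may be inexact.
    shape=(480, 640),
    fps=fps,
    bps=None,
)
```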
def compress_image(image: ArrayLike, *, fmt: str = 'png', **kwargs: Any) -> bytes:
859def compress_image(
860    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
861) -> bytes:
862  """Returns a buffer containing a compressed image.
863
864  Args:
865    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
866    fmt: Desired compression encoding, e.g. 'png'.
867    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
868      compression.
869  """
870  image = _as_valid_media_array(image)
871  with io.BytesIO() as output:
872    _pil_image(image).save(output, format=fmt, **kwargs)
873    return output.getvalue()

Returns a buffer containing a compressed image.

Arguments:
  • image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Options for PIL.save(), e.g. optimize=True for greater compression.
def decompress_image( data: bytes, dtype: DTypeLike = None, apply_exif_transpose: bool = True) -> np.ndarray:
876def decompress_image(
877    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
878) -> _NDArray:
879  """Returns an image from a compressed data buffer.
880
881  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
882  or 4 channels and `uint16` images with a single channel.
883
884  Args:
885    data: Buffer containing compressed image.
886    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
887      is inferred automatically.
888    apply_exif_transpose: If True, rotate image according to EXIF orientation.
889  """
890  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
891  if apply_exif_transpose:
892    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
893    assert tmp_image
894    pil_image = tmp_image
895  if dtype is None:
896    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
897  return np.array(pil_image, dtype=dtype)

Returns an image from a compressed data buffer.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • data: Buffer containing compressed image.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
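
A round trip through these two functions reduces to `PIL` calls; a minimal sketch (assuming Pillow is installed) that compresses a uint8 array to PNG bytes and decodes it back:

```python
import io

import numpy as np
import PIL.Image

image = np.random.default_rng(1).integers(0, 256, (8, 8, 3), dtype=np.uint8)

# Compress: encode the array as PNG into an in-memory buffer.
with io.BytesIO() as output:
  PIL.Image.fromarray(image).save(output, format='png')
  data = output.getvalue()

# Decompress: decode the buffer back to a numpy array (PNG is lossless).
decoded = np.array(PIL.Image.open(io.BytesIO(data)))
```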
def compress_video( images: Iterable[np.ndarray], *, codec: str = 'h264', **kwargs: Any) -> bytes:
1887def compress_video(
1888    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
1889) -> bytes:
1890  """Returns a buffer containing a compressed video.
1891
1892  The video container is 'gif' for the 'gif' codec, 'webm' for the 'vp9'
1893  codec, and 'mp4' otherwise.
1894
1895  >>> video = read_video('/tmp/river.mp4')
1896  >>> data = compress_video(video, bps=10_000_000)
1897  >>> print(len(data))
1898
1899  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
1900
1901  Args:
1902    images: Iterable over video frames.
1903    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
1904      'hevc', 'vp9', or 'gif').
1905    **kwargs: Additional parameters for `VideoWriter`.
1906
1907  Returns:
1908    A bytes buffer containing the compressed video.
1909  """
1910  suffix = _filename_suffix_from_codec(codec)
1911  with tempfile.TemporaryDirectory() as directory_name:
1912    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
1913    write_video(tmp_path, images, codec=codec, **kwargs)
1914    return tmp_path.read_bytes()

Returns a buffer containing a compressed video.

The video container is 'gif' for the 'gif' codec, 'webm' for the 'vp9' codec, and 'mp4' otherwise.

>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))
>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
Arguments:
  • images: Iterable over video frames.
  • codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • **kwargs: Additional parameters for VideoWriter.
Returns:

A bytes buffer containing the compressed video.
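
The container choice can be sketched as a tiny helper (the name `filename_suffix_from_codec` is hypothetical; the mapping follows the docstring above):

```python
def filename_suffix_from_codec(codec: str) -> str:
  """Returns the container suffix: '.gif' for 'gif', '.webm' for 'vp9',
  and '.mp4' for everything else (e.g. 'h264', 'hevc')."""
  return {'gif': '.gif', 'vp9': '.webm'}.get(codec, '.mp4')
```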

def decompress_video(data: bytes, **kwargs: Any) -> np.ndarray:
1917def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
1918  """Returns video images from an MP4-compressed data buffer."""
1919  with tempfile.TemporaryDirectory() as directory_name:
1920    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
1921    tmp_path.write_bytes(data)
1922    return read_video(tmp_path, **kwargs)

Returns video images from an MP4-compressed data buffer.

def html_from_compressed_image( data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, pixelated: bool = True, fmt: str = 'png') -> str:
900def html_from_compressed_image(
901    data: bytes,
902    width: int,
903    height: int,
904    *,
905    title: str | None = None,
906    border: bool | str = False,
907    pixelated: bool = True,
908    fmt: str = 'png',
909) -> str:
910  """Returns an HTML string with an image tag containing encoded data.
911
912  Args:
913    data: Compressed image bytes.
914    width: Width of HTML image in pixels.
915    height: Height of HTML image in pixels.
916    title: Optional text shown centered above image.
917    border: If `bool`, whether to place a black boundary around the image, or if
918      `str`, the boundary CSS style.
919    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
920    fmt: Compression encoding.
921  """
922  b64 = base64.b64encode(data).decode('utf-8')
923  if isinstance(border, str):
924    border = f'{border}; '
925  elif border:
926    border = 'border:1px solid black; '
927  else:
928    border = ''
929  s_pixelated = 'pixelated' if pixelated else 'auto'
930  s = (
931      f'<img width="{width}" height="{height}"'
932      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
933      f' src="data:image/{fmt};base64,{b64}"/>'
934  )
935  if title is not None:
936    s = f"""<div style="display:flex; align-items:left;">
937      <div style="display:flex; flex-direction:column; align-items:center;">
938      <div>{title}</div><div>{s}</div></div></div>"""
939  return s

Returns an HTML string with an image tag containing encoded data.

Arguments:
  • data: Compressed image bytes.
  • width: Width of HTML image in pixels.
  • height: Height of HTML image in pixels.
  • title: Optional text shown centered above image.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
  • fmt: Compression encoding.
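
Stripped of the title and border options, the function above boils down to base64-encoding the bytes into a data URI; a self-contained sketch (`minimal_img_tag` is a hypothetical name):

```python
import base64


def minimal_img_tag(data: bytes, width: int, height: int, fmt: str = 'png') -> str:
  # Embed the compressed image directly in the HTML via a base64 data URI.
  b64 = base64.b64encode(data).decode('utf-8')
  return (
      f'<img width="{width}" height="{height}"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


tag = minimal_img_tag(b'\x89PNG', 32, 32)
```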
def html_from_compressed_video( data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, loop: bool = True, autoplay: bool = True) -> str:
1925def html_from_compressed_video(
1926    data: bytes,
1927    width: int,
1928    height: int,
1929    *,
1930    title: str | None = None,
1931    border: bool | str = False,
1932    loop: bool = True,
1933    autoplay: bool = True,
1934) -> str:
1935  """Returns an HTML string with a video tag containing H264-encoded data.
1936
1937  Args:
1938    data: MP4-compressed video bytes.
1939    width: Width of HTML video in pixels.
1940    height: Height of HTML video in pixels.
1941    title: Optional text shown centered above the video.
1942    border: If `bool`, whether to place a black boundary around the image, or if
1943      `str`, the boundary CSS style.
1944    loop: If True, the playback repeats forever.
1945    autoplay: If True, video playback starts without having to click.
1946  """
1947  b64 = base64.b64encode(data).decode('utf-8')
1948  if isinstance(border, str):
1949    border = f'{border}; '
1950  elif border:
1951    border = 'border:1px solid black; '
1952  else:
1953    border = ''
1954  options = (
1955      f'controls width="{width}" height="{height}"'
1956      f' style="{border}object-fit:cover;"'
1957      f'{" loop" if loop else ""}'
1958      f'{" autoplay muted" if autoplay else ""}'
1959  )
1960  s = f"""<video {options}>
1961      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
1962      This browser does not support the video tag.
1963      </video>"""
1964  if title is not None:
1965    s = f"""<div style="display:flex; align-items:left;">
1966      <div style="display:flex; flex-direction:column; align-items:center;">
1967      <div>{title}</div><div>{s}</div></div></div>"""
1968  return s

Returns an HTML string with a video tag containing H264-encoded data.

Arguments:
  • data: MP4-compressed video bytes.
  • width: Width of HTML video in pixels.
  • height: Height of HTML video in pixels.
  • title: Optional text shown centered above the video.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • loop: If True, the playback repeats forever.
  • autoplay: If True, video playback starts without having to click.
def resize_image(image: ArrayLike, shape: tuple[int, int]) -> np.ndarray:
615def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
616  """Resizes image to specified spatial dimensions using a Lanczos filter.
617
618  Args:
619    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
620    shape: 2D spatial dimensions (height, width) of output image.
621
622  Returns:
623    A resampled image whose spatial dimensions match `shape`.
624  """
625  image = _as_valid_media_array(image)
626  if image.ndim not in (2, 3):
627    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
628  _check_2d_shape(shape)
629
630  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
631  # and it can be resized only if it is uint8 or float32.
632  supported_single_channel = (
633      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
634  ) and image.ndim == 2
635  supported_multichannel = (
636      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
637  )
638  if supported_single_channel or supported_multichannel:
639    return np.array(
640        _pil_image(image).resize(
641            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
642        ),
643        dtype=image.dtype,
644    )
645  if image.ndim == 2:
646    # We convert to floating-point for resizing and convert back.
647    return to_type(resize_image(to_float01(image), shape), image.dtype)
648  # We resize each image channel individually.
649  return np.dstack(
650      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
651  )

Resizes image to specified spatial dimensions using a Lanczos filter.

Arguments:
  • image: Array-like 2D or 3D object, where dtype is uint or floating-point.
  • shape: 2D spatial dimensions (height, width) of output image.
Returns:

A resampled image whose spatial dimensions match shape.

def resize_video(video: Iterable[np.ndarray], shape: tuple[int, int]) -> np.ndarray:
657def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
658  """Resizes `video` to specified spatial dimensions using a Lanczos filter.
659
660  Args:
661    video: Iterable of images.
662    shape: 2D spatial dimensions (height, width) of output video.
663
664  Returns:
665    A resampled video whose spatial dimensions match `shape`.
666  """
667  _check_2d_shape(shape)
668  return np.array([resize_image(image, shape) for image in video])

Resizes video to specified spatial dimensions using a Lanczos filter.

Arguments:
  • video: Iterable of images.
  • shape: 2D spatial dimensions (height, width) of output video.
Returns:

A resampled video whose spatial dimensions match shape.

def to_rgb( array: ArrayLike, *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> np.ndarray:
815def to_rgb(
816    array: _ArrayLike,
817    *,
818    vmin: float | None = None,
819    vmax: float | None = None,
820    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
821) -> _NDArray:
822  """Maps scalar values to RGB using value bounds and a color map.
823
824  Args:
825    array: Scalar values, with arbitrary shape.
826    vmin: Explicit min value for remapping; if None, it is obtained as the
827      minimum finite value of `array`.
828    vmax: Explicit max value for remapping; if None, it is obtained as the
829      maximum finite value of `array`.
830    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
831      color.
832
833  Returns:
834    A new array in which each element is affinely mapped from [vmin, vmax]
835    to [0.0, 1.0] and then color-mapped.
836  """
837  a = _as_valid_media_array(array)
838  del array
839  # For future numpy version 1.17.0 (which adds the `where` argument):
840  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
841  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
842  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
843  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
844  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
845  if isinstance(cmap, str):
846    if hasattr(matplotlib, 'colormaps'):
847      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
848    else:
849      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
850  else:
851    rgb_from_scalar = cmap
852  a = cast(_NDArray, rgb_from_scalar(a))
853  # If there is a fully opaque alpha channel, remove it.
854  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
855    a = a[..., :3]
856  return a

Maps scalar values to RGB using value bounds and a color map.

Arguments:
  • array: Scalar values, with arbitrary shape.
  • vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
  • vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
  • cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:

A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
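The core of this operation is the affine remap from [vmin, vmax] to [0.0, 1.0] followed by the color map. Below is a minimal numpy-only sketch for the default `'gray'` case (the `to_rgb_gray` name is hypothetical; the library additionally validates the array, supports matplotlib color maps, and strips fully opaque alpha channels):

```python
import numpy as np

def to_rgb_gray(array, vmin=None, vmax=None):
  # Affinely remap [vmin, vmax] to [0, 1], ignoring non-finite values
  # when deriving the bounds, then replicate to three gray channels.
  a = np.asarray(array, dtype=float)
  finite = np.isfinite(a)
  vmin = np.amin(np.where(finite, a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(finite, a, -np.inf)) if vmax is None else vmax
  a = (a - vmin) / (vmax - vmin + np.finfo(float).eps)
  return np.repeat(a[..., None], 3, axis=-1)

rgb = to_rgb_gray(np.array([[0.0, 5.0], [10.0, 2.5]]))
```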

def to_type(array: ArrayLike, dtype: DTypeLike) -> np.ndarray:
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result

Returns media array converted to specified type.

A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.

Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.

Arguments:
  • array: Input array-like object (floating-point, unsigned int, or bool).
  • dtype: Desired output type (floating-point or unsigned int).
Returns:

Array a if it is already of the specified dtype, else a converted array.
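The most common case is converting between float [0.0, 1.0] and uint8 [0, 255]. The essential scaling and rounding can be sketched with two hypothetical helpers (the library version additionally handles bool inputs, wider unsigned types, and overflow at the top of the range):

```python
import numpy as np

def float_to_uint8(a):
  # Clamp to [0, 1], scale to [0, 255], and round by adding 0.5 before
  # truncation, as to_type does for small unsigned destination types.
  return (np.clip(a, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)

def uint8_to_float(a):
  # Map uint8 0..255 onto [0.0, 1.0], so uint(MAX) becomes exactly 1.0.
  return a.astype(np.float32) / 255.0

quantized = float_to_uint8(np.array([0.0, 0.5, 1.0, 2.0]))
restored = uint8_to_float(np.array([0, 255], np.uint8))
```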

def to_float01(a: ArrayLike, dtype: DTypeLike = np.float32) -> np.ndarray:
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)

If array has unsigned integers, rescales them to the range [0.0, 1.0].

Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.

Arguments:
  • a: Input array.
  • dtype: Desired floating-point type if rescaling occurs.
Returns:

A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.

def to_uint8(a: ArrayLike) -> np.ndarray:
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)

Returns array converted to uint8 values; see to_type.

def set_output_height(num_pixels: int) -> None:
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Overrides the height of the current output cell, if using Colab.

def set_max_output_height(num_pixels: int) -> None:
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Sets the maximum height of the current output cell, if using Colab.
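Both helpers use the same graceful-degradation pattern: import `google.colab.output` dynamically so that the call silently becomes a no-op outside Colab. A standalone sketch of that pattern (the `eval_colab_js` name and its boolean return value are hypothetical additions for illustration):

```python
import importlib

def eval_colab_js(statement: str) -> bool:
  # Run a JavaScript statement in the Colab output frame; silently no-op
  # (returning False) in any non-Colab IPython environment.
  try:
    output = importlib.import_module('google.colab.output')
    output.eval_js(statement)
    return True
  except (ModuleNotFoundError, AttributeError):
    return False

ran = eval_colab_js('google.colab.output.setIframeHeight("400px")')
```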

def color_ramp( shape: tuple[int, int] = (64, 64), *, dtype: DTypeLike = np.float32) -> np.ndarray:
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)

Returns an image of a red-green color gradient.

This is useful for quick experimentation and testing. See also moving_circle to generate a sample video.

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated image.
  • dtype: Type (uint or floating) of resulting pixel values.
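The gradient is built from normalized pixel-center coordinates: the y coordinate becomes the red channel and the x coordinate the green channel, with blue held at zero. A numpy-only sketch of that construction (the `color_ramp_sketch` name is hypothetical; the library version also validates the shape and dtype):

```python
import numpy as np

def color_ramp_sketch(shape=(64, 64)):
  # Pixel-center (y, x) coordinates, normalized to (0, 1), become the
  # red and green channels; a zero blue channel is inserted last.
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  return np.insert(yx, 2, 0.0, axis=-1).astype(np.float32)

image = color_ramp_sketch((4, 8))
```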

def moving_circle( shape: tuple[int, int] = (256, 256), num_images: int = 10, *, dtype: DTypeLike = np.float32) -> np.ndarray:
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])

Returns a video of a circle moving in front of a color ramp.

This is useful for quick experimentation and testing. See also color_ramp to generate a sample image.

>>> show_video(moving_circle((480, 640), 60), fps=60)

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated video.
  • num_images: Number of video frames.
  • dtype: Type (uint or floating) of resulting pixel values.
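Each frame paints a white disk onto the ramp using a boolean mask over pixel coordinates. That masking step can be sketched in isolation (the `circle_mask` name is hypothetical):

```python
import numpy as np

def circle_mask(shape, center, radius):
  # Boolean mask of pixels whose squared distance from `center` is less
  # than radius**2, as computed per frame in moving_circle.
  yx = np.moveaxis(np.indices(shape), 0, -1)
  return np.sum((yx - np.asarray(center)) ** 2, axis=-1) < radius**2

mask = circle_mask((50, 50), (25, 25), 10)
```

Advancing `center` horizontally from frame to frame, as `moving_circle` does with `shape[1] * (image_index + 0.5) / num_images`, turns this static mask into motion.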
class set_show_save_dir:
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir

Save all titled output from show_*() calls into files.

If the specified directory is not None, all titled images and videos displayed by show_image, show_images, show_video, and show_videos are also saved as files within the directory.

It can be used either to set the state or as a context manager:

>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)
>>> with set_show_save_dir('/tmp'):
...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
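The dual set-or-context-manager behavior works because `__init__` changes the global state immediately, while `__exit__` restores the previous value only if the instance is used in a `with` statement. A minimal standalone sketch of that pattern (the `set_value` class and its `state` dict are hypothetical stand-ins for the library's internal `_config`):

```python
class set_value:
  # Minimal sketch of the set-or-context-manager pattern.
  state = {'dir': None}  # Stand-in for module-level configuration.

  def __init__(self, directory):
    self._old = set_value.state['dir']
    set_value.state['dir'] = directory  # Takes effect even without `with`.

  def __enter__(self):
    return None

  def __exit__(self, *_):
    set_value.state['dir'] = self._old  # Restore on context exit.
```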
set_show_save_dir(directory: str | os.PathLike[str] | None)
  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory
def set_ffmpeg(name_or_path: str | os.PathLike[str]) -> None:
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path

Specifies the name or path for the ffmpeg external program.

The ffmpeg program is required for compressing and decompressing video. (It is used in read_video, write_video, show_video, show_videos, etc.)

Arguments:
  • name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.
def video_is_available() -> bool:
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None

Returns True if the program ffmpeg is found.

See also set_ffmpeg.
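The availability check resolves the configured name against the executable search path. A sketch of how such a probe can be written, assuming resolution via `shutil.which` (the library's `_search_for_ffmpeg_path` is internal and may differ in detail; the `ffmpeg_available` name is hypothetical):

```python
import shutil

def ffmpeg_available(name_or_path='ffmpeg') -> bool:
  # Resolve the program name against PATH (or accept an explicit filepath);
  # availability means the resolution succeeded.
  return shutil.which(str(name_or_path)) is not None

found = ffmpeg_available()
```

Guarding video calls with such a check lets a notebook degrade gracefully on systems where `ffmpeg` is not installed.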