mediapy

mediapy: Read/write/show images and videos in an IPython/Jupyter notebook.

[GitHub source]   [API docs]   [PyPI package]   [Colab example]

See the example notebook, or better yet, open it in Colab.

Image examples

Display an image (2D or 3D numpy array):

checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
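
To see what the `np.kron` expression builds, here is a minimal NumPy-free sketch. The helper `kron_with_ones` is hypothetical and only mimics a Kronecker product with an all-ones block; it also yields integers, whereas `np.kron` with `np.ones` yields floats.

```python
def kron_with_ones(pattern, block):
  """Expands each cell of `pattern` into a `block` x `block` tile."""
  return [
      [value for value in row for _ in range(block)]
      for row in pattern
      for _ in range(block)
  ]


# 32 rows of 32 alternating cells, as in the nested-list expression above.
pattern = [[0, 1] * 16, [1, 0] * 16] * 16
checkerboard = kron_with_ones(pattern, 4)

print(len(checkerboard), len(checkerboard[0]))  # 128 128
```

Each cell becomes a 4x4 tile, so the displayed image is 128x128 pixels.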

Read and display an image (either local or from the Web):

IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))

Read and display an image from a local file:

!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))

Show titled images side-by-side:

images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
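
The explicit `vmin` and `vmax` matter here. The sketch below reproduces, for a single scalar, the affine remapping that `to_rgb` documents in the source. That `show_images` forwards its bounds to the same remapping is an assumption, and `remap` is a hypothetical name.

```python
import sys


def remap(value, vmin, vmax):
  """Affinely maps `value` from [vmin, vmax] to approximately [0.0, 1.0]."""
  eps = sys.float_info.epsilon  # Avoids division by zero when vmin == vmax.
  return (value - vmin) / (vmax - vmin + eps)


darkened_white = 0.7  # Brightest value in `checkerboard * 0.7`.

# Bounds taken from the image itself: 0.7 is stretched to (almost) 1.0.
auto = remap(darkened_white, vmin=0.0, vmax=0.7)
# Shared explicit bounds: 0.7 stays (almost) 0.7.
fixed = remap(darkened_white, vmin=0.0, vmax=1.0)

print(round(auto, 6), round(fixed, 6))  # 1.0 0.7
```

Under that assumption, omitting the bounds would render the darkened checkerboard identically to the original, because each image would be normalized by its own minimum and maximum.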

Compare two images using an interactive slider:

compare_images([checkerboard, np.random.rand(128, 128, 3)])
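
The source defines an HTML template, `_IMAGE_COMPARISON_HTML`, with two base64 placeholders. A rough sketch of how such a template can be filled follows; the byte strings are placeholders rather than real PNG data, the template is abbreviated, and whether `compare_images` does exactly this is an assumption.

```python
import base64

# Placeholder bytes; a real call would pass PNG-encoded image data.
png_1 = b'\x89PNG first image bytes'
png_2 = b'\x89PNG second image bytes'

# Abbreviated version of the slider markup from the source template.
template = (
    '<img-comparison-slider>\n'
    '  <img slot="first" src="data:image/png;base64,{b64_1}" />\n'
    '  <img slot="second" src="data:image/png;base64,{b64_2}" />\n'
    '</img-comparison-slider>\n'
)

html = template.format(
    b64_1=base64.b64encode(png_1).decode(),
    b64_2=base64.b64encode(png_2).decode(),
)
print(html)
```

Embedding the images as data URIs keeps the output cell self-contained, with no image files to serve.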

Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):

video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)

Show the video frames side-by-side:

show_images(video, columns=6, border=True, height=64)
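
With 10 frames and `columns=6`, the grid presumably wraps into a row of six and a row of four. The source's `_chunked` helper shows one way to make such a split; the sketch below is a standalone copy of that idiom. The name `chunked` and the link to the `columns` argument are assumptions.

```python
import functools
import itertools


def chunked(iterable, n=None):
  """Returns an iterator of tuples holding at most `n` consecutive elements."""

  def take(n, iterator):
    return tuple(itertools.islice(iterator, n))

  # iter(callable, sentinel) keeps calling `take` until it returns ().
  return iter(functools.partial(take, n, iter(iterable)), ())


rows = list(chunked(range(10), 6))
print(rows)  # [(0, 1, 2, 3, 4, 5), (6, 7, 8, 9)]
```

When `n` is None, `itertools.islice` takes everything, so all elements land in a single tuple.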

Show the frames with their indices:

show_images({f'{i}': image for i, image in enumerate(video)}, width=32)

Read and display a video (either local or from the Web):

VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))

Create and display a looping two-frame GIF video:

image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')

Darken a video frame-by-frame:

output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
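
For a single 8-bit pixel, the darkening step can be sketched as below. The scalar helpers are hypothetical stand-ins for `to_float01` and `to_uint8`, following the scaling described for `to_type` in the source. That the writer converts floating-point frames back to `uint8` in this way is an assumption.

```python
def scalar_to_float01(value, max_value=255):
  """Maps an unsigned integer in [0, max_value] to a float in [0.0, 1.0]."""
  return value / max_value


def scalar_to_uint8(value):
  """Clamps a float to [0.0, 1.0], scales to [0, 255], rounds to nearest."""
  clamped = min(max(value, 0.0), 1.0)
  return int(clamped * 255 + 0.5)


def darken(pixel):
  """Halves the brightness of one 8-bit pixel value."""
  return scalar_to_uint8(scalar_to_float01(pixel) * 0.5)


print([darken(p) for p in (0, 200, 255)])  # [0, 100, 128]
```

Working in the [0.0, 1.0] range means the factor 0.5 has the same meaning whatever the integer bit depth of the input.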
# Copyright 2025 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy)  
[**[API docs]**](https://google.github.io/mediapy/)  
[**[PyPI package]**](https://pypi.org/project/mediapy/)  
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.5'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
from typing import Any
import urllib.request
import warnings

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

# Only `typing` and `typing.Any` are imported above, so the names below are
# qualified with `typing.`.
if typing.TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = npt.NDArray[Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = typing.TypeVar('_ArrayLike')
  _DTypeLike = typing.TypeVar('_DTypeLike')
  _NDArray = typing.TypeVar('_NDArray')
  _DType = typing.TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 10**10  # Unlimited seems to be OK now.
_T = typing.TypeVar('_T')
_Path = typing.Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int | None, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired from https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    headers = {'User-Agent': 'Chrome'}
    request = urllib.request.Request(path_or_url, headers=headers)
    with urllib.request.urlopen(request) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred from `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.17.0 (which adds the `where` parameter):
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = typing.cast(_NDArray, rgb_from_scalar(a))
 852  # If there is a fully opaque alpha channel, remove it.
 853  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3]) == 1.0):
 854    a = a[..., :3]
 855  return a
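

# Illustrative sketch, not part of the mediapy API: the affine remapping step of
# `to_rgb`, written for a flat list of Python floats.  The helper name
# `_example_remap_to_unit_range` is hypothetical.  As in `to_rgb`, the bounds
# default to the smallest and largest finite values, and a machine epsilon in
# the denominator avoids division by zero when all values are equal.  The
# sketch assumes that `values` contains at least one finite value.
import math  # Repeated here so that the sketch runs on its own.
import sys


def _example_remap_to_unit_range(values, vmin=None, vmax=None):
  """Returns `values` affinely remapped from [vmin, vmax] to [0.0, 1.0]."""
  finite = [v for v in values if math.isfinite(v)]
  vmin = min(finite) if vmin is None else vmin
  vmax = max(finite) if vmax is None else vmax
  scale = vmax - vmin + sys.float_info.epsilon
  return [(v - vmin) / scale for v in values]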
 856
 857
 858def compress_image(
 859    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
 860) -> bytes:
 861  """Returns a buffer containing a compressed image.
 862
 863  Args:
 864    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
 865    fmt: Desired compression encoding, e.g. 'png'.
 866    **kwargs: Options for `PIL.Image.save()`, e.g. `optimize=True` for greater
 867      compression.
 868  """
 869  image = _as_valid_media_array(image)
 870  with io.BytesIO() as output:
 871    _pil_image(image).save(output, format=fmt, **kwargs)
 872    return output.getvalue()
 873
 874
 875def decompress_image(
 876    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
 877) -> _NDArray:
 878  """Returns an image from a compressed data buffer.
 879
 880  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
 881  or 4 channels and `uint16` images with a single channel.
 882
 883  Args:
 884    data: Buffer containing compressed image.
 885    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
 886      is inferred automatically.
 887    apply_exif_transpose: If True, rotate image according to EXIF orientation.
 888  """
 889  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
 890  if apply_exif_transpose:
 891    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
 892    assert tmp_image
 893    pil_image = tmp_image
 894  if dtype is None:
 895    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
 896  return np.array(pil_image, dtype=dtype)
 897
 898
 899def html_from_compressed_image(
 900    data: bytes,
 901    width: int,
 902    height: int,
 903    *,
 904    title: str | None = None,
 905    border: bool | str = False,
 906    pixelated: bool = True,
 907    fmt: str = 'png',
 908) -> str:
 909  """Returns an HTML string with an image tag containing encoded data.
 910
 911  Args:
 912    data: Compressed image bytes.
 913    width: Width of HTML image in pixels.
 914    height: Height of HTML image in pixels.
 915    title: Optional text shown centered above image.
 916    border: If `bool`, whether to place a black boundary around the image, or if
 917      `str`, the boundary CSS style.
 918    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
 919    fmt: Compression encoding.
 920  """
 921  b64 = base64.b64encode(data).decode('utf-8')
 922  if isinstance(border, str):
 923    border = f'{border}; '
 924  elif border:
 925    border = 'border:1px solid black; '
 926  else:
 927    border = ''
 928  s_pixelated = 'pixelated' if pixelated else 'auto'
 929  s = (
 930      f'<img width="{width}" height="{height}"'
 931      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
 932      f' src="data:image/{fmt};base64,{b64}"/>'
 933  )
 934  if title is not None:
 935    s = f"""<div style="display:flex; align-items:left;">
 936      <div style="display:flex; flex-direction:column; align-items:center;">
 937      <div>{title}</div><div>{s}</div></div></div>"""
 938  return s
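

# Illustrative sketch, not part of the mediapy API: how compressed image bytes
# end up inline in an `<img>` tag.  The bytes are base64-encoded and placed in a
# `data:` URI, so the notebook needs no separate image file.  The helper name
# `_example_inline_img_tag` is hypothetical, and the styling attributes used by
# `html_from_compressed_image` are omitted.
import base64  # Repeated here so that the sketch runs on its own.


def _example_inline_img_tag(data, fmt='png'):
  """Returns a minimal HTML image tag that embeds `data` as a data URI."""
  b64 = base64.b64encode(data).decode('utf-8')
  return f'<img src="data:image/{fmt};base64,{b64}"/>'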
 939
 940
 941def _get_width_height(
 942    width: int | None, height: int | None, shape: tuple[int, int]
 943) -> tuple[int, int]:
 944  """Returns (width, height) given optional parameters and image shape."""
 945  assert len(shape) == 2, shape
 946  if width and height:
 947    return width, height
 948  if width and not height:
 949    return width, int(width * (shape[0] / shape[1]) + 0.5)
 950  if height and not width:
 951    return int(height * (shape[1] / shape[0]) + 0.5), height
 952  return shape[::-1]
 953
 954
 955def _ensure_mapped_to_rgb(
 956    image: _ArrayLike,
 957    *,
 958    vmin: float | None = None,
 959    vmax: float | None = None,
 960    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
 961) -> _NDArray:
 962  """Ensures the image is mapped to RGB."""
 963  image = _as_valid_media_array(image)
 964  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
 965    raise ValueError(
 966        f'Image with shape {image.shape} is neither a 2D array'
 967        ' nor a 3D array with 1, 3, or 4 channels.'
 968    )
 969  if image.ndim == 3 and image.shape[2] == 1:
 970    image = image[:, :, 0]
 971  if image.ndim == 2:
 972    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
 973  return image
 974
 975
 976def show_image(
 977    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
 978) -> str | None:
 979  """Displays an image in the notebook and optionally saves it to a file.
 980
 981  See `show_images`.
 982
 983  >>> show_image(np.random.rand(100, 100))
 984  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
 985  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
 986  >>> show_image(read_image('/tmp/image.png'))
 987  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
 988  >>> show_image(read_image(url))
 989
 990  Args:
 991    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
 992    title: Optional text shown centered above the image.
 993    **kwargs: See `show_images`.
 994
 995  Returns:
 996    The HTML string if `return_html` is True, else None.
 997  """
 998  return show_images([np.asarray(image)], [title], **kwargs)
 999
1000
1001def show_images(
1002    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
1003    titles: Iterable[str | None] | None = None,
1004    *,
1005    width: int | None = None,
1006    height: int | None = None,
1007    downsample: bool = True,
1008    columns: int | None = None,
1009    vmin: float | None = None,
1010    vmax: float | None = None,
1011    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1012    border: bool | str = False,
1013    ylabel: str = '',
1014    html_class: str = 'show_images',
1015    pixelated: bool | None = None,
1016    return_html: bool = False,
1017) -> str | None:
1018  """Displays a row of images in the IPython/Jupyter notebook.
1019
1020  If a directory has been specified using `set_show_save_dir`, also saves each
1021  titled image to a file in that directory based on its title.
1022
1023  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1024  >>> show_images([image1, image2])
1025  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
1026  >>> show_images([image1, image2] * 5, columns=4, border=True)
1027
1028  Args:
1029    images: Iterable of images, or dictionary of `{title: image}`.  Each image
1030      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
1031    titles: Optional strings shown above the corresponding images.
1032    width: Optional, overrides displayed width (in pixels).
1033    height: Optional, overrides displayed height (in pixels).
1034    downsample: If True, each image whose width or height is greater than the
1035      specified `width` or `height` is resampled to the display resolution. This
1036      improves antialiasing and reduces the size of the notebook.
1037    columns: Optional, maximum number of images per row.
1038    vmin: For single-channel image, explicit min value for display.
1039    vmax: For single-channel image, explicit max value for display.
1040    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1041      3D color.
1042    border: If `bool`, whether to place a black boundary around the image, or if
1043      `str`, the boundary CSS style.
1044    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
1045    html_class: CSS class name used in definition of HTML element.
1046    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
1047      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
1048      only on images for which `width` or `height` introduces magnification.
1049    return_html: If `True` return the raw HTML `str` instead of displaying.
1050
1051  Returns:
1052    The HTML string if `return_html` is True, else None.
1053  """
1054  if isinstance(images, Mapping):
1055    if titles is not None:
1056      raise ValueError('Cannot have images dictionary and titles parameter.')
1057    list_titles, list_images = list(images.keys()), list(images.values())
1058  else:
1059    list_images = list(images)
1060    list_titles = [None] * len(list_images) if titles is None else list(titles)
1061    if len(list_images) != len(list_titles):
1062      raise ValueError(
1063          'Number of images does not match number of titles'
1064          f' ({len(list_images)} vs {len(list_titles)}).'
1065      )
1066
1067  list_images = [
1068      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1069      for image in list_images
1070  ]
1071
1072  def maybe_downsample(image: _NDArray) -> _NDArray:
1073    shape = image.shape[0], image.shape[1]
1074    w, h = _get_width_height(width, height, shape)
1075    if w < shape[1] or h < shape[0]:
1076      image = resize_image(image, (h, w))
1077    return image
1078
1079  if downsample:
1080    list_images = [maybe_downsample(image) for image in list_images]
1081  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1082
1083  for title, png_data in zip(list_titles, png_datas):
1084    if title is not None and _config.show_save_dir:
1085      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
1086      with _open(path, mode='wb') as f:
1087        f.write(png_data)
1088
1089  def html_from_compressed_images() -> str:
1090    html_strings = []
1091    for image, title, png_data in zip(list_images, list_titles, png_datas):
1092      w, h = _get_width_height(width, height, image.shape[:2])
1093      magnified = h > image.shape[0] or w > image.shape[1]
1094      pixelated2 = pixelated if pixelated is not None else magnified
1095      html_strings.append(
1096          html_from_compressed_image(
1097              png_data, w, h, title=title, border=border, pixelated=pixelated2
1098          )
1099      )
1100    # Create single-row tables each with no more than 'columns' elements.
1101    table_strings = []
1102    for row_html_strings in _chunked(html_strings, columns):
1103      td = '<td style="padding:1px;">'
1104      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
1105      if ylabel:
1106        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
1107        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
1108      table_strings.append(
1109          f'<table class="{html_class}"'
1110          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
1111      )
1112    return ''.join(table_strings)
1113
1114  s = html_from_compressed_images()
1115  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
1116    warnings.warn('mediapy: subsampling images to reduce HTML size')
1117    list_images = [image[::2, ::2] for image in list_images]
1118    png_datas = [compress_image(to_uint8(image)) for image in list_images]
1119    s = html_from_compressed_images()
1120  if return_html:
1121    return s
1122  _display_html(s)
1123  return None
1124
1125
1126def compare_images(
1127    images: Iterable[_ArrayLike],
1128    *,
1129    vmin: float | None = None,
1130    vmax: float | None = None,
1131    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1132) -> None:
1133  """Compares two images using an interactive slider.
1134
1135  Displays an HTML slider component to interactively swipe between two images.
1136  The slider functionality requires that the web browser have Internet access.
1137  See additional info in `https://github.com/sneas/img-comparison-slider`.
1138
1139  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1140  >>> compare_images([image1, image2])
1141
1142  Args:
1143    images: Iterable of images.  Each image must be either a 2D array or a 3D
1144      array with 1, 3, or 4 channels.  There must be exactly two images.
1145    vmin: For single-channel image, explicit min value for display.
1146    vmax: For single-channel image, explicit max value for display.
1147    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1148      3D color.
1149  """
1150  list_images = [
1151      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1152      for image in images
1153  ]
1154  if len(list_images) != 2:
1155    raise ValueError('The number of images must be 2.')
1156  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1157  b64_1, b64_2 = [
1158      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
1159  ]
1160  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
1161  _display_html(s)
1162
1163
1164# ** Video I/O.
1165
1166
1167def _filename_suffix_from_codec(codec: str) -> str:
1168  if codec == 'gif':
1169    return '.gif'
1170  if codec == 'vp9':
1171    return '.webm'
1172
1173  return '.mp4'
1174
1175
1176def _get_ffmpeg_path() -> str:
1177  path = _search_for_ffmpeg_path()
1178  if not path:
1179    raise RuntimeError(
1180        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
1181        " perhaps install ffmpeg using 'apt install ffmpeg'."
1182    )
1183  return path
1184
1185
1186@typing.overload
1187def _run_ffmpeg(
1188    ffmpeg_args: Sequence[str],
1189    stdin: int | None = None,
1190    stdout: int | None = None,
1191    stderr: int | None = None,
1192    encoding: None = None,  # No encoding -> bytes
1193    allowed_input_files: Sequence[str] | None = None,
1194    allowed_output_files: Sequence[str] | None = None,
1195) -> subprocess.Popen[bytes]:
1196  ...
1197
1198
1199@typing.overload
1200def _run_ffmpeg(
1201    ffmpeg_args: Sequence[str],
1202    stdin: int | None = None,
1203    stdout: int | None = None,
1204    stderr: int | None = None,
1205    encoding: str = ...,  # Encoding -> str
1206    allowed_input_files: Sequence[str] | None = None,
1207    allowed_output_files: Sequence[str] | None = None,
1208) -> subprocess.Popen[str]:
1209  ...
1210
1211
1212def _run_ffmpeg(
1213    ffmpeg_args: Sequence[str],
1214    stdin: int | None = None,
1215    stdout: int | None = None,
1216    stderr: int | None = None,
1217    encoding: str | None = None,
1218    allowed_input_files: Sequence[str] | None = None,
1219    allowed_output_files: Sequence[str] | None = None,
1220) -> subprocess.Popen[bytes] | subprocess.Popen[str]:
1221  """Runs ffmpeg with the given args.
1222
1223  Args:
1224    ffmpeg_args: The args to pass to ffmpeg.
1225    stdin: Same as in `subprocess.Popen`.
1226    stdout: Same as in `subprocess.Popen`.
1227    stderr: Same as in `subprocess.Popen`.
1228    encoding: Same as in `subprocess.Popen`.
1229    allowed_input_files: The input files to allow for ffmpeg.
1230    allowed_output_files: The output files to allow for ffmpeg.
1231
1232  Returns:
1233    The subprocess.Popen object with running ffmpeg process.
1234  """
1235  argv = []
1236  env: Any = {}
1237  ffmpeg_path = _get_ffmpeg_path()
1238
1239  # Allowed input and output files are not supported in open source.
1240  del allowed_input_files
1241  del allowed_output_files
1242
1243  argv.append(ffmpeg_path)
1244  argv.extend(ffmpeg_args)
1245
1246  return subprocess.Popen(
1247      argv,
1248      stdin=stdin,
1249      stdout=stdout,
1250      stderr=stderr,
1251      encoding=encoding,
1252      env=env,
1253  )
1254
1255
1256def video_is_available() -> bool:
1257  """Returns True if the program `ffmpeg` is found.
1258
1259  See also `set_ffmpeg`.
1260  """
1261  return _search_for_ffmpeg_path() is not None
1262
1263
1264class VideoMetadata(NamedTuple):
1265  """Represents the data stored in a video container header.
1266
1267  Attributes:
1268    num_images: Number of frames that is expected from the video stream.  This
1269      is estimated from the framerate and the duration stored in the video
1270      header, so it might be inexact.  We set the value to -1 if the number
1271      of frames is not found in the header.
1272    shape: The dimensions (height, width) of each video frame.
1273    fps: The framerate in frames per second.
1274    bps: The estimated bitrate of the video stream in bits per second, retrieved
1275      from the video header.
1276  """
1277
1278  num_images: int
1279  shape: tuple[int, int]
1280  fps: float
1281  bps: int | None
1282
1283
1284def _get_video_metadata(path: _Path) -> VideoMetadata:
1285  """Returns attributes of video stored in the specified local file."""
1286  if not pathlib.Path(path).is_file():
1287    raise RuntimeError(f"Video file '{path}' is not found.")
1288
1289  command = [
1290      '-nostdin',
1291      '-i',
1292      str(path),
1293      '-acodec',
1294      'copy',
1295      # Necessary to get "frame= *(\d+)" using newer ffmpeg versions.
1296      # Previously, was `'-vcodec', 'copy'`
1297      '-vf',
1298      'select=1',
1299      '-vsync',
1300      '0',
1301      '-f',
1302      'null',
1303      '-',
1304  ]
1305  with _run_ffmpeg(
1306      command,
1307      allowed_input_files=[str(path)],
1308      stderr=subprocess.PIPE,
1309      encoding='utf-8',
1310  ) as proc:
1311    _, err = proc.communicate()
1312  bps = fps = num_images = width = height = rotation = None
1313  before_output_info = True
1314  for line in err.split('\n'):
1315    if line.startswith('Output '):
1316      before_output_info = False
1317    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
1318      bps = round(float(match.group(1)) * 1000)
1319    if matches := re.findall(r'frame= *(\d+) ', line):
1320      num_images = int(matches[-1])
1321    if 'Stream #0:' in line and ': Video:' in line and before_output_info:
1322      if not (match := re.search(r', (\d+)x(\d+)', line)):
1323        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
1324      width, height = int(match.group(1)), int(match.group(2))
1325      if match := re.search(r', ([\d.]+) fps', line):
1326        fps = float(match.group(1))
1327      elif str(path).endswith('.gif'):
1328        # Some GIF files lack a framerate attribute; use a reasonable default.
1329        fps = 10
1330      else:
1331        raise RuntimeError(f'Unable to parse video framerate in line {line}')
1332    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
1333      rotation = int(match.group(1))
1334    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
1335      rotation = int(match.group(1))
1336  if not num_images:
1337    num_images = -1
1338  if not width:
1339    raise RuntimeError(f'Unable to parse video header: {err}')
1340  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
1341  if rotation in (90, 270, -90, -270):
1342    width, height = height, width
1343  assert height is not None and width is not None
1344  shape = height, width
1345  assert fps is not None
1346  return VideoMetadata(num_images, shape, fps, bps)
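

# Illustrative sketch, not part of the mediapy API: applying the same regular
# expressions as `_get_video_metadata` to a single line of text.  The helper
# name `_example_parse_stream_line` is hypothetical.  `_example_sample_line` is
# hand-written in the style of an ffmpeg stream summary; it is an assumption
# for illustration, not captured program output.
import re  # Repeated here so that the sketch runs on its own.


def _example_parse_stream_line(line):
  """Returns ((height, width), fps); fps is None if the line lacks it."""
  if not (match := re.search(r', (\d+)x(\d+)', line)):
    raise ValueError(f'Unable to parse video dimensions in line {line}')
  width, height = int(match.group(1)), int(match.group(2))
  match = re.search(r', ([\d.]+) fps', line)
  fps = float(match.group(1)) if match else None
  return (height, width), fps


_example_sample_line = (
    '  Stream #0:0: Video: h264, yuv420p, 1280x720, 2000 kb/s, 29.97 fps'
)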
1347
1348
1349class _VideoIO:
1350  """Base class for `VideoReader` and `VideoWriter`."""
1351
1352  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
1353    """Returns ffmpeg pix_fmt given data type and image format."""
1354    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
1355    return {
1356        np.uint8: {
1357            'rgb': 'rgb24',
1358            'yuv': 'yuv444p',
1359            'gray': 'gray',
1360        },
1361        np.uint16: {
1362            'rgb': 'rgb48' + native_endian_suffix,
1363            'yuv': 'yuv444p16' + native_endian_suffix,
1364            'gray': 'gray16' + native_endian_suffix,
1365        },
1366    }[dtype.type][image_format]
1367
1368
1369class VideoReader(_VideoIO):
1370  """Context to read a compressed video as an iterable over its images.
1371
1372  >>> with VideoReader('/tmp/river.mp4') as reader:
1373  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
1374  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
1375  ...   for image in reader:
1376  ...     print(image.shape)
1377
1378  >>> with VideoReader('/tmp/river.mp4') as reader:
1379  ...   video = np.array(tuple(reader))
1380
1381  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1382  >>> with VideoReader(url) as reader:
1383  ...   show_video(reader)
1384
1385  Attributes:
1386    path_or_url: Location of input video.
1387    output_format: Format of output images (default 'rgb').  If 'rgb', each
1388      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
1389      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
1390      image has shape=(height, width).
1391    dtype: Data type for output images.  The default is `np.uint8`.  Use of
1392      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
1393    metadata: Object storing the information retrieved from the video header.
1394      Its attributes are copied as attributes in this class.
1395    num_images: Number of frames that is expected from the video stream.  This
1396      is estimated from the framerate and the duration stored in the video
1397      header, so it might be inexact.
1398    shape: The dimensions (height, width) of each video frame.
1399    fps: The framerate in frames per second.
1400    bps: The estimated bitrate of the video stream in bits per second, retrieved
1401      from the video header.
1402    stream_index: The stream index to read from. The default is 0.
1403  """
1404
1405  path_or_url: _Path
1406  output_format: str
1407  dtype: _DType
1408  metadata: VideoMetadata
1409  num_images: int
1410  shape: tuple[int, int]
1411  fps: float
1412  bps: int | None
1413  stream_index: int
1414  _num_bytes_per_image: int
1415
1416  def __init__(
1417      self,
1418      path_or_url: _Path,
1419      *,
1420      stream_index: int = 0,
1421      output_format: str = 'rgb',
1422      dtype: _DTypeLike = np.uint8,
1423  ):
1424    if output_format not in {'rgb', 'yuv', 'gray'}:
1425      raise ValueError(
1426          f'Output format {output_format} is not rgb, yuv, or gray.'
1427      )
1428    self.path_or_url = path_or_url
1429    self.output_format = output_format
1430    self.stream_index = stream_index
1431    self.dtype = np.dtype(dtype)
1432    if self.dtype.type not in (np.uint8, np.uint16):
1433      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1434    self._read_via_local_file: Any = None
1435    self._popen: subprocess.Popen[bytes] | None = None
1436    self._proc: subprocess.Popen[bytes] | None = None
1437
1438  def __enter__(self) -> 'VideoReader':
1439    try:
1440      self._read_via_local_file = _read_via_local_file(self.path_or_url)
1441      # pylint: disable-next=no-member
1442      tmp_name = self._read_via_local_file.__enter__()
1443
1444      self.metadata = _get_video_metadata(tmp_name)
1445      self.num_images, self.shape, self.fps, self.bps = self.metadata
1446      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
1447      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
1448      bytes_per_channel = self.dtype.itemsize
1449      self._num_bytes_per_image = (
1450          math.prod(self.shape) * num_channels * bytes_per_channel
1451      )
1452
1453      command = [
1454          '-v',
1455          'panic',
1456          '-nostdin',
1457          '-i',
1458          tmp_name,
1459          '-vcodec',
1460          'rawvideo',
1461          '-f',
1462          'image2pipe',
1463          '-map',
1464          f'0:v:{self.stream_index}',
1465          '-pix_fmt',
1466          pix_fmt,
1467          '-vsync',
1468          'vfr',
1469          '-',
1470      ]
1471      self._popen = _run_ffmpeg(
1472          command,
1473          stdout=subprocess.PIPE,
1474          stderr=subprocess.PIPE,
1475          allowed_input_files=[tmp_name],
1476      )
1477      self._proc = self._popen.__enter__()
1478    except Exception:
1479      self.__exit__(None, None, None)
1480      raise
1481    return self
1482
1483  def __exit__(self, *_: Any) -> None:
1484    self.close()
1485
1486  def read(self) -> _NDArray | None:
1487    """Reads a video image frame (or None if at end of file).
1488
1489    Returns:
1490      A numpy array in the format specified by `output_format`, i.e., a 3D
1491      array with 3 color channels, except for format 'gray' which is 2D.
1492    """
1493    assert self._proc, 'Error: reading from an already closed context.'
1494    stdout = self._proc.stdout
1495    assert stdout is not None
1496    data = stdout.read(self._num_bytes_per_image)
1497    if not data:  # Due to either end-of-file or subprocess error.
1498      self.close()  # Raises exception if subprocess had error.
1499      return None  # To indicate end-of-file.
1500    assert len(data) == self._num_bytes_per_image
1501    image = np.frombuffer(data, dtype=self.dtype)
1502    if self.output_format == 'rgb':
1503      image = image.reshape(*self.shape, 3)
1504    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
1505      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
1506    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
1507      image = image.reshape(*self.shape)
1508    else:
1509      raise AssertionError
1510    return image
1511
1512  def __iter__(self) -> Iterator[_NDArray]:
1513    while True:
1514      image = self.read()
1515      if image is None:
1516        return
1517      yield image
1518
1519  def close(self) -> None:
1520    """Terminates video reader.  (Called automatically at end of context.)"""
1521    if self._popen:
1522      self._popen.__exit__(None, None, None)
1523      self._popen = None
1524      self._proc = None
1525    if self._read_via_local_file:
1526      # pylint: disable-next=no-member
1527      self._read_via_local_file.__exit__(None, None, None)
1528      self._read_via_local_file = None
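

# Illustrative sketch, not part of the mediapy API: the fixed-size framing used
# by `VideoReader.read`.  A raw video stream has no frame delimiters, so each
# frame is recovered by reading exactly
# height * width * num_channels * itemsize bytes.  The helper name
# `_example_iter_raw_frames` is hypothetical; any binary file-like object
# (e.g., an `io.BytesIO`) can stand in for the ffmpeg stdout pipe.
def _example_iter_raw_frames(stream, shape, num_channels=3, itemsize=1):
  """Yields successive raw frames (as bytes) read from `stream`."""
  num_bytes_per_image = shape[0] * shape[1] * num_channels * itemsize
  while True:
    data = stream.read(num_bytes_per_image)
    if not data:  # End of stream.
      return
    if len(data) != num_bytes_per_image:
      raise ValueError('Stream ended in the middle of a frame.')
    yield data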
1529
1530
1531class VideoWriter(_VideoIO):
1532  """Context to write a compressed video.
1533
1534  >>> shape = 480, 640
1535  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
1536  ...   for image in moving_circle(shape, num_images=60):
1537  ...     writer.add_image(image)
1538  >>> show_video(read_video('/tmp/v.mp4'))
1539
1540
1541  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
1542  If none are specified and `ffmpeg_args` is empty, `qp` gets a default value.
1543  See https://slhck.info/video/2017/03/01/rate-control.html
1544
1545  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
1546  ignored.
1547
1548  Attributes:
1549    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
1550      format.  The suffix must be '.gif' if the codec is 'gif'.
1551    shape: 2D spatial dimensions (height, width) of video image frames.  The
1552      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
1553      'yuv420p' or 'yuv420p10le').
1554    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
1555      'hevc', 'vp9', or 'gif').
1556    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
1557      used if not specified as explicit parameters.
1558    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
1559    bps: Requested average bits-per-second bitrate (default None).
1560    qp: Quantization parameter for video compression quality (default None).
1561    crf: Constant rate factor for video compression quality (default None).
1562    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
1563      introduce I-frames, or '-bf 0' to omit B-frames.
1564    input_format: Format of input images (default 'rgb').  If 'rgb', each image
1565      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
1566      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
1567      shape=(height, width).
1568    dtype: Expected data type for input images (any float input images are
1569      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
1570      necessary when encoding >8 bits/channel.
1571    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
1572      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
1573      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
1574      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
1575  """
1576
1577  def __init__(
1578      self,
1579      path: _Path,
1580      shape: tuple[int, int],
1581      *,
1582      codec: str = 'h264',
1583      metadata: VideoMetadata | None = None,
1584      fps: float | None = None,
1585      bps: int | None = None,
1586      qp: int | None = None,
1587      crf: float | None = None,
1588      ffmpeg_args: str | Sequence[str] = '',
1589      input_format: str = 'rgb',
1590      dtype: _DTypeLike = np.uint8,
1591      encoded_format: str | None = None,
1592  ) -> None:
1593    _check_2d_shape(shape)
1594    if fps is None and metadata:
1595      fps = metadata.fps
1596    if fps is None:
1597      fps = 25.0 if codec == 'gif' else 60.0
1598    if fps <= 0.0:
1599      raise ValueError(f'Frames-per-second value {fps} is invalid.')
1600    if bps is None and metadata:
1601      bps = metadata.bps
1602    bps = int(bps) if bps is not None else None
1603    if bps is not None and bps <= 0:
1604      raise ValueError(f'Bitrate value {bps} is invalid.')
1605    if qp is not None and (not isinstance(qp, int) or qp < 0):
1606      raise ValueError(
1607          f'Quantization parameter {qp} is invalid; it must be a'
1608          ' non-negative integer.'
1609      )
1610    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
1611    if num_rate_specifications > 1:
1612      raise ValueError(
1613          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
1614      )
1615    ffmpeg_args = (
1616        shlex.split(ffmpeg_args)
1617        if isinstance(ffmpeg_args, str)
1618        else list(ffmpeg_args)
1619    )
1620    if input_format not in {'rgb', 'yuv', 'gray'}:
1621      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
1622    dtype = np.dtype(dtype)
1623    if dtype.type not in (np.uint8, np.uint16):
1624      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1625    self.path = pathlib.Path(path)
1626    self.shape = shape
1627    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
1628    if encoded_format is None:
1629      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
1630    if not all_dimensions_are_even and encoded_format.startswith(
1631        ('yuv42', 'yuvj42')
1632    ):
1633      raise ValueError(
1634          f'With encoded_format {encoded_format}, video dimensions must be'
1635          f' even, but shape is {shape}.'
1636      )
1637    self.fps = fps
1638    self.codec = codec
1639    self.bps = bps
1640    self.qp = qp
1641    self.crf = crf
1642    self.ffmpeg_args = ffmpeg_args
1643    self.input_format = input_format
1644    self.dtype = dtype
1645    self.encoded_format = encoded_format
1646    if num_rate_specifications == 0 and not ffmpeg_args:
1647      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
1648    self._bitrate_args = (
1649        (['-vb', f'{bps}'] if bps is not None else [])
1650        + (['-qp', f'{qp}'] if qp is not None else [])
1651        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
1652    )
1653    if self.codec == 'gif':
1654      if self.path.suffix != '.gif':
1655        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
1656      self.encoded_format = 'pal8'
1657      self._bitrate_args = []
1658      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
1659      # Less common (and likely less useful) is a per-frame color palette:
1660      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
1661      #                 '[s1][p]paletteuse=new=1')
1662      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
1663    self._write_via_local_file: Any = None
1664    self._popen: subprocess.Popen[bytes] | None = None
1665    self._proc: subprocess.Popen[bytes] | None = None
1666
1667  def __enter__(self) -> 'VideoWriter':
1668    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
1669    try:
1670      self._write_via_local_file = _write_via_local_file(self.path)
1671      # pylint: disable-next=no-member
1672      tmp_name = self._write_via_local_file.__enter__()
1673
1674      # Writing to stdout using ('-f', 'mp4', '-') would require
1675      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
1676      height, width = self.shape
1677      command = (
1678          [
1679              '-v',
1680              'error',
1681              '-f',
1682              'rawvideo',
1683              '-vcodec',
1684              'rawvideo',
1685              '-pix_fmt',
1686              input_pix_fmt,
1687              '-s',
1688              f'{width}x{height}',
1689              '-r',
1690              f'{self.fps}',
1691              '-i',
1692              '-',
1693              '-an',
1694              '-vcodec',
1695              self.codec,
1696              '-pix_fmt',
1697              self.encoded_format,
1698          ]
1699          + self._bitrate_args
1700          + self.ffmpeg_args
1701          + ['-y', tmp_name]
1702      )
1703      self._popen = _run_ffmpeg(
1704          command,
1705          stdin=subprocess.PIPE,
1706          stderr=subprocess.PIPE,
1707          allowed_output_files=[tmp_name],
1708      )
1709      self._proc = self._popen.__enter__()
1710    except Exception:
1711      self.__exit__(None, None, None)
1712      raise
1713    return self
1714
1715  def __exit__(self, *_: Any) -> None:
1716    self.close()
1717
1718  def add_image(self, image: _NDArray) -> None:
1719    """Writes a video frame.
1720
1721    Args:
1722      image: Array whose dtype and first two dimensions must match the `dtype`
1723        and `shape` specified in `VideoWriter` initialization.  If
1724        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
1725        input_format, the image may be either 2D (interpreted as grayscale) or
1726        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
1727        must be 3D with three (Y, U, V) channels.
1728
1729    Raises:
1730      RuntimeError: If there is an error writing to the output file.
1731    """
1732    assert self._proc, 'Error: writing to an already closed context.'
1733    if issubclass(image.dtype.type, (np.floating, np.bool_)):
1734      image = to_type(image, self.dtype)
1735    if image.dtype != self.dtype:
1736      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
1737    if self.input_format == 'gray':
1738      if image.ndim != 2:
1739        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
1740    else:
1741      if image.ndim == 2 and self.input_format == 'rgb':
1742        image = np.dstack((image, image, image))
1743      if not (image.ndim == 3 and image.shape[2] == 3):
1744        raise ValueError(f'Image dimensions {image.shape} are invalid.')
1745    if image.shape[:2] != self.shape:
1746      raise ValueError(
1747          f'Image dimensions {image.shape[:2]} do not match'
1748          f' those of the initialized video {self.shape}.'
1749      )
1750    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1751      image = np.moveaxis(image, 2, 0)
1752    data = image.tobytes()
1753    stdin = self._proc.stdin
1754    assert stdin is not None
1755    if stdin.write(data) != len(data):
1756      self._proc.wait()
1757      stderr = self._proc.stderr
1758      assert stderr is not None
1759      s = stderr.read().decode('utf-8')
1760      raise RuntimeError(f"Error writing '{self.path}': {s}")
1761
1762  def close(self) -> None:
1763    """Finishes writing the video.  (Called automatically at end of context.)"""
1764    if self._popen:
1765      assert self._proc, 'Error: closing an already closed context.'
1766      stdin = self._proc.stdin
1767      assert stdin is not None
1768      stdin.close()
1769      if self._proc.wait():
1770        stderr = self._proc.stderr
1771        assert stderr is not None
1772        s = stderr.read().decode('utf-8')
1773        raise RuntimeError(f"Error writing '{self.path}': {s}")
1774      self._popen.__exit__(None, None, None)
1775      self._popen = None
1776      self._proc = None
1777    if self._write_via_local_file:
1778      # pylint: disable-next=no-member
1779      self._write_via_local_file.__exit__(None, None, None)
1780      self._write_via_local_file = None
1781
1782
1783class _VideoArray(npt.NDArray[Any]):
1784  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""
1785
1786  metadata: VideoMetadata | None
1787
1788  def __new__(
1789      cls: Type['_VideoArray'],
1790      input_array: _NDArray,
1791      metadata: VideoMetadata | None = None,
1792  ) -> '_VideoArray':
1793    obj: _VideoArray = np.asarray(input_array).view(cls)
1794    obj.metadata = metadata
1795    return obj
1796
1797  def __array_finalize__(self, obj: Any) -> None:
1798    if obj is None:
1799      return
1800    self.metadata = getattr(obj, 'metadata', None)
1801
1802
1803def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
1804  """Returns an array containing all images read from a compressed video file.
1805
1806  >>> video = read_video('/tmp/river.mp4')
1807  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
1808  >>> show_video(video)
1809
1810  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1811  >>> show_video(read_video(url))
1812
1813  Args:
1814    path_or_url: Input video file.
1815    **kwargs: Additional parameters for `VideoReader`.
1816
1817  Returns:
1818    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
1819    array if `output_format` is specified as 'gray'.  The returned array has an
1820    attribute `metadata` containing `VideoMetadata` information.  This enables
1821    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
1822    metadata attribute is lost in most subsequent `numpy` operations.
1823  """
1824  with VideoReader(path_or_url, **kwargs) as reader:
1825    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
1826
1827
1828def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
1829  """Writes images to a compressed video file.
1830
1831  >>> video = moving_circle((480, 640), num_images=60)
1832  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
1833  >>> show_video(read_video('/tmp/v.mp4'))
1834
1835  Args:
1836    path: Output video file.
1837    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
1838      arrays.
1839    **kwargs: Additional parameters for `VideoWriter`.
1840  """
1841  first_image, images = _peek_first(images)
1842  shape = first_image.shape[0], first_image.shape[1]
1843  dtype = first_image.dtype
1844  if dtype == bool:
1845    dtype = np.dtype(np.uint8)
1846  elif np.issubdtype(dtype, np.floating):
1847    dtype = np.dtype(np.uint16)
1848  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
1849  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
1850    for image in images:
1851      writer.add_image(image)
1852
1853
1854def compress_video(
1855    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
1856) -> bytes:
1857  """Returns a buffer containing a compressed video.
1858
1859  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
1860  and mp4 otherwise.
1861
1862  >>> video = read_video('/tmp/river.mp4')
1863  >>> data = compress_video(video, bps=10_000_000)
1864  >>> print(len(data))
1865
1866  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
1867
1868  Args:
1869    images: Iterable over video frames.
1870    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
1871      'hevc', 'vp9', or 'gif').
1872    **kwargs: Additional parameters for `VideoWriter`.
1873
1874  Returns:
1875    A bytes buffer containing the compressed video.
1876  """
1877  suffix = _filename_suffix_from_codec(codec)
1878  with tempfile.TemporaryDirectory() as directory_name:
1879    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
1880    write_video(tmp_path, images, codec=codec, **kwargs)
1881    return tmp_path.read_bytes()
1882
1883
1884def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
1885  """Returns video images from an MP4-compressed data buffer."""
1886  with tempfile.TemporaryDirectory() as directory_name:
1887    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
1888    tmp_path.write_bytes(data)
1889    return read_video(tmp_path, **kwargs)
1890
1891
1892def html_from_compressed_video(
1893    data: bytes,
1894    width: int,
1895    height: int,
1896    *,
1897    title: str | None = None,
1898    border: bool | str = False,
1899    loop: bool = True,
1900    autoplay: bool = True,
1901) -> str:
1902  """Returns an HTML string with a video tag containing H264-encoded data.
1903
1904  Args:
1905    data: MP4-compressed video bytes.
1906    width: Width of HTML video in pixels.
1907    height: Height of HTML video in pixels.
1908    title: Optional text shown centered above the video.
1909    border: If `bool`, whether to place a black boundary around the image, or if
1910      `str`, the boundary CSS style.
1911    loop: If True, the playback repeats forever.
1912    autoplay: If True, video playback starts without having to click.
1913  """
1914  b64 = base64.b64encode(data).decode('utf-8')
1915  if isinstance(border, str):
1916    border = f'{border}; '
1917  elif border:
1918    border = 'border:1px solid black; '
1919  else:
1920    border = ''
1921  options = (
1922      f'controls width="{width}" height="{height}"'
1923      f' style="{border}object-fit:cover;"'
1924      f'{" loop" if loop else ""}'
1925      f'{" autoplay muted" if autoplay else ""}'
1926  )
1927  s = f"""<video {options}>
1928      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
1929      This browser does not support the video tag.
1930      </video>"""
1931  if title is not None:
1932    s = f"""<div style="display:flex; align-items:left;">
1933      <div style="display:flex; flex-direction:column; align-items:center;">
1934      <div>{title}</div><div>{s}</div></div></div>"""
1935  return s
1936
1937
1938def show_video(
1939    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
1940) -> str | None:
1941  """Displays a video in the IPython notebook and optionally saves it to a file.
1942
1943  See `show_videos`.
1944
1945  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
1946  >>> show_video(video, title='River video')
1947
1948  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
1949
1950  >>> show_video(read_video('/tmp/river.mp4'))
1951
1952  Args:
1953    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
1954      arrays).
1955    title: Optional text shown centered above the video.
1956    **kwargs: See `show_videos`.
1957
1958  Returns:
1959    html string if `return_html` is `True`.
1960  """
1961  return show_videos([images], [title], **kwargs)
1962
1963
1964def show_videos(
1965    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
1966    titles: Iterable[str | None] | None = None,
1967    *,
1968    width: int | None = None,
1969    height: int | None = None,
1970    downsample: bool = True,
1971    columns: int | None = None,
1972    fps: float | None = None,
1973    bps: int | None = None,
1974    qp: int | None = None,
1975    codec: str = 'h264',
1976    ylabel: str = '',
1977    html_class: str = 'show_videos',
1978    return_html: bool = False,
1979    **kwargs: Any,
1980) -> str | None:
1981  """Displays a row of videos in the IPython notebook.
1982
1983  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
1984  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
1985  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
1986  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
1987  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.
1988
1989  If a directory has been specified using `set_show_save_dir`, also saves each
1990  titled video to a file in that directory based on its title.
1991
1992  Args:
1993    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
1994      must be an iterable of images.  If a video object has a `metadata`
1995      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
1996    titles: Optional strings shown above the corresponding videos.
1997    width: Optional, overrides displayed width (in pixels).
1998    height: Optional, overrides displayed height (in pixels).
1999    downsample: If True, each video whose width or height is greater than the
2000      specified `width` or `height` is resampled to the display resolution. This
2001      improves antialiasing and reduces the size of the notebook.
2002    columns: Optional, maximum number of videos per row.
2003    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
2004    bps: Bits-per-second bitrate (default None).
2005    qp: Quantization parameter for video compression quality (default None).
2006    codec: Compression algorithm; must be either 'h264' or 'gif'.
2007    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
2008    html_class: CSS class name used in definition of HTML element.
2009    return_html: If `True` return the raw HTML `str` instead of displaying.
2010    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
2011      `html_from_compressed_video`.
2012
2013  Returns:
2014    html string if `return_html` is `True`.
2015  """
2016  if isinstance(videos, Mapping):
2017    if titles is not None:
2018      raise ValueError(
2019          'Cannot have both a video dictionary and a titles parameter.'
2020      )
2021    list_titles = list(videos.keys())
2022    list_videos = list(videos.values())
2023  else:
2024    list_videos = list(cast('Iterable[_NDArray]', videos))
2025    list_titles = [None] * len(list_videos) if titles is None else list(titles)
2026    if len(list_videos) != len(list_titles):
2027      raise ValueError(
2028          'Number of videos does not match number of titles'
2029          f' ({len(list_videos)} vs {len(list_titles)}).'
2030      )
2031  if codec not in {'h264', 'gif'}:
2032    raise ValueError(f'Codec {codec} is neither h264 nor gif.')
2033
2034  html_strings = []
2035  for video, title in zip(list_videos, list_titles):
2036    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
2037    first_image, video = _peek_first(video)
2038    w, h = _get_width_height(width, height, first_image.shape[:2])
2039    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
2040      # Not resize_video() because each image may have different depth and type.
2041      video = [resize_image(image, (h, w)) for image in video]
2042      first_image = video[0]
2043    data = compress_video(
2044        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
2045    )
2046    if title is not None and _config.show_save_dir:
2047      suffix = _filename_suffix_from_codec(codec)
2048      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
2049      with _open(path, mode='wb') as f:
2050        f.write(data)
2051    if codec == 'gif':
2052      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
2053      html_string = html_from_compressed_image(
2054          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
2055      )
2056    else:
2057      html_string = html_from_compressed_video(
2058          data, w, h, title=title, **kwargs
2059      )
2060    html_strings.append(html_string)
2061
2062  # Create single-row tables each with no more than 'columns' elements.
2063  table_strings = []
2064  for row_html_strings in _chunked(html_strings, columns):
2065    td = '<td style="padding:1px;">'
2066    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
2067    if ylabel:
2068      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
2069      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
2070    table_strings.append(
2071        f'<table class="{html_class}"'
2072        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
2073    )
2074  s = ''.join(table_strings)
2075  if return_html:
2076    return s
2077  _display_html(s)
2078  return None
2079
2080
2081# Local Variables:
2082# fill-column: 80
2083# End:
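As a rough illustration of the rate-control handling in the `VideoWriter` constructor listed above, the following standalone sketch reproduces how the ffmpeg bitrate arguments are selected. The function name `bitrate_args` and its signature are hypothetical and introduced only for this example. The logic follows the listing: at most one of `bps`, `qp`, or `crf` may be given, and when none is given and no extra ffmpeg arguments are supplied, a default `qp` of 20 or 28 is chosen from the frame area.

```python
import math


def bitrate_args(shape, bps=None, qp=None, crf=None, ffmpeg_args=()):
  """Hypothetical helper mirroring the rate-control logic of `VideoWriter`."""
  num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
  if num_rate_specifications > 1:
    raise ValueError(
        f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
    )
  if num_rate_specifications == 0 and not ffmpeg_args:
    # Frames of at most 640x480 pixels get the lower quantization parameter.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  return (
      (['-vb', f'{bps}'] if bps is not None else [])
      + (['-qp', f'{qp}'] if qp is not None else [])
      + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
  )


print(bitrate_args((480, 640)))           # Default qp for a small frame.
print(bitrate_args((1080, 1920)))         # Default qp for a large frame.
print(bitrate_args((480, 640), crf=23))   # An explicit crf disables the default.
```

Passing any nonempty `ffmpeg_args` suppresses the default `qp`, so that the caller's own ffmpeg rate settings are not overridden.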
def show_image( image: ArrayLike, *, title: str | None = None, **kwargs: Any) -> str | None:
def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)

Displays an image in the notebook and optionally saves it to a file.

See show_images.

>>> show_image(np.random.rand(100, 100))
>>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
>>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
>>> show_image(read_image('/tmp/image.png'))
>>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
>>> show_image(read_image(url))
Arguments:
  • image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
  • title: Optional text shown centered above the image.
  • **kwargs: See show_images.
Returns:

The HTML string if return_html is True; otherwise None.

def show_images( images: Iterable[ArrayLike] | Mapping[str, ArrayLike], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray', border: bool | str = False, ylabel: str = '', html_class: str = 'show_images', pixelated: bool | None = None, return_html: bool = False) -> str | None:
def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`.  Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape = image.shape[0], image.shape[1]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    warnings.warn('mediapy: subsampling images to reduce HTML size')
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None

Displays a row of images in the IPython/Jupyter notebook.

If a directory has been specified using set_show_save_dir, also saves each titled image to a file in that directory based on its title.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> show_images([image1, image2])
>>> show_images({'random image': image1, 'color ramp': image2}, height=128)
>>> show_images([image1, image2] * 5, columns=4, border=True)
Arguments:
  • images: Iterable of images, or dictionary of {title: image}. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels.
  • titles: Optional strings shown above the corresponding images.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each image whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of images per row.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if False, sets 'image-rendering: auto'; if None, uses pixelated rendering only on images for which width or height introduces magnification.
  • return_html: If True return the raw HTML str instead of displaying.
Returns:

The HTML string if return_html is True; otherwise None.
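
The title handling and row layout described above can be sketched without mediapy. In this minimal example, `normalize_titles` and `chunked` are hypothetical stand-ins: the library's private `_chunked` helper is not shown here, so its exact behavior is an assumption. Plain strings stand in for image arrays.

```python
from collections.abc import Mapping


def normalize_titles(images, titles=None):
  """Returns parallel lists (images, titles), following `show_images`."""
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    return list(images.values()), list(images.keys())
  list_images = list(images)
  list_titles = [None] * len(list_images) if titles is None else list(titles)
  if len(list_images) != len(list_titles):
    raise ValueError(
        'Number of images does not match number of titles'
        f' ({len(list_images)} vs {len(list_titles)}).'
    )
  return list_images, list_titles


def chunked(items, n=None):
  """Hypothetical stand-in: yields rows of at most `n` items (all if None)."""
  items = list(items)
  n = n or max(len(items), 1)
  for i in range(0, len(items), n):
    yield items[i : i + n]


images, titles = normalize_titles({'a': 'img_a', 'b': 'img_b', 'c': 'img_c'})
rows = list(chunked(images, 2))  # With columns=2, three images span two rows.
print(titles, rows)
```

Each row produced this way corresponds to one single-row HTML table in the output, which is how the `columns` limit takes effect.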

def compare_images( images: Iterable[ArrayLike], *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> None:
def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images.  Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels.  There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)

Compare two images using an interactive slider.

Displays an HTML slider component to interactively swipe between two images. The slider functionality requires that the web browser have Internet access. See additional info in https://github.com/sneas/img-comparison-slider.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
  • images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
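
The embedding step of `compare_images` (base64-encoding two PNG buffers and substituting them into an HTML template) can be sketched as follows. `TEMPLATE` is a hypothetical, heavily simplified stand-in for the library's private `_IMAGE_COMPARISON_HTML`, whose real contents are not shown here, and `comparison_html` is a name invented for this example. Arbitrary byte strings stand in for PNG data.

```python
import base64

# Hypothetical simplified template; the real one is not reproduced here.
TEMPLATE = (
    '<img-comparison-slider>'
    '<img slot="first" src="data:image/png;base64,{b64_1}"/>'
    '<img slot="second" src="data:image/png;base64,{b64_2}"/>'
    '</img-comparison-slider>'
)


def comparison_html(png_datas):
  """Embeds exactly two byte strings into the template as base64 data URIs."""
  if len(png_datas) != 2:
    raise ValueError('The number of images must be 2.')
  b64_1, b64_2 = [base64.b64encode(d).decode('utf-8') for d in png_datas]
  # Plain `str.replace` leaves any other braces in a template untouched,
  # unlike `str.format`, which would require them to be escaped.
  return TEMPLATE.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)


html = comparison_html([b'first image bytes', b'second image bytes'])
print(html[:60])
```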
def show_video( images: Iterable[np.ndarray], *, title: str | None = None, **kwargs: Any) -> str | None:
def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)

Displays a video in the IPython notebook and optionally saves it to a file.

See show_videos.

>>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
>>> show_video(video, title='River video')
>>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
>>> show_video(read_video('/tmp/river.mp4'))
Arguments:
  • images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D arrays).
  • title: Optional text shown centered above the video.
  • **kwargs: See show_videos.
Returns:

The HTML string if return_html is True; otherwise None.
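
Because `show_video` forwards `fps` and `codec` to `show_videos`, the GIF caveat documented there applies here as well: GIF frame delays are stored in 10 ms units, so frames are skipped when the frame period is not a whole number of those units. The check below is a small sketch of that arithmetic. The name `gif_period_is_exact` is hypothetical and not part of mediapy.

```python
from fractions import Fraction


def gif_period_is_exact(fps):
  """Returns True if the frame period 1/fps is a whole number of 10 ms units."""
  # Work in exact rational arithmetic to avoid floating-point rounding.
  rate = Fraction(fps).limit_denominator(1000)
  period_in_centiseconds = Fraction(100) / rate
  return period_in_centiseconds.denominator == 1


for fps in (20.0, 25.0, 50.0, 30.0, 60.0):
  print(fps, gif_period_is_exact(fps))
```

The rates 20.0, 25.0, and 50.0 give periods of 5, 4, and 2 delay units respectively, whereas 30.0 and 60.0 do not divide evenly.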

def show_videos( videos: Iterable[Iterable[np.ndarray]] | Mapping[str, Iterable[np.ndarray]], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, codec: str = 'h264', ylabel: str = '', html_class: str = 'show_videos', return_html: bool = False, **kwargs: Any) -> str | None:
def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
      must be an iterable of images.  If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 or gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None

Displays a row of videos in the IPython notebook.

Creates HTML with <video> tags containing embedded H264-encoded bytestrings. If codec is set to 'gif', we instead use <img> tags containing embedded GIF-encoded bytestrings. Note that the resulting GIF animations skip frames when the fps period is not a multiple of 10 ms units (GIF frame delay units). Encoding at fps = 20.0, 25.0, or 50.0 works fine.
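
The 10 ms constraint on GIF framerates can be checked before encoding. The following is a minimal sketch, not part of mediapy: the helper name `gif_fps_is_exact` is hypothetical, and it only tests whether the frame period 1/fps is a whole number of 10 ms GIF delay units.

```python
def gif_fps_is_exact(fps: float, tolerance: float = 1e-9) -> bool:
  """Returns True if the frame period 1/fps is a whole number of 10 ms units."""
  if fps <= 0.0:
    raise ValueError(f'Frame-per-second value {fps} is invalid.')
  period_units = 100.0 / fps  # Frame period measured in 10 ms delay units.
  nearest = round(period_units)
  # A period shorter than one delay unit cannot be represented either.
  return nearest >= 1 and abs(period_units - nearest) <= tolerance


# Hypothetical usage: report which candidate framerates map to exact delays.
for candidate_fps in (20.0, 25.0, 50.0, 30.0, 60.0):
  print(candidate_fps, gif_fps_is_exact(candidate_fps))
```

Under this check, 20.0, 25.0, and 50.0 pass (periods of 50, 40, and 20 ms), whereas 30.0 and 60.0 do not.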

If a directory has been specified using set_show_save_dir, also saves each titled video to a file in that directory based on its title.

Arguments:
  • videos: Iterable of videos, or dictionary of {title: video}. Each video must be an iterable of images. If a video object has a metadata (VideoMetadata) attribute, its fps field provides a default framerate.
  • titles: Optional strings shown above the corresponding videos.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each video whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of videos per row.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
  • bps: Bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • codec: Compression algorithm; must be either 'h264' or 'gif'.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • return_html: If True return the raw HTML str instead of displaying.
  • **kwargs: Additional parameters (border, loop, autoplay) for html_from_compressed_video.
Returns:

html string if return_html is True.
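
How `columns` splits the videos into rows can be sketched as follows. This is an illustration only: `chunk_rows` is a hypothetical stand-in for the private `_chunked` helper called in the source above, and treating `columns=None` as "everything in one row" is an assumption about that helper.

```python
import itertools
from collections.abc import Iterable, Iterator
from typing import Optional, TypeVar

T = TypeVar('T')


def chunk_rows(
    items: Iterable[T], columns: Optional[int] = None
) -> Iterator[tuple[T, ...]]:
  """Yields successive rows containing at most `columns` items each."""
  if columns is not None and columns < 1:
    raise ValueError(f'columns={columns} must be positive or None.')
  iterator = iter(items)
  while True:
    # islice(iterator, None) consumes everything, giving a single row.
    row = tuple(itertools.islice(iterator, columns))
    if not row:
      return
    yield row


# Hypothetical usage: five titled videos laid out with two per row.
print(list(chunk_rows(['a', 'b', 'c', 'd', 'e'], columns=2)))
```

Each yielded row corresponds to one single-row HTML table in the output.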

def read_image( path_or_url: str | os.PathLike[str], *, apply_exif_transpose: bool = True, dtype: DTypeLike = None) -> np.ndarray:
def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)

Returns an image read from a file path or URL.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • path_or_url: Path of input file.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
def write_image( path: str | os.PathLike[str], image: ArrayLike, fmt: str = 'png', **kwargs: Any) -> None:
def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)

Writes an image to a file.

Encoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

File format is explicitly provided by fmt and not inferred by path.

Arguments:
  • path: Path of output file.
  • image: Array-like object. If its type is float, it is converted to np.uint8 using to_uint8 (thus clamping the input to the range [0.0, 1.0]). Otherwise it must be np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Additional parameters for PIL.Image.save().
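
The effect of the float-to-`uint8` conversion on a single sample can be sketched in plain Python. This is an illustration under assumptions, not the library code: `float01_to_uint8` is a hypothetical name, and the round-to-nearest step is assumed; only the clamping to [0.0, 1.0] is stated in the documentation above.

```python
def float01_to_uint8(value: float) -> int:
  """Maps a float sample to the integer range [0, 255], clamping first."""
  clamped = min(max(value, 0.0), 1.0)  # Out-of-range inputs saturate.
  return int(clamped * 255.0 + 0.5)  # Assumed round-to-nearest scaling.


# Hypothetical usage: values outside [0.0, 1.0] saturate at 0 or 255.
for sample in (-0.2, 0.0, 0.5, 1.0, 1.5):
  print(sample, float01_to_uint8(sample))
```

A practical consequence is that float images holding values outside [0.0, 1.0] (for example, unnormalized data) lose that range on writing, so normalize beforehand if it matters.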
def read_video( path_or_url: str | os.PathLike[str], **kwargs: Any) -> mediapy._VideoArray:
def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)

Returns an array containing all images read from a compressed video file.

>>> video = read_video('/tmp/river.mp4')
>>> print(f'The framerate is {video.metadata.fps} frames/s.')
>>> show_video(video)
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> show_video(read_video(url))
Arguments:
  • path_or_url: Input video file.
  • **kwargs: Additional parameters for VideoReader.
Returns:

A 4D numpy array with dimensions (frame, height, width, channel), or a 3D array if output_format is specified as 'gray'. The returned array has an attribute metadata containing VideoMetadata information. This enables show_video to retrieve the framerate in metadata.fps. Note that the metadata attribute is lost in most subsequent numpy operations.

def write_video( path: str | os.PathLike[str], images: Iterable[np.ndarray], **kwargs: Any) -> None:
def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape = first_image.shape[0], first_image.shape[1]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)

Writes images to a compressed video file.

>>> video = moving_circle((480, 640), num_images=60)
>>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
>>> show_video(read_video('/tmp/v.mp4'))
Arguments:
  • path: Output video file.
  • images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D arrays.
  • **kwargs: Additional parameters for VideoWriter.
class VideoReader(_VideoIO):
class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from. The default is 0.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None

Context to read a compressed video as an iterable over its images.

>>> with VideoReader('/tmp/river.mp4') as reader:
...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
...   for image in reader:
...     print(image.shape)
>>> with VideoReader('/tmp/river.mp4') as reader:
...   video = np.array(tuple(reader))
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> with VideoReader(url) as reader:
...   show_video(reader)
Attributes:
  • path_or_url: Location of input video.
  • output_format: Format of output images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) with R, G, B values. If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Data type for output images. The default is np.uint8. Use of np.uint16 allows reading 10-bit or 12-bit data without precision loss.
  • metadata: Object storing the information retrieved from the video header. Its attributes are copied as attributes in this class.
  • num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
  • stream_index: The stream index to read from. The default is 0.
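
The `shape`, `output_format`, and `dtype` attributes together determine how many raw bytes make up one decoded frame. The following sketch mirrors the arithmetic in the class source above; the function name `frame_num_bytes` and its `bytes_per_channel` parameter (1 for `np.uint8`, 2 for `np.uint16`) are hypothetical and introduced only for illustration.

```python
import math

# Channels per pixel for each supported output format.
_NUM_CHANNELS = {'rgb': 3, 'yuv': 3, 'gray': 1}


def frame_num_bytes(
    shape: tuple[int, int],
    output_format: str = 'rgb',
    bytes_per_channel: int = 1,
) -> int:
  """Returns the size in bytes of one raw frame with the given layout."""
  if output_format not in _NUM_CHANNELS:
    raise ValueError(f'Output format {output_format} is not rgb, yuv, or gray.')
  if bytes_per_channel not in (1, 2):
    raise ValueError('bytes_per_channel must be 1 (uint8) or 2 (uint16).')
  return math.prod(shape) * _NUM_CHANNELS[output_format] * bytes_per_channel


# Hypothetical usage: a 480x640 RGB frame of uint8 samples.
print(frame_num_bytes((480, 640), 'rgb', 1))
```

This is useful for estimating memory before loading a whole video: the total is roughly `num_images` times the per-frame size.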
VideoReader( path_or_url: str | os.PathLike[str], *, stream_index: int = 0, output_format: str = 'rgb', dtype: DTypeLike = <class 'numpy.uint8'>)
path_or_url: str | os.PathLike[str]
output_format: str
dtype: ~_DType
metadata: VideoMetadata
num_images: int
shape: tuple[int, int]
fps: float
bps: int | None
stream_index: int
def read(self) -> np.ndarray | None:

Reads a video image frame (or None if at end of file).

Returns:

A numpy array in the format specified by output_format, i.e., a 3D array with 3 color channels, except for format 'gray' which is 2D.
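
For the 'yuv' format, the source reorders the decoded data from planar layout (all Y samples, then all U, then all V) into per-pixel layout with shape (height, width, 3). A pure-Python sketch of that index mapping is shown below, using nested lists rather than `numpy` arrays; `planar_to_pixel` is a hypothetical name used only for illustration.

```python
def planar_to_pixel(
    flat: list[int], height: int, width: int
) -> list[list[list[int]]]:
  """Converts planar samples [Y..., U..., V...] to a height x width x 3 list."""
  plane_size = height * width
  if len(flat) != 3 * plane_size:
    raise ValueError(f'Expected {3 * plane_size} samples, got {len(flat)}.')
  # Sample for channel c at pixel (y, x) sits at c * plane_size + y * width + x.
  return [
      [
          [flat[c * plane_size + y * width + x] for c in range(3)]
          for x in range(width)
      ]
      for y in range(height)
  ]


# Hypothetical usage: a 1x2 frame with Y=(10, 11), U=(20, 21), V=(30, 31).
print(planar_to_pixel([10, 11, 20, 21, 30, 31], height=1, width=2))
```

The 'rgb' and 'gray' formats need no such reordering; their data is only reshaped.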

def close(self) -> None:

Terminates video reader. (Called automatically at end of context.)

class VideoWriter(_VideoIO):
class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
      format.  The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames.  The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb').  If 'rgb', each image
      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = _run_ffmpeg(
          command,
          stdin=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_output_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization.  If
        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1752      image = np.moveaxis(image, 2, 0)
1753    data = image.tobytes()
1754    stdin = self._proc.stdin
1755    assert stdin is not None
1756    if stdin.write(data) != len(data):
1757      self._proc.wait()
1758      stderr = self._proc.stderr
1759      assert stderr is not None
1760      s = stderr.read().decode('utf-8')
1761      raise RuntimeError(f"Error writing '{self.path}': {s}")
1762
1763  def close(self) -> None:
1764    """Finishes writing the video.  (Called automatically at end of context.)"""
1765    if self._popen:
1766      assert self._proc, 'Error: closing an already closed context.'
1767      stdin = self._proc.stdin
1768      assert stdin is not None
1769      stdin.close()
1770      if self._proc.wait():
1771        stderr = self._proc.stderr
1772        assert stderr is not None
1773        s = stderr.read().decode('utf-8')
1774        raise RuntimeError(f"Error writing '{self.path}': {s}")
1775      self._popen.__exit__(None, None, None)
1776      self._popen = None
1777      self._proc = None
1778    if self._write_via_local_file:
1779      # pylint: disable-next=no-member
1780      self._write_via_local_file.__exit__(None, None, None)
1781      self._write_via_local_file = None

Context to write a compressed video.

>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))

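Bitrate control may be specified using at most one of bps, qp, or crf. If none is specified and no ffmpeg_args are given, qp defaults to 20 for frames of at most 640 x 480 pixels and to 28 for larger frames. For background on rate control, see https://slhck.info/video/2017/03/01/rate-control.html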

If codec is 'gif', the args bps, qp, crf, and encoded_format are ignored.

Attributes:
  • path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
  • shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
  • codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
  • bps: Requested average bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • crf: Constant rate factor for video compression quality (default None).
  • ffmpeg_args: Additional arguments for ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
  • input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Expected data type for input images (any float input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
  • encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
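
The default selection of encoded_format and qp can be mirrored in a few lines. This is a minimal standalone sketch, not the library's code path: `default_encoded_format` and `default_qp` are hypothetical helper names, and the qp thresholds are assumed from the constructor source (20 for frames of at most 640 x 480 pixels, 28 otherwise, applied only when no bps, qp, crf, or ffmpeg_args is given).

```python
import math


def default_encoded_format(shape: tuple[int, int]) -> str:
  """Hypothetical helper: pixel format chosen when encoded_format is None."""
  # Subsampled chroma (yuv420p) requires an even height and an even width.
  all_even = all(dim % 2 == 0 for dim in shape)
  return 'yuv420p' if all_even else 'yuv444p'


def default_qp(shape: tuple[int, int]) -> int:
  """Hypothetical helper: qp used when no rate control is requested."""
  return 20 if math.prod(shape) <= 640 * 480 else 28


print(default_encoded_format((480, 640)))  # yuv420p
print(default_encoded_format((481, 640)))  # yuv444p
print(default_qp((480, 640)), default_qp((1080, 1920)))  # 20 28
```

An odd dimension therefore falls back to full-resolution chroma rather than raising, unless a subsampled encoded_format is requested explicitly.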
VideoWriter(path: str | os.PathLike[str], shape: tuple[int, int], *, codec: str = 'h264', metadata: VideoMetadata | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, crf: float | None = None, ffmpeg_args: str | Sequence[str] = '', input_format: str = 'rgb', dtype: DTypeLike = np.uint8, encoded_format: str | None = None)
  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None
def add_image(self, image: np.ndarray) -> None:

Writes a video frame.

Arguments:
  • image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
Raises:
  • RuntimeError: If there is an error writing to the output file.
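
The layout rules above can be illustrated without ffmpeg. The following is a minimal sketch under the assumption that only the array layout matters: `prepare_frame` is a hypothetical helper, not part of the library, and it leaves out the dtype conversion and the frame-size check.

```python
import numpy as np


def prepare_frame(image: np.ndarray, input_format: str = 'rgb') -> np.ndarray:
  """Hypothetical helper mirroring the layout rules described for add_image."""
  if input_format == 'gray':
    if image.ndim != 2:
      raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    return image
  if image.ndim == 2 and input_format == 'rgb':
    # A 2D image is interpreted as grayscale and replicated into R, G, B.
    image = np.dstack((image, image, image))
  if not (image.ndim == 3 and image.shape[2] == 3):
    raise ValueError(f'Image dimensions {image.shape} are invalid.')
  if input_format == 'yuv':
    # Per-pixel (H, W, 3) YUV becomes planar (3, H, W) before serialization.
    image = np.moveaxis(image, 2, 0)
  return image


gray = np.zeros((4, 6), dtype=np.uint8)
print(prepare_frame(gray, 'rgb').shape)  # (4, 6, 3)
yuv = np.zeros((4, 6, 3), dtype=np.uint8)
print(prepare_frame(yuv, 'yuv').shape)  # (3, 4, 6)
```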
def close(self) -> None:

Finishes writing the video. (Called automatically at end of context.)

class VideoMetadata(typing.NamedTuple):
class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.  We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None

Represents the data stored in a video container header.

Attributes:
  • num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. We set the value to -1 if number of frames is not found in the header.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
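
Since VideoWriter takes fps and bps from a metadata object when they are not passed explicitly, the fallback order is worth spelling out. This is a minimal sketch under stated assumptions: `Metadata` is a local stand-in with the same fields as VideoMetadata, and `resolve_fps` is a hypothetical helper that reproduces only the fps part of that logic.

```python
from typing import NamedTuple, Optional


class Metadata(NamedTuple):
  """Local stand-in with the same fields as VideoMetadata (illustration only)."""

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: Optional[int]


def resolve_fps(
    fps: Optional[float], metadata: Optional[Metadata], codec: str = 'h264'
) -> float:
  """Hypothetical helper: explicit fps wins, then metadata, then the default."""
  if fps is None and metadata:
    fps = metadata.fps
  if fps is None:
    fps = 25.0 if codec == 'gif' else 60.0
  return fps


meta = Metadata(num_images=-1, shape=(480, 640), fps=30.0, bps=None)
print(resolve_fps(None, meta))  # 30.0
print(resolve_fps(24.0, meta))  # 24.0
print(resolve_fps(None, None, codec='gif'))  # 25.0
```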
def compress_image(image: ArrayLike, *, fmt: str = 'png', **kwargs: Any) -> bytes:
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()

Returns a buffer containing a compressed image.

Arguments:
  • image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Options for PIL.save(), e.g. optimize=True for greater compression.
def decompress_image(data: bytes, dtype: DTypeLike = None, apply_exif_transpose: bool = True) -> np.ndarray:
def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)

Returns an image from a compressed data buffer.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • data: Buffer containing compressed image.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
def compress_video(images: Iterable[np.ndarray], *, codec: str = 'h264', **kwargs: Any) -> bytes:
def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
  and mp4 otherwise.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()

Returns a buffer containing a compressed video.

The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec, and mp4 otherwise.

>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))
>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
Arguments:
  • images: Iterable over video frames.
  • codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • **kwargs: Additional parameters for VideoWriter.
Returns:

A bytes buffer containing the compressed video.
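
The codec-to-container rule stated above can be written as a small lookup. This is only a sketch: `container_suffix` is a hypothetical name, and the library's private `_filename_suffix_from_codec` may be implemented differently.

```python
def container_suffix(codec: str) -> str:
  """Hypothetical helper: file suffix for the container described above."""
  if codec == 'gif':
    return '.gif'
  if codec == 'vp9':
    return '.webm'
  return '.mp4'  # All other codecs (e.g., 'h264', 'hevc') use an MP4 container.


for codec in ('h264', 'hevc', 'vp9', 'gif'):
  print(codec, container_suffix(codec))
```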

def decompress_video(data: bytes, **kwargs: Any) -> np.ndarray:
def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)

Returns video images from an MP4-compressed data buffer.

def html_from_compressed_image(data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, pixelated: bool = True, fmt: str = 'png') -> str:
def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with an image tag containing encoded data.

Arguments:
  • data: Compressed image bytes.
  • width: Width of HTML image in pixels.
  • height: Height of HTML image in pixels.
  • title: Optional text shown centered above image.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
  • fmt: Compression encoding.
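
The core of this function is embedding the compressed bytes as a base64 data URI. Below is a minimal sketch of that idea using only the standard library; `img_tag` is a hypothetical, reduced helper (no title, border, or pixelation options), and the payload is placeholder bytes rather than a real PNG.

```python
import base64


def img_tag(data: bytes, width: int, height: int, fmt: str = 'png') -> str:
  """Hypothetical helper: embeds compressed image bytes as a base64 data URI."""
  b64 = base64.b64encode(data).decode('utf-8')
  return (
      f'<img width="{width}" height="{height}"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


# Placeholder bytes standing in for the output of compress_image().
payload = b'\x89PNG placeholder bytes'
tag = img_tag(payload, 64, 64)

# The embedded payload can be recovered by decoding the data URI.
encoded = tag.split('base64,')[1].split('"')[0]
print(base64.b64decode(encoded) == payload)  # True
```

Because the image travels inside the HTML string itself, the result stays viewable when a notebook is saved or shared without the original files.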
def html_from_compressed_video(data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, loop: bool = True, autoplay: bool = True) -> str:
def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with a video tag containing H264-encoded data.

Arguments:
  • data: MP4-compressed video bytes.
  • width: Width of HTML video in pixels.
  • height: Height of HTML video in pixels.
  • title: Optional text shown centered above the video.
  • border: If bool, whether to place a black boundary around the video, or if str, the boundary CSS style.
  • loop: If True, the playback repeats forever.
  • autoplay: If True, video playback starts without having to click.
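
How the loop and autoplay flags translate into tag attributes can be sketched as follows. `video_options` is a hypothetical, reduced helper (border styling omitted); the pairing of autoplay with muted follows the source listing, and the stated reason for it is an assumption about typical browser autoplay policies.

```python
def video_options(
    width: int, height: int, loop: bool = True, autoplay: bool = True
) -> str:
  """Hypothetical helper: attribute string for the <video> tag."""
  return (
      f'controls width="{width}" height="{height}"'
      f'{" loop" if loop else ""}'
      # Assumption: browsers generally permit autoplay only for muted videos.
      f'{" autoplay muted" if autoplay else ""}'
  )


print(video_options(320, 240))
print(video_options(320, 240, loop=False, autoplay=False))
```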
def resize_image(image: ArrayLike, shape: tuple[int, int]) -> np.ndarray:
def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )

Resizes image to specified spatial dimensions using a Lanczos filter.

Arguments:
  • image: Array-like 2D or 3D object, where dtype is uint or floating-point.
  • shape: 2D spatial dimensions (height, width) of output image.
Returns:

A resampled image whose spatial dimensions match shape.

def resize_video(video: Iterable[np.ndarray], shape: tuple[int, int]) -> np.ndarray:
def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])

Resizes video to specified spatial dimensions using a Lanczos filter.

Arguments:
  • video: Iterable of images.
  • shape: 2D spatial dimensions (height, width) of output video.
Returns:

A resampled video whose spatial dimensions match shape.

def to_rgb(array: ArrayLike, *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> np.ndarray:
def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.7.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = cast(_NDArray, rgb_from_scalar(a))
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
    a = a[..., :3]
  return a

Maps scalar values to RGB using value bounds and a color map.

Arguments:
  • array: Scalar values, with arbitrary shape.
  • vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
  • vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
  • cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:

A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
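
The affine remapping step can be tried in isolation. This is a minimal sketch of the normalization only, with the color-map lookup left out: `normalize` is a hypothetical helper, not a library function.

```python
import numpy as np


def normalize(array, vmin=None, vmax=None) -> np.ndarray:
  """Hypothetical helper: affine map from [vmin, vmax] to [0.0, 1.0]."""
  a = np.asarray(array, dtype=float)
  finite = np.isfinite(a)
  # Non-finite entries are replaced by +/-inf so they never win the min or max.
  if vmin is None:
    vmin = np.amin(np.where(finite, a, np.inf))
  if vmax is None:
    vmax = np.amax(np.where(finite, a, -np.inf))
  # The epsilon avoids a division by zero for constant arrays.
  return (a - vmin) / (vmax - vmin + np.finfo(float).eps)


values = np.array([2.0, 4.0, 6.0, np.nan])
print(normalize(values))  # NaN stays NaN; finite values span 0.0 to 1.0.
```

Passing explicit vmin and vmax keeps the mapping consistent across several arrays, which matters when images are compared side by side.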

def to_type(array: ArrayLike, dtype: DTypeLike) -> np.ndarray:
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result

Returns media array converted to specified type.

A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.

Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.

Arguments:
  • a: Input array-like object (floating-point, unsigned int, or bool).
  • dtype: Desired output type (floating-point or unsigned int).

Returns:
  Array a if it is already of the specified dtype, else a converted array.
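
The scaling rules above can be illustrated with a NumPy-only sketch. It does not call mediapy; the helper names float01_to_uint8 and uint8_to_float32 are hypothetical and cover only the uint8 case, mirroring the clamp, scale, and round-to-nearest steps visible in the source listing.

```python
import numpy as np


def float01_to_uint8(a):
  """Hypothetical helper: clamp to [0, 1], scale to [0, 255], round to nearest."""
  a = np.clip(np.asarray(a, dtype=np.float32), 0.0, 1.0)
  # Adding 0.5 before the truncating cast rounds to the nearest integer.
  return (a * np.float32(255) + 0.5).astype(np.uint8)


def uint8_to_float32(a):
  """Hypothetical helper: map uint8 0 -> 0.0 and 255 -> 1.0."""
  return np.asarray(a, dtype=np.uint8).astype(np.float32) / np.float32(255)


values = np.array([-0.2, 0.0, 0.5, 1.0, 1.7], dtype=np.float32)
as_uint8 = float01_to_uint8(values)      # Out-of-range inputs are clamped.
round_trip = uint8_to_float32(as_uint8)  # Endpoints map back to 0.0 and 1.0.

# A bool input maps True to the maximum value of the unsigned type.
mask_uint8 = np.array([True, False]).astype(np.uint8) * np.uint8(255)
```

Because of the clamp, the float-to-integer direction is lossy for values outside [0.0, 1.0], so a round trip recovers only the clamped values.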

def to_float01(a: ArrayLike, dtype: DTypeLike = np.float32) -> np.ndarray:
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)

If array has unsigned integers, rescales them to the range [0.0, 1.0].

Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.

Arguments:
  • a: Input array.
  • dtype: Desired floating-point type if rescaling occurs.

Returns:
  A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.
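
One consequence of the source listing is worth spelling out: a floating-point input is passed through untouched, even if its dtype differs from the requested dtype or its values lie outside [0.0, 1.0]. The following NumPy-only sketch reproduces that control flow; to_float01_sketch is a hypothetical stand-in that handles only unsigned-integer and floating-point inputs (not bool) and does not call mediapy.

```python
import numpy as np


def to_float01_sketch(a, dtype=np.float32):
  """Hypothetical stand-in for to_float01; handles only uint and float inputs."""
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a  # Returned as-is: no copy, no rescaling, no dtype change.
  return a.astype(dtype) / dtype.type(np.iinfo(a.dtype).max)


pixels16 = np.array([0, 65535], dtype=np.uint16)
scaled = to_float01_sketch(pixels16)  # Rescaled to float32 in [0.0, 1.0].

already_float = np.array([0.25, 2.0], dtype=np.float64)
unchanged = to_float01_sketch(already_float)  # Same object, still float64.
```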

def to_uint8(a: ArrayLike) -> np.ndarray:
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)

Returns array converted to uint8 values; see to_type.

def set_output_height(num_pixels: int) -> None:
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Overrides the height of the current output cell, if using Colab.

def set_max_output_height(num_pixels: int) -> None:
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Sets the maximum height of the current output cell, if using Colab.
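
The doubled braces in the f-string of set_max_output_height are escapes that emit literal `{` and `}`. As a small sketch, the snippet below builds the same JavaScript string for an illustrative value of 300 pixels (an assumed value, not a mediapy default) without sending anything to Colab:

```python
num_pixels = 300  # Illustrative value, not a mediapy default.

# Adjacent string literals are concatenated; '{{' and '}}' produce literal braces.
js = (
    'google.colab.output.setIframeHeight('
    f'0, true, {{maxHeight: {num_pixels}}})'
)
print(js)
```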

def color_ramp(shape: tuple[int, int] = (64, 64), *, dtype: DTypeLike = np.float32) -> np.ndarray:
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)

Returns an image of a red-green color gradient.

This is useful for quick experimentation and testing. See also moving_circle to generate a sample video.

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated image.
  • dtype: Type (uint or floating) of resulting pixel values.
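
To see how the gradient is laid out, the two NumPy lines from the source listing can be run on a tiny shape. This is a sketch that does not call mediapy and assumes the usual RGB channel order: the first channel follows the normalized row (y) coordinate, the second follows the normalized column (x) coordinate, and the third is zero.

```python
import numpy as np

shape = (2, 4)  # (height, width), kept tiny so the values are easy to inspect.

# Pixel-center coordinates normalized to (0, 1): channel 0 follows y, channel 1 follows x.
yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
# Append a constant zero third channel.
image = np.insert(yx, 2, 0.0, axis=-1)

top_left = image[0, 0]        # Smallest first and second channel values.
bottom_right = image[-1, -1]  # Largest first and second channel values.
```

Under that assumption, red increases from top to bottom and green increases from left to right.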

def moving_circle(shape: tuple[int, int] = (256, 256), num_images: int = 10, *, dtype: DTypeLike = np.float32) -> np.ndarray:
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])

Returns a video of a circle moving in front of a color ramp.

This is useful for quick experimentation and testing. See also color_ramp to generate a sample image.

>>> show_video(moving_circle((480, 640), 60), fps=60)

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated video.
  • num_images: Number of video frames.
  • dtype: Type (uint or floating) of resulting pixel values.
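
The circle's motion follows from the center and radius expressions in the source listing: the row stays at 60% of the height while the column steps across the width, one slot per frame. The sketch below isolates that mask computation with NumPy only; circle_mask is a hypothetical helper name, and the 20x20 shape with 4 frames is an arbitrary choice for illustration.

```python
import numpy as np


def circle_mask(shape, image_index, num_images):
  """Hypothetical helper: boolean mask of the circle in frame `image_index`."""
  yx = np.moveaxis(np.indices(shape), 0, -1)
  center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
  radius_squared = (min(shape) * 0.1) ** 2
  return np.sum((yx - center) ** 2, axis=-1) < radius_squared


shape, num_images = (20, 20), 4
masks = [circle_mask(shape, i, num_images) for i in range(num_images)]
# Mean column index of the covered pixels in each frame.
mean_columns = [float(np.nonzero(m)[1].mean()) for m in masks]
```

The mean column index grows by width / num_images from one frame to the next, which is what makes the circle appear to move left to right.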

class set_show_save_dir:
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir

Save all titled output from show_*() calls into files.

If the specified directory is not None, all titled images and videos displayed by show_image, show_images, show_video, and show_videos are also saved as files within the directory.

It can be used either to set the state or as a context manager:

>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)

>>> with set_show_save_dir('/tmp'):
...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.

set_show_save_dir(directory: str | os.PathLike[str] | None)
  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory
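
The dual use (plain call or context manager) works because the constructor applies the setting immediately and `__exit__` restores whatever was in effect before. A self-contained sketch of that pattern follows; `_Config` and `set_save_dir_sketch` are hypothetical stand-ins for mediapy's internal configuration object and for the class above.

```python
class _Config:
  """Hypothetical stand-in for mediapy's internal configuration object."""
  show_save_dir = None


_config = _Config()


class set_save_dir_sketch:
  """Sets the directory on construction; restores the old value on exit."""

  def __init__(self, directory):
    self._old = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self):
    pass

  def __exit__(self, *_):
    _config.show_save_dir = self._old


set_save_dir_sketch('/tmp/a')        # Plain call: the setting persists.
persisted = _config.show_save_dir

with set_save_dir_sketch('/tmp/b'):  # Context manager: temporary override.
  inside = _config.show_save_dir
restored = _config.show_save_dir     # Back to the value set by the plain call.
```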

def set_ffmpeg(name_or_path: str | os.PathLike[str]) -> None:
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path

Specifies the name or path for the ffmpeg external program.

The ffmpeg program is required for compressing and decompressing video. (It is used in read_video, write_video, show_video, show_videos, etc.)

Arguments:
  • name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.

def video_is_available() -> bool:
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None

Returns True if the program ffmpeg is found.

See also set_ffmpeg.
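
A rough way to perform this kind of check from Python is the standard-library function `shutil.which`, which accepts either a bare program name looked up on `PATH` or a file path. The sketch below uses it as an assumed stand-in; the actual lookup happens in mediapy's private `_search_for_ffmpeg_path`, whose body is not shown in this section, and `program_is_available` is a hypothetical name.

```python
import shutil


def program_is_available(name_or_path='ffmpeg'):
  """Hypothetical helper: True if the program can be located and executed."""
  return shutil.which(str(name_or_path)) is not None


has_ffmpeg = program_is_available()  # Depends on the local installation.
has_bogus = program_is_available('no-such-program-mediapy-sketch')
```

A check like this is useful as a guard before calling video functions in environments where ffmpeg may not be installed.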