mediapy

mediapy: Read/write/show images and videos in an IPython/Jupyter notebook.

[GitHub source]   [API docs]   [PyPI package]   [Colab example]

See the example notebook, or better yet, open it in Colab.

Image examples

Display an image (2D or 3D numpy array):

checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)

Read and display an image (either local or from the Web):

IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))

Read and display an image from a local file:

!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
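
Both calls accept the same argument because the source reads local paths and URLs through one function. Here is a minimal sketch of that dispatch, in the spirit of `read_contents` and `_is_url` in the source listing. The helper name `read_bytes` and the temporary sample file are illustrative assumptions, not part of mediapy.

```python
import pathlib
import tempfile
import urllib.request

# Prefixes treated as URLs (mirrors the tuple used by `_is_url` in the listing).
URL_PREFIXES = ('http://', 'https://', 'file://')


def read_bytes(path_or_url):
    """Returns the raw bytes behind a local path or a URL (hypothetical helper)."""
    if isinstance(path_or_url, str) and path_or_url.startswith(URL_PREFIXES):
        with urllib.request.urlopen(path_or_url) as response:
            return response.read()
    with open(path_or_url, 'rb') as f:
        return f.read()


# Demonstrate with a temporary local file, addressed both ways.
with tempfile.TemporaryDirectory() as directory:
    path = pathlib.Path(directory) / 'sample.bin'
    path.write_bytes(b'example bytes')
    from_path = read_bytes(path)
    from_url = read_bytes(path.as_uri())  # A 'file://' URL for the same file.
```

The sketch stops at raw bytes. In the listing, `read_image` passes those bytes on to `decompress_image` for decoding.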

Show titled images side-by-side:

images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
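
Fixing `vmin` and `vmax` matters for the `'darkened'` image. The `to_rgb` docstring in the listing says that a bound left as `None` is taken from the array's own finite minimum or maximum. Here is a scalar sketch of that affine mapping, assuming `show_images` forwards the bounds to `to_rgb` for 2D images (that code is outside this excerpt). The function `normalize` is a hypothetical stand-in that works on plain lists.

```python
import sys


def normalize(values, vmin=None, vmax=None):
    """Affinely maps values from [vmin, vmax] to roughly [0.0, 1.0]."""
    vmin = min(values) if vmin is None else vmin
    vmax = max(values) if vmax is None else vmax
    # A tiny epsilon keeps the division defined when vmax == vmin.
    scale = vmax - vmin + sys.float_info.epsilon
    return [(v - vmin) / scale for v in values]


darkened = [0.0, 0.7]  # The two levels of `checkerboard * 0.7`.
auto_range = normalize(darkened)  # Bounds inferred from the data.
fixed_range = normalize(darkened, vmin=0.0, vmax=1.0)  # Explicit bounds.
```

With inferred bounds, the darkened levels stretch back to nearly the full range. With explicit bounds, 0.7 stays at about 0.7, so the image appears darker than the original.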

Compare two images using an interactive slider:

compare_images([checkerboard, np.random.rand(128, 128, 3)])
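
The slider's HTML template in the listing (`_IMAGE_COMPARISON_HTML`) embeds both images inline as base64 `data:` URIs rather than linking to files. Here is a minimal sketch of that embedding step. The helper `image_data_uri` is hypothetical, and the bytes are only the 8-byte PNG signature, a placeholder rather than a complete image.

```python
import base64


def image_data_uri(png_bytes):
    """Returns a data URI that embeds PNG bytes directly in an HTML attribute."""
    b64 = base64.b64encode(png_bytes).decode('utf-8')
    return f'data:image/png;base64,{b64}'


# Placeholder payload: the PNG file signature only, not a decodable image.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
uri = image_data_uri(png_signature)
img_tag = f'<img slot="first" src="{uri}" />'
```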

Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):

video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
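
Because a video may be any iterable, including a one-shot generator, code that needs to inspect the first frame must not consume it. The listing handles this in `_peek_first` with `itertools.tee`. Here is a minimal sketch of the same idea, using strings as stand-ins for frames; the name `peek_first` is illustrative.

```python
import itertools


def peek_first(iterable):
    """Returns (first_element, replay), where replay still yields every element."""
    peeker, replay = itertools.tee(iterable)
    return next(peeker), replay


# A generator can be consumed only once, which is what makes peeking delicate.
frames = (f'frame{i}' for i in range(3))
first, frames = peek_first(frames)
remaining = list(frames)
```

After the call, `remaining` still starts with the first frame, since `tee` buffers the element that the peeking iterator already pulled.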

Show the video frames side-by-side:

show_images(video, columns=6, border=True, height=64)
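
Laying out ten frames with `columns=6` implies rows of at most six items. The listing contains a `_chunked` helper that does this kind of grouping; whether `show_images` uses it for the layout is an assumption here, since that code is outside this excerpt. Here is a sketch of the idiom, using integers as stand-ins for frames.

```python
import functools
import itertools


def chunked(iterable, n=None):
    """Yields tuples of at most n consecutive elements (all of them if n is None)."""

    def take(count, iterator):
        return tuple(itertools.islice(iterator, count))

    # iter(callable, sentinel) keeps calling take() until it returns ().
    return iter(functools.partial(take, n, iter(iterable)), ())


rows = list(chunked(range(10), 6))
single_row = list(chunked(range(3)))
```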

Show the frames with their indices:

show_images({f'{i}': image for i, image in enumerate(video)}, width=32)

Read and display a video (either local or from the Web):

VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))

Create and display a looping two-frame GIF video:

image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')

Darken a video frame-by-frame:

output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
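
The darkening step relies on the conversion convention documented for `to_type` in the listing: uint(0) maps to 0.0, uint(MAX) maps to 1.0, and floats are clamped to [0.0, 1.0] before being scaled back. Here is a scalar sketch of that rule for `uint8`, in pure Python rather than NumPy. Both function names are hypothetical, and the idea that `VideoWriter.add_image` converts float frames back this way is an assumption, since that code is outside this excerpt.

```python
def float01_from_uint8(value):
    """Maps an integer in [0, 255] to a float in [0.0, 1.0]."""
    return value / 255


def uint8_from_float01(value):
    """Clamps to [0.0, 1.0], scales to [0, 255], and rounds to nearest."""
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255 + 0.5)


# Darken three sample pixel values by half, as the lambda above does per frame.
darkened = [uint8_from_float01(float01_from_uint8(v) * 0.5) for v in (0, 100, 255)]
# Out-of-range floats are clamped rather than wrapped around.
clamped = [uint8_from_float01(v) for v in (-0.2, 1.3)]
```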
# Copyright 2024 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy)  
[**[API docs]**](https://google.github.io/mediapy/)  
[**[PyPI package]**](https://pypi.org/project/mediapy/)  
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.2'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
from typing import Any
import urllib.request

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

# Only `typing` and `typing.Any` are imported above, so the typing names below
# are referenced through the `typing` module.
if typing.TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = np.ndarray[Any, Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = typing.TypeVar('_ArrayLike')
  _DTypeLike = typing.TypeVar('_DTypeLike')
  _NDArray = typing.TypeVar('_NDArray')
  _DType = typing.TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 20_000_000
_T = typing.TypeVar('_T')
_Path = typing.Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired from https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)  # type: ignore[no-untyped-call]
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    with urllib.request.urlopen(path_or_url) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  The file format is given explicitly by `fmt`; it is not inferred from `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
 817) -> _NDArray:
 818  """Maps scalar values to RGB using value bounds and a color map.
 819
 820  Args:
 821    array: Scalar values, with arbitrary shape.
 822    vmin: Explicit min value for remapping; if None, it is obtained as the
 823      minimum finite value of `array`.
 824    vmax: Explicit max value for remapping; if None, it is obtained as the
 825      maximum finite value of `array`.
 826    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
 827      color.
 828
 829  Returns:
 830    A new array in which each element is affinely mapped from [vmin, vmax]
 831    to [0.0, 1.0] and then color-mapped.
 832  """
 833  a = _as_valid_media_array(array)
 834  del array
 835  # For future numpy version 1.7.0:
 836  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
 837  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
 838  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
 839  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
 840  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
 841  if isinstance(cmap, str):
 842    if hasattr(matplotlib, 'colormaps'):
 843      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
 844    else:
 845      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # type: ignore # pylint: disable=no-member
 846  else:
 847    rgb_from_scalar = cmap
 848  a = rgb_from_scalar(a)
 849  # If there is a fully opaque alpha channel, remove it.
 850  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3]) == 1.0):
 851    a = a[..., :3]
 852  return a
 853
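# Illustrative sketch (not part of mediapy): the affine remapping performed by
# `to_rgb` before color-mapping, written without numpy.  The name
# `_example_normalize` is hypothetical.  Non-finite values are ignored when
# inferring the bounds, and a small epsilon avoids division by zero when
# vmin == vmax.  Unlike `to_rgb`, this sketch assumes that at least one value
# is finite whenever a bound must be inferred.
import math
import sys


def _example_normalize(values, vmin=None, vmax=None):
  """Maps each value affinely from [vmin, vmax] to about [0.0, 1.0]."""
  finite = [v for v in values if math.isfinite(v)]
  vmin = min(finite) if vmin is None else vmin
  vmax = max(finite) if vmax is None else vmax
  scale = vmax - vmin + sys.float_info.epsilon
  return [(v - vmin) / scale for v in values]


# For instance, _example_normalize([2.0, 3.0, 4.0]) yields values close to
# [0.0, 0.5, 1.0], and a constant input such as [5.0, 5.0] maps to [0.0, 0.0].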
 854
 855def compress_image(
 856    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
 857) -> bytes:
 858  """Returns a buffer containing a compressed image.
 859
 860  Args:
 861    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
 862    fmt: Desired compression encoding, e.g. 'png'.
 863    **kwargs: Options for `PIL.Image.save()`, e.g. `optimize=True` for greater
 864      compression.
 865  """
 866  image = _as_valid_media_array(image)
 867  with io.BytesIO() as output:
 868    _pil_image(image).save(output, format=fmt, **kwargs)
 869    return output.getvalue()
 870
 871
 872def decompress_image(
 873    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
 874) -> _NDArray:
 875  """Returns an image from a compressed data buffer.
 876
 877  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
 878  or 4 channels and `uint16` images with a single channel.
 879
 880  Args:
 881    data: Buffer containing compressed image.
 882    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
 883      is inferred automatically.
 884    apply_exif_transpose: If True, rotate image according to EXIF orientation.
 885  """
 886  pil_image = PIL.Image.open(io.BytesIO(data))
 887  if apply_exif_transpose:
 888    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
 889    assert tmp_image
 890    pil_image = tmp_image
 891  if dtype is None:
 892    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
 893  return np.array(pil_image, dtype=dtype)
 894
 895
 896def html_from_compressed_image(
 897    data: bytes,
 898    width: int,
 899    height: int,
 900    *,
 901    title: str | None = None,
 902    border: bool | str = False,
 903    pixelated: bool = True,
 904    fmt: str = 'png',
 905) -> str:
 906  """Returns an HTML string with an image tag containing encoded data.
 907
 908  Args:
 909    data: Compressed image bytes.
 910    width: Width of HTML image in pixels.
 911    height: Height of HTML image in pixels.
 912    title: Optional text shown centered above image.
 913    border: If `bool`, whether to place a black boundary around the image, or if
 914      `str`, the boundary CSS style.
 915    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
 916    fmt: Compression encoding.
 917  """
 918  b64 = base64.b64encode(data).decode('utf-8')
 919  if isinstance(border, str):
 920    border = f'{border}; '
 921  elif border:
 922    border = 'border:1px solid black; '
 923  else:
 924    border = ''
 925  s_pixelated = 'pixelated' if pixelated else 'auto'
 926  s = (
 927      f'<img width="{width}" height="{height}"'
 928      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
 929      f' src="data:image/{fmt};base64,{b64}"/>'
 930  )
 931  if title is not None:
 932    s = f"""<div style="display:flex; align-items:left;">
 933      <div style="display:flex; flex-direction:column; align-items:center;">
 934      <div>{title}</div><div>{s}</div></div></div>"""
 935  return s
 936
 937
 938def _get_width_height(
 939    width: int | None, height: int | None, shape: tuple[int, int]
 940) -> tuple[int, int]:
 941  """Returns (width, height) given optional parameters and image shape."""
 942  assert len(shape) == 2, shape
 943  if width and height:
 944    return width, height
 945  if width and not height:
 946    return width, int(width * (shape[0] / shape[1]) + 0.5)
 947  if height and not width:
 948    return int(height * (shape[1] / shape[0]) + 0.5), height
 949  return shape[::-1]
 950
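# Illustrative sketch (not part of mediapy): a standalone restatement of the
# rule in `_get_width_height`, included only to show worked values.  The name
# `_example_display_size` is hypothetical.  `shape` is (rows, columns); when
# only one of `width` or `height` is given, the other is derived so that the
# aspect ratio is preserved, rounding to the nearest pixel.
def _example_display_size(width, height, shape):
  """Returns the displayed (width, height) for an image of the given shape."""
  rows, cols = shape
  if width and height:
    return width, height
  if width:
    return width, int(width * (rows / cols) + 0.5)
  if height:
    return int(height * (cols / rows) + 0.5), height
  return cols, rows


# For an image with shape (100, 200):
#   _example_display_size(50, None, (100, 200))    # (50, 25)
#   _example_display_size(None, 64, (100, 200))    # (128, 64)
#   _example_display_size(None, None, (100, 200))  # (200, 100)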
 951
 952def _ensure_mapped_to_rgb(
 953    image: _ArrayLike,
 954    *,
 955    vmin: float | None = None,
 956    vmax: float | None = None,
 957    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
 958) -> _NDArray:
 959  """Ensure image is mapped to RGB."""
 960  image = _as_valid_media_array(image)
 961  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
 962    raise ValueError(
 963        f'Image with shape {image.shape} is neither a 2D array'
 964        ' nor a 3D array with 1, 3, or 4 channels.'
 965    )
 966  if image.ndim == 3 and image.shape[2] == 1:
 967    image = image[:, :, 0]
 968  if image.ndim == 2:
 969    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
 970  return image
 971
 972
 973def show_image(
 974    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
 975) -> str | None:
 976  """Displays an image in the notebook and optionally saves it to a file.
 977
 978  See `show_images`.
 979
 980  >>> show_image(np.random.rand(100, 100))
 981  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
 982  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
 983  >>> show_image(read_image('/tmp/image.png'))
 984  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
 985  >>> show_image(read_image(url))
 986
 987  Args:
 988    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
 989    title: Optional text shown centered above the image.
 990    **kwargs: See `show_images`.
 991
 992  Returns:
 993    html string if `return_html` is `True`.
 994  """
 995  return show_images([np.asarray(image)], [title], **kwargs)
 996
 997
 998def show_images(
 999    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
1000    titles: Iterable[str | None] | None = None,
1001    *,
1002    width: int | None = None,
1003    height: int | None = None,
1004    downsample: bool = True,
1005    columns: int | None = None,
1006    vmin: float | None = None,
1007    vmax: float | None = None,
1008    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1009    border: bool | str = False,
1010    ylabel: str = '',
1011    html_class: str = 'show_images',
1012    pixelated: bool | None = None,
1013    return_html: bool = False,
1014) -> str | None:
1015  """Displays a row of images in the IPython/Jupyter notebook.
1016
1017  If a directory has been specified using `set_show_save_dir`, also saves each
1018  titled image to a file in that directory based on its title.
1019
1020  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1021  >>> show_images([image1, image2])
1022  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
1023  >>> show_images([image1, image2] * 5, columns=4, border=True)
1024
1025  Args:
1026    images: Iterable of images, or dictionary of `{title: image}`.  Each image
1027      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
1028    titles: Optional strings shown above the corresponding images.
1029    width: Optional, overrides displayed width (in pixels).
1030    height: Optional, overrides displayed height (in pixels).
1031    downsample: If True, each image whose width or height is greater than the
1032      specified `width` or `height` is resampled to the display resolution. This
1033      improves antialiasing and reduces the size of the notebook.
1034    columns: Optional, maximum number of images per row.
1035    vmin: For single-channel image, explicit min value for display.
1036    vmax: For single-channel image, explicit max value for display.
1037    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1038      3D color.
1039    border: If `bool`, whether to place a black boundary around the image, or if
1040      `str`, the boundary CSS style.
1041    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
1042    html_class: CSS class name used in definition of HTML element.
1043    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
1044      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
1045      only on images for which `width` or `height` introduces magnification.
1046    return_html: If `True`, return the raw HTML `str` instead of displaying it.
1047
1048  Returns:
1049    html string if `return_html` is `True`.
1050  """
1051  if isinstance(images, Mapping):
1052    if titles is not None:
1053      raise ValueError('Cannot have images dictionary and titles parameter.')
1054    list_titles, list_images = list(images.keys()), list(images.values())
1055  else:
1056    list_images = list(images)
1057    list_titles = [None] * len(list_images) if titles is None else list(titles)
1058    if len(list_images) != len(list_titles):
1059      raise ValueError(
1060          'Number of images does not match number of titles'
1061          f' ({len(list_images)} vs {len(list_titles)}).'
1062      )
1063
1064  list_images = [
1065      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1066      for image in list_images
1067  ]
1068
1069  def maybe_downsample(image: _NDArray) -> _NDArray:
1070    shape: tuple[int, int] = image.shape[:2]  # type: ignore[assignment]
1071    w, h = _get_width_height(width, height, shape)
1072    if w < shape[1] or h < shape[0]:
1073      image = resize_image(image, (h, w))
1074    return image
1075
1076  if downsample:
1077    list_images = [maybe_downsample(image) for image in list_images]
1078  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1079
1080  for title, png_data in zip(list_titles, png_datas):
1081    if title is not None and _config.show_save_dir:
1082      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
1083      with _open(path, mode='wb') as f:
1084        f.write(png_data)
1085
1086  def html_from_compressed_images() -> str:
1087    html_strings = []
1088    for image, title, png_data in zip(list_images, list_titles, png_datas):
1089      w, h = _get_width_height(width, height, image.shape[:2])
1090      magnified = h > image.shape[0] or w > image.shape[1]
1091      pixelated2 = pixelated if pixelated is not None else magnified
1092      html_strings.append(
1093          html_from_compressed_image(
1094              png_data, w, h, title=title, border=border, pixelated=pixelated2
1095          )
1096      )
1097    # Create single-row tables each with no more than 'columns' elements.
1098    table_strings = []
1099    for row_html_strings in _chunked(html_strings, columns):
1100      td = '<td style="padding:1px;">'
1101      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
1102      if ylabel:
1103        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
1104        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
1105      table_strings.append(
1106          f'<table class="{html_class}"'
1107          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
1108      )
1109    return ''.join(table_strings)
1110
1111  s = html_from_compressed_images()
1112  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
1113    list_images = [image[::2, ::2] for image in list_images]
1114    png_datas = [compress_image(to_uint8(image)) for image in list_images]
1115    s = html_from_compressed_images()
1116  if return_html:
1117    return s
1118  _display_html(s)
1119  return None
1120
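# Illustrative sketch (not part of mediapy): how a list of per-image HTML
# strings can be split into rows of at most `columns` entries, as
# `show_images` does before emitting one single-row table per chunk.  The
# library's own `_chunked` helper is defined elsewhere in this module and may
# differ; `_example_chunked` is a hypothetical stand-in that assumes
# `n=None` means "everything in one row".
import itertools


def _example_chunked(items, n=None):
  """Yields successive tuples of at most `n` items (all items if n is None)."""
  iterator = iter(items)
  while True:
    chunk = tuple(itertools.islice(iterator, n))
    if not chunk:
      return
    yield chunk


# With ten images and columns=4, this produces rows of 4, 4, and 2 images.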
1121
1122def compare_images(
1123    images: Iterable[_ArrayLike],
1124    *,
1125    vmin: float | None = None,
1126    vmax: float | None = None,
1127    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
1128) -> None:
1129  """Compare two images using an interactive slider.
1130
1131  Displays an HTML slider component to interactively swipe between two images.
1132  The slider functionality requires that the web browser have Internet access.
1133  See additional info in `https://github.com/sneas/img-comparison-slider`.
1134
1135  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
1136  >>> compare_images([image1, image2])
1137
1138  Args:
1139    images: Iterable of images.  Each image must be either a 2D array or a 3D
1140      array with 1, 3, or 4 channels.  There must be exactly two images.
1141    vmin: For single-channel image, explicit min value for display.
1142    vmax: For single-channel image, explicit max value for display.
1143    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
1144      3D color.
1145  """
1146  list_images = [
1147      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
1148      for image in images
1149  ]
1150  if len(list_images) != 2:
1151    raise ValueError('The number of images must be 2.')
1152  png_datas = [compress_image(to_uint8(image)) for image in list_images]
1153  b64_1, b64_2 = [
1154      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
1155  ]
1156  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
1157  _display_html(s)
1158
1159
1160# ** Video I/O.
1161
1162
1163def _filename_suffix_from_codec(codec: str) -> str:
1164  return '.gif' if codec == 'gif' else '.mp4'
1165
1166
1167def _get_ffmpeg_path() -> str:
1168  path = _search_for_ffmpeg_path()
1169  if not path:
1170    raise RuntimeError(
1171        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
1172        " perhaps install ffmpeg using 'apt install ffmpeg'."
1173    )
1174  return path
1175
1176
1177def video_is_available() -> bool:
1178  """Returns True if the program `ffmpeg` is found.
1179
1180  See also `set_ffmpeg`.
1181  """
1182  return _search_for_ffmpeg_path() is not None
1183
1184
1185class VideoMetadata(NamedTuple):
1186  """Represents the data stored in a video container header.
1187
1188  Attributes:
1189    num_images: Number of frames expected from the video stream.  This is
1190      estimated from the framerate and the duration stored in the video
1191      header, so it might be inexact.  The value is -1 if the number of
1192      frames is not found in the header.
1193    shape: The dimensions (height, width) of each video frame.
1194    fps: The framerate in frames per second.
1195    bps: The estimated bitrate of the video stream in bits per second, retrieved
1196      from the video header.
1197  """
1198
1199  num_images: int
1200  shape: tuple[int, int]
1201  fps: float
1202  bps: int | None
1203
1204
1205def _get_video_metadata(path: _Path) -> VideoMetadata:
1206  """Returns attributes of video stored in the specified local file."""
1207  if not pathlib.Path(path).is_file():
1208    raise RuntimeError(f"Video file '{path}' is not found.")
1209  command = [
1210      _get_ffmpeg_path(),
1211      '-nostdin',
1212      '-i',
1213      str(path),
1214      '-acodec',
1215      'copy',
1216      '-vcodec',
1217      'copy',
1218      '-f',
1219      'null',
1220      '-',
1221  ]
1222  with subprocess.Popen(
1223      command, stderr=subprocess.PIPE, encoding='utf-8'
1224  ) as proc:
1225    _, err = proc.communicate()
1226  bps = fps = num_images = width = height = rotation = None
1227  for line in err.split('\n'):
1228    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
1229      bps = round(float(match.group(1)) * 1000)
1230    if matches := re.findall(r'frame= *(\d+) ', line):
1231      num_images = int(matches[-1])
1232    if 'Stream #0:' in line and ': Video:' in line:
1233      if not (match := re.search(r', (\d+)x(\d+)', line)):
1234        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
1235      width, height = int(match.group(1)), int(match.group(2))
1236      if match := re.search(r', ([\d.]+) fps', line):
1237        fps = float(match.group(1))
1238      elif str(path).endswith('.gif'):
1239        # Some GIF files lack a framerate attribute; use a reasonable default.
1240        fps = 10
1241      else:
1242        raise RuntimeError(f'Unable to parse video framerate in line {line}')
1243    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
1244      rotation = int(match.group(1))
1245    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
1246      rotation = int(match.group(1))
1247  if not num_images:
1248    num_images = -1
1249  if not width:
1250    raise RuntimeError(f'Unable to parse video header: {err}')
1251  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
1252  if rotation in (90, 270, -90, -270):
1253    width, height = height, width
1254  assert height is not None and width is not None
1255  shape = height, width
1256  assert fps is not None
1257  return VideoMetadata(num_images, shape, fps, bps)
1258
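# Illustrative sketch (not part of mediapy): applying the same regular
# expressions as `_get_video_metadata` to a hand-written log.  The text in
# `_EXAMPLE_FFMPEG_LOG` is a made-up approximation of ffmpeg's stderr, not
# captured output, and `_example_parse_ffmpeg_log` is a hypothetical name.
# The sketch omits the rotation handling and the GIF framerate fallback.
import re

_EXAMPLE_FFMPEG_LOG = """\
  Duration: 00:00:02.00, start: 0.000000, bitrate: 1234 kb/s
  Stream #0:0(und): Video: h264 (High), yuv420p, 640x480, 1200 kb/s, 30 fps
frame=   60 fps=0.0 q=-1.0 Lsize=N/A time=00:00:02.00 bitrate=N/A
"""


def _example_parse_ffmpeg_log(log):
  """Returns a dict with the bps, num_images, shape, and fps found in `log`."""
  info = {'bps': None, 'num_images': -1, 'shape': None, 'fps': None}
  for line in log.split('\n'):
    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
      info['bps'] = round(float(match.group(1)) * 1000)  # kb/s to bits/s.
    if matches := re.findall(r'frame= *(\d+) ', line):
      info['num_images'] = int(matches[-1])  # Keep the last progress report.
    if 'Stream #0:' in line and ': Video:' in line:
      if match := re.search(r', (\d+)x(\d+)', line):
        # The log reports WIDTHxHEIGHT; the shape is (height, width).
        info['shape'] = int(match.group(2)), int(match.group(1))
      if match := re.search(r', ([\d.]+) fps', line):
        info['fps'] = float(match.group(1))
  return info


# On the sample log above, this finds 60 frames of shape (480, 640) at
# 30.0 frames/sec and 1234000 bits/sec.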
1259
1260class _VideoIO:
1261  """Base class for `VideoReader` and `VideoWriter`."""
1262
1263  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
1264    """Returns ffmpeg pix_fmt given data type and image format."""
1265    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
1266    return {
1267        np.uint8: {
1268            'rgb': 'rgb24',
1269            'yuv': 'yuv444p',
1270            'gray': 'gray',
1271        },
1272        np.uint16: {
1273            'rgb': 'rgb48' + native_endian_suffix,
1274            'yuv': 'yuv444p16' + native_endian_suffix,
1275            'gray': 'gray16' + native_endian_suffix,
1276        },
1277    }[dtype.type][image_format]
1278
1279
1280class VideoReader(_VideoIO):
1281  """Context to read a compressed video as an iterable over its images.
1282
1283  >>> with VideoReader('/tmp/river.mp4') as reader:
1284  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
1285  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
1286  ...   for image in reader:
1287  ...     print(image.shape)
1288
1289  >>> with VideoReader('/tmp/river.mp4') as reader:
1290  ...   video = np.array(tuple(reader))
1291
1292  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1293  >>> with VideoReader(url) as reader:
1294  ...   show_video(reader)
1295
1296  Attributes:
1297    path_or_url: Location of input video.
1298    output_format: Format of output images (default 'rgb').  If 'rgb', each
1299      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
1300      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
1301      image has shape=(height, width).
1302    dtype: Data type for output images.  The default is `np.uint8`.  Use of
1303      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
1304    metadata: Object storing the information retrieved from the video header.
1305      Its attributes are copied as attributes in this class.
1306    num_images: Number of frames that is expected from the video stream.  This
1307      is estimated from the framerate and the duration stored in the video
1308      header, so it might be inexact.
1309    shape: The dimensions (height, width) of each video frame.
1310    fps: The framerate in frames per second.
1311    bps: The estimated bitrate of the video stream in bits per second, retrieved
1312      from the video header.
1313  """
1314
1315  path_or_url: _Path
1316  output_format: str
1317  dtype: _DType
1318  metadata: VideoMetadata
1319  num_images: int
1320  shape: tuple[int, int]
1321  fps: float
1322  bps: int | None
1323  _num_bytes_per_image: int
1324
1325  def __init__(
1326      self,
1327      path_or_url: _Path,
1328      *,
1329      output_format: str = 'rgb',
1330      dtype: _DTypeLike = np.uint8,
1331  ):
1332    if output_format not in {'rgb', 'yuv', 'gray'}:
1333      raise ValueError(
1334          f'Output format {output_format} is not rgb, yuv, or gray.'
1335      )
1336    self.path_or_url = path_or_url
1337    self.output_format = output_format
1338    self.dtype = np.dtype(dtype)
1339    if self.dtype.type not in (np.uint8, np.uint16):
1340      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1341    self._read_via_local_file: Any = None
1342    self._popen: subprocess.Popen[bytes] | None = None
1343    self._proc: subprocess.Popen[bytes] | None = None
1344
1345  def __enter__(self) -> 'VideoReader':
1346    ffmpeg_path = _get_ffmpeg_path()
1347    try:
1348      self._read_via_local_file = _read_via_local_file(self.path_or_url)
1349      # pylint: disable-next=no-member
1350      tmp_name = self._read_via_local_file.__enter__()
1351
1352      self.metadata = _get_video_metadata(tmp_name)
1353      self.num_images, self.shape, self.fps, self.bps = self.metadata
1354      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
1355      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
1356      bytes_per_channel = self.dtype.itemsize
1357      self._num_bytes_per_image = (
1358          math.prod(self.shape) * num_channels * bytes_per_channel
1359      )
1360
1361      command = [
1362          ffmpeg_path,
1363          '-v',
1364          'panic',
1365          '-nostdin',
1366          '-i',
1367          tmp_name,
1368          '-vcodec',
1369          'rawvideo',
1370          '-f',
1371          'image2pipe',
1372          '-pix_fmt',
1373          pix_fmt,
1374          '-vsync',
1375          'vfr',
1376          '-',
1377      ]
1378      self._popen = subprocess.Popen(
1379          command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
1380      )
1381      self._proc = self._popen.__enter__()
1382    except Exception:
1383      self.__exit__(None, None, None)
1384      raise
1385    return self
1386
1387  def __exit__(self, *_: Any) -> None:
1388    self.close()
1389
1390  def read(self) -> _NDArray | None:
1391    """Reads a video image frame (or None if at end of file).
1392
1393    Returns:
1394      A numpy array in the format specified by `output_format`, i.e., a 3D
1395      array with 3 color channels, except for format 'gray' which is 2D.
1396    """
1397    assert self._proc, 'Error: reading from an already closed context.'
1398    stdout = self._proc.stdout
1399    assert stdout is not None
1400    data = stdout.read(self._num_bytes_per_image)
1401    if not data:  # Due to either end-of-file or subprocess error.
1402      self.close()  # Raises exception if subprocess had error.
1403      return None  # To indicate end-of-file.
1404    assert len(data) == self._num_bytes_per_image
1405    image = np.frombuffer(data, dtype=self.dtype)
1406    if self.output_format == 'rgb':
1407      image = image.reshape(*self.shape, 3)
1408    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
1409      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
1410    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
1411      image = image.reshape(*self.shape)
1412    else:
1413      raise AssertionError
1414    return image
1415
1416  def __iter__(self) -> Iterator[_NDArray]:
1417    while True:
1418      image = self.read()
1419      if image is None:
1420        return
1421      yield image
1422
1423  def close(self) -> None:
1424    """Terminates video reader.  (Called automatically at end of context.)"""
1425    if self._popen:
1426      self._popen.__exit__(None, None, None)
1427      self._popen = None
1428      self._proc = None
1429    if self._read_via_local_file:
1430      # pylint: disable-next=no-member
1431      self._read_via_local_file.__exit__(None, None, None)
1432      self._read_via_local_file = None
1433
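# Illustrative sketch (not part of mediapy): the size of one raw frame that
# `VideoReader.read` requests from the ffmpeg pipe, following the computation
# in `VideoReader.__enter__`.  The name `_example_num_bytes_per_image` is
# hypothetical; `bytes_per_channel` is 1 for np.uint8 and 2 for np.uint16.
import math


def _example_num_bytes_per_image(shape, output_format='rgb', bytes_per_channel=1):
  """Returns the number of bytes in one raw frame of the given format."""
  num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[output_format]
  return math.prod(shape) * num_channels * bytes_per_channel


# A 480x640 'rgb' frame of uint8 occupies 480 * 640 * 3 = 921600 bytes, and a
# 480x640 'gray' frame of uint16 occupies 480 * 640 * 1 * 2 = 614400 bytes.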
1434
1435class VideoWriter(_VideoIO):
1436  """Context to write a compressed video.
1437
1438  >>> shape = 480, 640
1439  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
1440  ...   for image in moving_circle(shape, num_images=60):
1441  ...     writer.add_image(image)
1442  >>> show_video(read_video('/tmp/v.mp4'))
1443
1444
1445  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
1446  If none are specified, `qp` is set to a default value.
1447  See https://slhck.info/video/2017/03/01/rate-control.html
1448
1449  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
1450  ignored.
1451
1452  Attributes:
1453    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
1454      format.  The suffix must be '.gif' if the codec is 'gif'.
1455    shape: 2D spatial dimensions (height, width) of video image frames.  The
1456      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
1457      'yuv420p' or 'yuv420p10le').
1458    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
1459      'hevc', 'vp9', or 'gif').
1460    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
1461      used if not specified as explicit parameters.
1462    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
1463    bps: Requested average bits-per-second bitrate (default None).
1464    qp: Quantization parameter for video compression quality (default None).
1465    crf: Constant rate factor for video compression quality (default None).
1466    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
1467      introduce I-frames, or '-bf 0' to omit B-frames.
1468    input_format: Format of input images (default 'rgb').  If 'rgb', each image
1469      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
1470      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
1471      shape=(height, width).
1472    dtype: Expected data type for input images (any float input images are
1473      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
1474      necessary when encoding >8 bits/channel.
1475    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
1476      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
1477      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
1478      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
1479  """
1480
1481  def __init__(
1482      self,
1483      path: _Path,
1484      shape: tuple[int, int],
1485      *,
1486      codec: str = 'h264',
1487      metadata: VideoMetadata | None = None,
1488      fps: float | None = None,
1489      bps: int | None = None,
1490      qp: int | None = None,
1491      crf: float | None = None,
1492      ffmpeg_args: str | Sequence[str] = '',
1493      input_format: str = 'rgb',
1494      dtype: _DTypeLike = np.uint8,
1495      encoded_format: str | None = None,
1496  ) -> None:
1497    _check_2d_shape(shape)
1498    if fps is None and metadata:
1499      fps = metadata.fps
1500    if fps is None:
1501      fps = 25.0 if codec == 'gif' else 60.0
1502    if fps <= 0.0:
1503      raise ValueError(f'Frames-per-second value {fps} is invalid.')
1504    if bps is None and metadata:
1505      bps = metadata.bps
1506    bps = int(bps) if bps is not None else None
1507    if bps is not None and bps <= 0:
1508      raise ValueError(f'Bitrate value {bps} is invalid.')
1509    if qp is not None and (not isinstance(qp, int) or qp <= 0):
1510      raise ValueError(
1511          f'Quantization parameter {qp} is not a positive integer.'
1512      )
1513    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
1514    if num_rate_specifications > 1:
1515      raise ValueError(
1516          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
1517      )
1518    ffmpeg_args = (
1519        shlex.split(ffmpeg_args)
1520        if isinstance(ffmpeg_args, str)
1521        else list(ffmpeg_args)
1522    )
1523    if input_format not in {'rgb', 'yuv', 'gray'}:
1524      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
1525    dtype = np.dtype(dtype)
1526    if dtype.type not in (np.uint8, np.uint16):
1527      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
1528    self.path = pathlib.Path(path)
1529    self.shape = shape
1530    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
1531    if encoded_format is None:
1532      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
1533    if not all_dimensions_are_even and encoded_format.startswith(
1534        ('yuv42', 'yuvj42')
1535    ):
1536      raise ValueError(
1537          f'With encoded_format {encoded_format}, video dimensions must be'
1538          f' even, but shape is {shape}.'
1539      )
1540    self.fps = fps
1541    self.codec = codec
1542    self.bps = bps
1543    self.qp = qp
1544    self.crf = crf
1545    self.ffmpeg_args = ffmpeg_args
1546    self.input_format = input_format
1547    self.dtype = dtype
1548    self.encoded_format = encoded_format
1549    if num_rate_specifications == 0 and not ffmpeg_args:
1550      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
1551    self._bitrate_args = (
1552        (['-vb', f'{bps}'] if bps is not None else [])
1553        + (['-qp', f'{qp}'] if qp is not None else [])
1554        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
1555    )
1556    if self.codec == 'gif':
1557      if self.path.suffix != '.gif':
1558        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
1559      self.encoded_format = 'pal8'
1560      self._bitrate_args = []
1561      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
1562      # Less common (and likely less useful) is a per-frame color palette:
1563      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
1564      #                 '[s1][p]paletteuse=new=1')
1565      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
1566    self._write_via_local_file: Any = None
1567    self._popen: subprocess.Popen[bytes] | None = None
1568    self._proc: subprocess.Popen[bytes] | None = None
1569
1570  def __enter__(self) -> 'VideoWriter':
1571    ffmpeg_path = _get_ffmpeg_path()
1572    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
1573    try:
1574      self._write_via_local_file = _write_via_local_file(self.path)
1575      # pylint: disable-next=no-member
1576      tmp_name = self._write_via_local_file.__enter__()
1577
1578      # Writing to stdout using ('-f', 'mp4', '-') would require
1579      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
1580      height, width = self.shape
1581      command = (
1582          [
1583              ffmpeg_path,
1584              '-v',
1585              'error',
1586              '-f',
1587              'rawvideo',
1588              '-vcodec',
1589              'rawvideo',
1590              '-pix_fmt',
1591              input_pix_fmt,
1592              '-s',
1593              f'{width}x{height}',
1594              '-r',
1595              f'{self.fps}',
1596              '-i',
1597              '-',
1598              '-an',
1599              '-vcodec',
1600              self.codec,
1601              '-pix_fmt',
1602              self.encoded_format,
1603          ]
1604          + self._bitrate_args
1605          + self.ffmpeg_args
1606          + ['-y', tmp_name]
1607      )
1608      self._popen = subprocess.Popen(
1609          command, stdin=subprocess.PIPE, stderr=subprocess.PIPE
1610      )
1611      self._proc = self._popen.__enter__()
1612    except Exception:
1613      self.__exit__(None, None, None)
1614      raise
1615    return self
1616
1617  def __exit__(self, *_: Any) -> None:
1618    self.close()
1619
1620  def add_image(self, image: _NDArray) -> None:
1621    """Writes a video frame.
1622
1623    Args:
1624      image: Array whose dtype and first two dimensions must match the `dtype`
1625        and `shape` specified in `VideoWriter` initialization.  If
1626        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
1627        input_format, the image may be either 2D (interpreted as grayscale) or
1628        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
1629        must be 3D with three (Y, U, V) channels.
1630
1631    Raises:
1632      RuntimeError: If there is an error writing to the output file.
1633    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode()
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video.  (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode()
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None


class _VideoArray(npt.NDArray[Any]):
  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""

  metadata: VideoMetadata | None

  def __new__(
      cls: Type['_VideoArray'],
      input_array: _NDArray,
      metadata: VideoMetadata | None = None,
  ) -> '_VideoArray':
    obj: _VideoArray = np.asarray(input_array).view(cls)
    obj.metadata = metadata
    return obj

  def __array_finalize__(self, obj: Any) -> None:
    if obj is None:
      return
    self.metadata = getattr(obj, 'metadata', None)


def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)


def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape: tuple[int, int] = first_image.shape[:2]  # type: ignore[assignment]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)


def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'mp4' except when `codec` is 'gif'.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()


def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)


def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s


def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)


def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
      must be an iterable of images.  If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 nor gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None


# Local Variables:
# fill-column: 80
# End:
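
The embedding step used by html_from_compressed_video in the listing above can be illustrated in isolation. The following is a minimal sketch rather than the library code: the helper name video_tag_sketch and the placeholder bytes are hypothetical, and it relies only on the standard base64 module. It shows how the base64 text of the MP4 bytes becomes a data: URI inside a <video> element, and that 'muted' is emitted together with 'autoplay', as in the listing.

```python
import base64


def video_tag_sketch(data: bytes, width: int, height: int, *,
                     loop: bool = True, autoplay: bool = True) -> str:
  """Hypothetical helper: wraps MP4 bytes in a self-contained <video> tag."""
  b64 = base64.b64encode(data).decode('utf-8')
  options = f'controls width="{width}" height="{height}"'
  if loop:
    options += ' loop'
  if autoplay:
    # The listing above emits 'muted' together with 'autoplay'.
    options += ' autoplay muted'
  return (
      f'<video {options}>'
      f'<source src="data:video/mp4;base64,{b64}" type="video/mp4"/>'
      '</video>'
  )


# Placeholder bytes, not a playable video; only the HTML structure matters here.
html = video_tag_sketch(b'not-a-real-mp4', 64, 48)
print(html[:60])
```

Because the video bytes are inlined, the resulting HTML needs no separate file, at the cost of a string roughly one third larger than the compressed data.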
def show_image(image: ArrayLike, *, title: str | None = None, **kwargs: Any) -> str | None:
def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)

Displays an image in the notebook and optionally saves it to a file.

See show_images.

>>> show_image(np.random.rand(100, 100))
>>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
>>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
>>> show_image(read_image('/tmp/image.png'))
>>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
>>> show_image(read_image(url))
Arguments:
  • image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
  • title: Optional text shown centered above the image.
  • **kwargs: See show_images.
Returns:

HTML string if return_html is True; otherwise None.

def show_images(images: Iterable[ArrayLike] | Mapping[str, ArrayLike], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray', border: bool | str = False, ylabel: str = '', html_class: str = 'show_images', pixelated: bool | None = None, return_html: bool = False) -> str | None:
def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`.  Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape: tuple[int, int] = image.shape[:2]  # type: ignore[assignment]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None

Displays a row of images in the IPython/Jupyter notebook.

If a directory has been specified using set_show_save_dir, also saves each titled image to a file in that directory based on its title.
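
How a title maps to a file name can be sketched as follows. In the source, the path is built as the save directory joined with f'{title}.png', and untitled images are skipped. The snippet reproduces only that naming rule with a temporary directory and placeholder bytes; the function name save_titled_png is hypothetical, and it writes with pathlib rather than the library's own file-opening helper.

```python
import pathlib
import tempfile


def save_titled_png(save_dir, title, png_data):
  """Hypothetical helper mirroring the '<save_dir>/<title>.png' naming rule."""
  if title is None or not save_dir:
    return None  # Untitled images are not saved.
  path = pathlib.Path(save_dir) / f'{title}.png'
  path.write_bytes(png_data)
  return path


with tempfile.TemporaryDirectory() as directory_name:
  saved = save_titled_png(directory_name, 'color ramp', b'placeholder bytes')
  skipped = save_titled_png(directory_name, None, b'placeholder bytes')
  saved_name = saved.name
  saved_exists = saved.exists()

print(saved_name, saved_exists, skipped)
```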

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> show_images([image1, image2])
>>> show_images({'random image': image1, 'color ramp': image2}, height=128)
>>> show_images([image1, image2] * 5, columns=4, border=True)
Arguments:
  • images: Iterable of images, or dictionary of {title: image}. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels.
  • titles: Optional strings shown above the corresponding images.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each image whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of images per row.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if False, sets 'image-rendering: auto'; if None, uses pixelated rendering only on images for which width or height introduces magnification.
  • return_html: If True return the raw HTML str instead of displaying.
Returns:

HTML string if return_html is True; otherwise None.
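
The three-way pixelated option resolves to a plain boolean for each image. The sketch below restates that rule from the source listing as a standalone function. The name resolve_pixelated is hypothetical, and the display size (w, h) is passed in directly rather than derived from width and height, since that derivation is not shown here.

```python
def resolve_pixelated(pixelated, image_shape, display_size):
  """Hypothetical helper: returns the effective pixelated flag for one image.

  image_shape is (height, width) of the image data; display_size is (w, h)
  in pixels, following the argument order used in the listing.
  """
  w, h = display_size
  magnified = h > image_shape[0] or w > image_shape[1]
  return pixelated if pixelated is not None else magnified


print(resolve_pixelated(None, (32, 32), (64, 64)))   # Displayed larger than the data.
print(resolve_pixelated(None, (32, 32), (32, 32)))   # Displayed at native size.
print(resolve_pixelated(False, (32, 32), (64, 64)))  # Explicit value takes priority.
```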

def compare_images(images: Iterable[ArrayLike], *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> None:
def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images.  Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels.  There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)

Compare two images using an interactive slider.

Displays an HTML slider component to interactively swipe between two images. The slider functionality requires that the web browser have Internet access. See additional info in https://github.com/sneas/img-comparison-slider.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
  • images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
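
The final step of compare_images fills two placeholders in an HTML template with base64-encoded PNG data. A minimal sketch of that substitution follows. TEMPLATE is a hypothetical stand-in for the library's template, which is not reproduced here, and the byte strings are placeholders rather than real PNG data.

```python
import base64

# Hypothetical template; the real one is not reproduced in this document.
TEMPLATE = (
    '<style>.slider {width: 100%;}</style>'
    '<img src="data:image/png;base64,{b64_1}"/>'
    '<img src="data:image/png;base64,{b64_2}"/>'
)

png_datas = [b'first placeholder', b'second placeholder']
if len(png_datas) != 2:
  raise ValueError('The number of images must be 2.')
b64_1, b64_2 = [base64.b64encode(d).decode('utf-8') for d in png_datas]
# Literal replacement only touches the two named placeholders.
html = TEMPLATE.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
print(html)
```

Using str.replace on the literal placeholder text leaves any other braces in the template, such as the CSS rule in this stand-in, untouched.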
def show_video(images: Iterable[np.ndarray], *, title: str | None = None, **kwargs: Any) -> str | None:

Displays a video in the IPython notebook and optionally saves it to a file.

See show_videos.

>>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
>>> show_video(video, title='River video')
>>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
>>> show_video(read_video('/tmp/river.mp4'))
Arguments:
  • images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D arrays).
  • title: Optional text shown centered above the video.
  • **kwargs: See show_videos.
Returns:

HTML string if return_html is True; otherwise None.
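
show_video forwards a one-element list of videos and a one-element list of titles to show_videos. The pairing rules applied there (a mapping supplies its own titles; otherwise titles default to None and must match the videos in count) can be sketched as below. The function name pair_titles is hypothetical, the error messages are shortened, and plain strings stand in for videos.

```python
from collections.abc import Mapping


def pair_titles(videos, titles=None):
  """Hypothetical helper: returns (list_titles, list_videos)."""
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError('Cannot have both a video dictionary and titles.')
    return list(videos.keys()), list(videos.values())
  list_videos = list(videos)
  list_titles = [None] * len(list_videos) if titles is None else list(titles)
  if len(list_videos) != len(list_titles):
    raise ValueError('Number of videos does not match number of titles.')
  return list_titles, list_videos


print(pair_titles(['video_a'], ['River video']))
print(pair_titles({'left': 'video_a', 'right': 'video_b'}))
```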

def show_videos(videos: Iterable[Iterable[np.ndarray]] | Mapping[str, Iterable[np.ndarray]], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, codec: str = 'h264', ylabel: str = '', html_class: str = 'show_videos', return_html: bool = False, **kwargs: Any) -> str | None:

Displays a row of videos in the IPython notebook.

Creates HTML with <video> tags containing embedded H264-encoded bytestrings. If codec is set to 'gif', we instead use <img> tags containing embedded GIF-encoded bytestrings. Note that the resulting GIF animations skip frames when the fps period is not a multiple of 10 ms units (GIF frame delay units). Encoding at fps = 20.0, 25.0, or 50.0 works fine.
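The GIF timing constraint above can be checked before encoding. The following is a minimal sketch, not part of mediapy: the helper name `gif_delay_is_exact` is hypothetical, and it only tests whether the frame period 1/fps is a whole number of 10 ms delay units.

```python
from fractions import Fraction

GIF_DELAY_UNIT_MS = 10  # GIF stores each frame delay in units of 10 ms.


def gif_delay_is_exact(fps: float) -> bool:
    """Returns True if the frame period 1/fps is a whole number of 10 ms units."""
    period_ms = Fraction(1000) / Fraction(fps)
    return (period_ms / GIF_DELAY_UNIT_MS).denominator == 1


# Hypothetical helper: report which framerates map exactly onto GIF delays.
for fps in (20.0, 25.0, 50.0, 30.0, 60.0):
    print(fps, gif_delay_is_exact(fps))
```

Framerates such as 30.0 or 60.0 fail this check, which is consistent with the note that such GIF animations may skip frames.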

If a directory has been specified using set_show_save_dir, also saves each titled video to a file in that directory based on its title.

Arguments:
  • videos: Iterable of videos, or dictionary of {title: video}. Each video must be an iterable of images. If a video object has a metadata (VideoMetadata) attribute, its fps field provides a default framerate.
  • titles: Optional strings shown above the corresponding videos.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each video whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of videos per row.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
  • bps: Bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • codec: Compression algorithm; must be either 'h264' or 'gif'.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • return_html: If True, return the raw HTML str instead of displaying it.
  • **kwargs: Additional parameters (border, loop, autoplay) for html_from_compressed_video.
Returns:

html string if return_html is True.
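The `columns` parameter splits the videos into rows of bounded length. The sketch below illustrates that grouping with a hypothetical `chunked` function standing in for the private `_chunked` helper used in the listing; treating `columns=None` as "everything in one row" is an assumption.

```python
from typing import Iterator, Optional, Sequence, TypeVar

T = TypeVar('T')


def chunked(
    items: Sequence[T], columns: Optional[int] = None
) -> Iterator[Sequence[T]]:
    """Yields successive rows of at most `columns` items (one row if None)."""
    if columns is None:
        columns = max(len(items), 1)  # Assumption: None means a single row.
    if columns <= 0:
        raise ValueError(f'columns={columns} must be positive.')
    for start in range(0, len(items), columns):
        yield items[start:start + columns]


titles = ['a', 'b', 'c', 'd', 'e']
rows = list(chunked(titles, columns=2))
print(rows)
```

Each yielded row would correspond to one single-row HTML table in the output.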

def read_image(path_or_url: Union[str, os.PathLike[str]], *, apply_exif_transpose: bool = True, dtype: DTypeLike = None) -> np.ndarray:

def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)

Returns an image read from a file path or URL.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • path_or_url: Path of input file.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.

def write_image(path: Union[str, os.PathLike[str]], image: ArrayLike, fmt: str = 'png', **kwargs: Any) -> None:

def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)

Writes an image to a file.

Encoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

File format is explicitly provided by fmt and not inferred by path.

Arguments:
  • path: Path of output file.
  • image: Array-like object. If its type is float, it is converted to np.uint8 using to_uint8 (thus clamping the input to the range [0.0, 1.0]). Otherwise it must be np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Additional parameters for PIL.Image.save().
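
The float-to-`uint8` conversion mentioned for the `image` argument can be pictured as a clamp followed by a rescale. This is only a sketch: `float_to_uint8` is a hypothetical stand-in for `to_uint8`, and the round-to-nearest step is an assumption about the exact rounding rule.

```python
import numpy as np


def float_to_uint8(image):
    """Clamps a float image to [0.0, 1.0] and rescales it to uint8 [0, 255]."""
    image = np.asarray(image, dtype=np.float64)
    # Assumption: round to nearest by adding 0.5 before truncation.
    return (np.clip(image, 0.0, 1.0) * 255.0 + 0.5).astype(np.uint8)


values = np.array([-0.5, 0.0, 0.5, 1.0, 1.5])
converted = float_to_uint8(values)
print(converted)
```

Out-of-range values are saturated rather than wrapped, so a float image that overshoots 1.0 is written as pure white instead of producing artifacts.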

def read_video(path_or_url: Union[str, os.PathLike[str]], **kwargs: Any) -> mediapy._VideoArray:

def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)

Returns an array containing all images read from a compressed video file.

>>> video = read_video('/tmp/river.mp4')
>>> print(f'The framerate is {video.metadata.fps} frames/s.')
>>> show_video(video)

>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> show_video(read_video(url))

Arguments:
  • path_or_url: Input video file.
  • **kwargs: Additional parameters for VideoReader.
Returns:

A 4D numpy array with dimensions (frame, height, width, channel), or a 3D array if output_format is specified as 'gray'. The returned array has an attribute metadata containing VideoMetadata information. This enables show_video to retrieve the framerate in metadata.fps. Note that the metadata attribute is lost in most subsequent numpy operations.
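
The remark about the metadata attribute being lost can be reproduced with a small `numpy` subclass. The class `ArrayWithMetadata` below is hypothetical and only mimics the idea of an array carrying a `metadata` attribute; it is not the library's `_VideoArray`, and a plain dict stands in for `VideoMetadata`.

```python
import numpy as np


class ArrayWithMetadata(np.ndarray):
    """Hypothetical ndarray subclass that carries a `metadata` attribute."""

    metadata = None  # Class-level default, seen by arrays derived via numpy ops.

    def __new__(cls, input_array, metadata=None):
        obj = np.asarray(input_array).view(cls)
        obj.metadata = metadata
        return obj


video = ArrayWithMetadata(
    np.zeros((2, 4, 4, 3), dtype=np.uint8), metadata={'fps': 30.0}
)
fps = video.metadata['fps']  # Capture the framerate before processing.
darker = video // 2  # The result no longer carries the instance attribute.
print(video.metadata, darker.metadata)
```

Under this sketch, a practical workaround is to read the framerate first and pass it explicitly, e.g. as the `fps` argument of `show_video`, after processing the frames.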

def write_video(path: Union[str, os.PathLike[str]], images: Iterable[np.ndarray], **kwargs: Any) -> None:

def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape: tuple[int, int] = first_image.shape[:2]  # type: ignore[assignment]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)

Writes images to a compressed video file.

>>> video = moving_circle((480, 640), num_images=60)
>>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
>>> show_video(read_video('/tmp/v.mp4'))

Arguments:
  • path: Output video file.
  • images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D arrays.
  • **kwargs: Additional parameters for VideoWriter.
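
The listing for `write_video` picks the writer's `dtype` from the first frame. The rule can be isolated as follows; `encoding_dtype` is a hypothetical name used only for this sketch, which mirrors the branch in the listing rather than calling the library.

```python
import numpy as np


def encoding_dtype(first_image: np.ndarray) -> np.dtype:
    """Returns the dtype handed to the writer, given the first video frame."""
    dtype = first_image.dtype
    if dtype == bool:
        return np.dtype(np.uint8)  # Boolean masks are encoded as 8-bit.
    if np.issubdtype(dtype, np.floating):
        return np.dtype(np.uint16)  # Float frames keep extra precision.
    return dtype  # Integer frames are passed through unchanged.


print(encoding_dtype(np.zeros((4, 4), dtype=bool)))
print(encoding_dtype(np.zeros((4, 4, 3), dtype=np.float32)))
print(encoding_dtype(np.zeros((4, 4, 3), dtype=np.uint8)))
```

Because only the first frame is inspected, this sketch assumes every frame in the iterable shares that frame's type.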

class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    ffmpeg_path = _get_ffmpeg_path()
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          ffmpeg_path,
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = subprocess.Popen(
          command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None

Context to read a compressed video as an iterable over its images.

>>> with VideoReader('/tmp/river.mp4') as reader:
...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
...   for image in reader:
...     print(image.shape)

>>> with VideoReader('/tmp/river.mp4') as reader:
...   video = np.array(tuple(reader))

>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> with VideoReader(url) as reader:
...   show_video(reader)

Attributes:
  • path_or_url: Location of input video.
  • output_format: Format of output images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) with R, G, B values. If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Data type for output images. The default is np.uint8. Use of np.uint16 allows reading 10-bit or 12-bit data without precision loss.
  • metadata: Object storing the information retrieved from the video header. Its attributes are copied as attributes in this class.
  • num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
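
The reader pulls one fixed-size block of raw bytes per frame, so the `shape`, `output_format`, and `dtype` attributes together determine the memory cost of each decoded image. A rough sketch of that arithmetic follows; the function `bytes_per_frame` and its string-keyed tables are illustrative assumptions, not library API.

```python
import math

NUM_CHANNELS = {'rgb': 3, 'yuv': 3, 'gray': 1}
BYTES_PER_CHANNEL = {'uint8': 1, 'uint16': 2}


def bytes_per_frame(shape, output_format='rgb', dtype='uint8'):
    """Returns the number of raw bytes in one decoded frame."""
    return (
        math.prod(shape)
        * NUM_CHANNELS[output_format]
        * BYTES_PER_CHANNEL[dtype]
    )


print(bytes_per_frame((480, 640)))
print(bytes_per_frame((480, 640), output_format='gray', dtype='uint16'))
```

Multiplying the per-frame size by `num_images` gives an estimate of the memory needed to hold a whole video as a single array.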

VideoReader(path_or_url: Union[str, os.PathLike[str]], *, output_format: str = 'rgb', dtype: DTypeLike = np.uint8)
path_or_url: Union[str, os.PathLike[str]]
output_format: str
dtype: _DType
metadata: VideoMetadata
num_images: int
shape: tuple[int, int]
fps: float
bps: int | None
def read(self) -> Optional[np.ndarray]:

Reads a video image frame (or None if at end of file).

Returns:

A numpy array in the format specified by output_format, i.e., a 3D array with 3 color channels, except for format 'gray' which is 2D.
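
For the 'yuv' format, the raw data arrives plane by plane and is rearranged so that each pixel holds its own (Y, U, V) triple. The sketch below reproduces that `reshape` plus `np.moveaxis` step on a tiny synthetic buffer; the 2x2 frame and its byte values are made up for illustration.

```python
import numpy as np

height, width = 2, 2
# Hypothetical raw planar frame: all Y samples, then all U, then all V.
planar = np.arange(3 * height * width, dtype=np.uint8)
raw_bytes = planar.tobytes()

flat = np.frombuffer(raw_bytes, dtype=np.uint8)
# Interpret the buffer as (plane, row, column), then move planes to the end.
image = np.moveaxis(flat.reshape(3, height, width), 0, 2)
print(image.shape)
print(image[0, 0])
```

The top-left pixel ends up with the first sample of each plane, which is the per-pixel layout described for `output_format='yuv'`.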

def close(self) -> None:

Terminates video reader. (Called automatically at end of context.)

class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
      format.  The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames.  The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb').  If 'rgb', each image
      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp <= 0):
      raise ValueError(
          f'Quantization parameter {qp} is not a positive integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    ffmpeg_path = _get_ffmpeg_path()
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              ffmpeg_path,
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = subprocess.Popen(
          command, stdin=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization.  If
        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode()
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video.  (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode()
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None

Context to write a compressed video.

>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))

Bitrate control may be specified using at most one of: bps, qp, or crf. If none are specified, qp is set to a default value. See https://slhck.info/video/2017/03/01/rate-control.html

If codec is 'gif', the args bps, qp, crf, and encoded_format are ignored.
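
The rate-control rules above can be sketched without ffmpeg. The helper name `choose_bitrate_args` is hypothetical (it is not part of mediapy); it only mirrors the selection logic visible in the constructor source, under the assumption that the codec is not 'gif'.

```python
import math


def choose_bitrate_args(shape, *, bps=None, qp=None, crf=None, ffmpeg_args=()):
  """Hypothetical stand-in for the rate-control selection described above."""
  num_specified = sum(x is not None for x in (bps, qp, crf))
  if num_specified > 1:
    raise ValueError('Must specify at most one of bps, qp, or crf.')
  if num_specified == 0 and not ffmpeg_args:
    # Default: a lower qp (higher quality) for small frames.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  args = []
  if bps is not None:
    args += ['-vb', str(bps)]
  if qp is not None:
    args += ['-qp', str(qp)]
  if crf is not None:
    args += ['-vb', '0', '-crf', str(crf)]
  return args


small = choose_bitrate_args((480, 640))
large = choose_bitrate_args((1080, 1920))
constant_quality = choose_bitrate_args((1080, 1920), crf=23)
```

Passing two of the three parameters at once raises ValueError, which matches the "at most one" rule.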

Attributes:
  • path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
  • shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
  • codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
  • bps: Requested average bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • crf: Constant rate factor for video compression quality (default None).
  • ffmpeg_args: Additional arguments for ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
  • input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Expected data type for input images (float or bool input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
  • encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
VideoWriter(path: Union[str, os.PathLike[str]], shape: tuple[int, int], *, codec: str = 'h264', metadata: VideoMetadata | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, crf: float | None = None, ffmpeg_args: str | Sequence[str] = '', input_format: str = 'rgb', dtype: DTypeLike = np.uint8, encoded_format: str | None = None)
  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp <= 0):
      raise ValueError(
          f'Quantization parameter {qp} is not a positive integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None
def add_image(self, image: np.ndarray) -> None:
  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization.  If
        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      ValueError: If the image dtype, number of dimensions, or spatial shape
        does not match the writer's configuration.
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode()
      raise RuntimeError(f"Error writing '{self.path}': {s}")

Writes a video frame.

Arguments:
  • image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
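Raises:
  • ValueError: If the image dtype, number of dimensions, or spatial shape does not match the writer's configuration.
  • RuntimeError: If there is an error writing to the output file.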
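
The two array reshapes that add_image applies before piping bytes to ffmpeg can be tried in isolation. This is a minimal sketch using only NumPy; the frame size and pixel values are arbitrary assumptions chosen for illustration.

```python
import numpy as np

height, width = 4, 6

# A 2D grayscale frame given to an 'rgb' writer is replicated into 3 channels.
gray = np.full((height, width), 200, dtype=np.uint8)
rgb = np.dstack((gray, gray, gray))  # shape (4, 6, 3)

# A per-pixel YUV frame (height, width, 3) becomes planar (3, height, width).
yuv = np.zeros((height, width, 3), dtype=np.uint8)
yuv[..., 0] = 16   # Y
yuv[..., 1] = 128  # U
yuv[..., 2] = 129  # V
planar = np.moveaxis(yuv, 2, 0)
data = planar.tobytes()  # All Y bytes, then all U bytes, then all V bytes.
```

Because tobytes() serializes the array in C order, moving the channel axis to the front is what produces the plane-by-plane layout.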
def close(self) -> None:
  def close(self) -> None:
    """Finishes writing the video.  (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode()
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None

Finishes writing the video. (Called automatically at end of context.)

class VideoMetadata(typing.NamedTuple):
class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames expected from the video stream.  This is
      estimated from the framerate and the duration stored in the video
      header, so it might be inexact.  The value is -1 if the number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None

Represents the data stored in a video container header.

Attributes:
  • num_images: Number of frames expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. The value is -1 if the number of frames is not found in the header.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
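
Because num_images may be -1 and bps may be None, code that consumes the metadata should check both. The sketch below uses a hypothetical stand-in class, `VideoMetadataSketch`, with the same four fields so that it runs without mediapy; `estimated_duration_seconds` is likewise an illustrative helper, not a library function.

```python
from typing import NamedTuple, Optional


class VideoMetadataSketch(NamedTuple):
  """Stand-in with the same fields as VideoMetadata (illustration only)."""

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: Optional[int]


def estimated_duration_seconds(metadata) -> Optional[float]:
  """Returns an approximate duration, or None if the frame count is unknown."""
  if metadata.num_images < 0:
    return None
  return metadata.num_images / metadata.fps


known = VideoMetadataSketch(num_images=120, shape=(480, 640), fps=60.0, bps=None)
unknown = VideoMetadataSketch(num_images=-1, shape=(480, 640), fps=60.0, bps=None)
duration_known = estimated_duration_seconds(known)
duration_unknown = estimated_duration_seconds(unknown)
```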
def compress_image(image: ArrayLike, *, fmt: str = 'png', **kwargs: Any) -> bytes:
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()

Returns a buffer containing a compressed image.

Arguments:
  • image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Options for PIL.save(), e.g. optimize=True for greater compression.
def decompress_image( data: bytes, dtype: DTypeLike = None, apply_exif_transpose: bool = True) -> np.ndarray:
def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)

Returns an image from a compressed data buffer.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • data: Buffer containing compressed image.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
def compress_video( images: Iterable[np.ndarray], *, codec: str = 'h264', **kwargs: Any) -> bytes:
def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'mp4' except when `codec` is 'gif'.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()

Returns a buffer containing a compressed video.

The video container is 'mp4' except when codec is 'gif'.

>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))
>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

Arguments:
  • images: Iterable over video frames.
  • codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • **kwargs: Additional parameters for VideoWriter.
Returns:

A bytes buffer containing the compressed video.

def decompress_video(data: bytes, **kwargs: Any) -> np.ndarray:
def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)

Returns video images from an MP4-compressed data buffer.
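
Both compress_video and decompress_video bridge between an in-memory buffer and a path-based reader or writer through a temporary directory. The pattern can be sketched with the standard library alone; `path_reader` and `read_from_buffer` are hypothetical names, and the payload is arbitrary bytes rather than a real video.

```python
import pathlib
import tempfile


def path_reader(path) -> bytes:
  """Hypothetical stand-in for a reader that only accepts file paths."""
  return pathlib.Path(path).read_bytes()


def read_from_buffer(data: bytes, suffix: str = '.mp4') -> bytes:
  """Bridges an in-memory buffer to a path-based reader via a temporary file."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    tmp_path.write_bytes(data)
    # The result must be fully read before the directory is deleted.
    return path_reader(tmp_path)


payload = b'not a real video, just bytes'
round_trip = read_from_buffer(payload)
```

The temporary directory is removed when the with block exits, so the reader has to return materialized data rather than a lazy handle on the file.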

def html_from_compressed_image( data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, pixelated: bool = True, fmt: str = 'png') -> str:
def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with an image tag containing encoded data.

Arguments:
  • data: Compressed image bytes.
  • width: Width of HTML image in pixels.
  • height: Height of HTML image in pixels.
  • title: Optional text shown centered above image.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
  • fmt: Compression encoding.
def html_from_compressed_video( data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, loop: bool = True, autoplay: bool = True) -> str:
def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the video, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with a video tag containing H264-encoded data.

Arguments:
  • data: MP4-compressed video bytes.
  • width: Width of HTML video in pixels.
  • height: Height of HTML video in pixels.
  • title: Optional text shown centered above the video.
  • border: If bool, whether to place a black boundary around the video, or if str, the boundary CSS style.
  • loop: If True, the playback repeats forever.
  • autoplay: If True, video playback starts without having to click.
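
How border, loop, and autoplay translate into attributes of the video tag can be checked without encoding a video. `video_options_sketch` is a hypothetical helper that reproduces only the attribute string from the source above. Note that autoplay also adds muted; browsers commonly block unmuted autoplay, which is presumably why the two are paired.

```python
def video_options_sketch(width, height, *, border=False, loop=True, autoplay=True):
  """Hypothetical helper: builds the attribute string for the <video> tag."""
  if isinstance(border, str):
    border_css = f'{border}; '  # Caller-supplied CSS, e.g. '2px dashed red'.
  elif border:
    border_css = 'border:1px solid black; '
  else:
    border_css = ''
  return (
      f'controls width="{width}" height="{height}"'
      f' style="{border_css}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )


default_options = video_options_sketch(64, 48)
bordered_still = video_options_sketch(64, 48, border=True, loop=False, autoplay=False)
```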
def resize_image(image: ArrayLike, shape: tuple[int, int]) -> np.ndarray:
def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )

Resizes image to specified spatial dimensions using a Lanczos filter.

Arguments:
  • image: Array-like 2D or 3D object, where dtype is uint or floating-point.
  • shape: 2D spatial dimensions (height, width) of output image.
Returns:

A resampled image whose spatial dimensions match shape.
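
Which inputs are handed to PIL directly, and which take the float-conversion or per-channel fallback, depends only on dtype and dimensionality. The predicate below is a hypothetical helper that mirrors that dispatch rule from the source; it does not resize anything.

```python
import numpy as np


def pil_can_resize_directly(image: np.ndarray) -> bool:
  """Hypothetical predicate mirroring the dispatch rule in resize_image."""
  single_channel = image.ndim == 2 and (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  )
  multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  return bool(single_channel or multichannel)


direct_rgb = pil_can_resize_directly(np.zeros((8, 8, 3), np.uint8))
direct_gray_float = pil_can_resize_directly(np.zeros((8, 8), np.float32))
per_channel = pil_can_resize_directly(np.zeros((8, 8, 3), np.float32))
via_float = pil_can_resize_directly(np.zeros((8, 8), np.uint16))
```

A float32 RGB image is therefore resized one channel at a time, and a 2D uint16 image is converted to float, resized, and converted back. The source passes shape[::-1] to PIL because PIL sizes are (width, height) rather than (height, width).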

def resize_video( video: Iterable[np.ndarray], shape: tuple[int, int]) -> np.ndarray:
def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])

Resizes video to specified spatial dimensions using a Lanczos filter.

Arguments:
  • video: Iterable of images.
  • shape: 2D spatial dimensions (height, width) of output video.
Returns:

A resampled video whose spatial dimensions match shape.

def to_rgb( array: ArrayLike, *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> np.ndarray:
def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.17.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # type: ignore # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = rgb_from_scalar(a)
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3]) == 1.0):
    a = a[..., :3]
  return a

Maps scalar values to RGB using value bounds and a color map.

Arguments:
  • array: Scalar values, with arbitrary shape.
  • vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
  • vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
  • cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:

A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
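
The remapping step can be followed on a small array. This is a minimal sketch using NumPy only; `gray_cmap` is an illustrative callable color map (an assumption standing in for a matplotlib color map), and the input values are arbitrary.

```python
import numpy as np

a = np.array([2.0, 4.0, np.inf, 6.0])

# Bounds come from the finite values only, so the inf entry is skipped.
vmin = np.amin(np.where(np.isfinite(a), a, np.inf))
vmax = np.amax(np.where(np.isfinite(a), a, -np.inf))

# Affine map from [vmin, vmax] to [0.0, 1.0]; eps avoids division by zero.
normalized = (a - vmin) / (vmax - vmin + np.finfo(float).eps)


def gray_cmap(values):
  """Illustrative callable color map: replicates each value into R, G, B."""
  values = np.clip(values, 0.0, 1.0)
  return np.stack([values, values, values], axis=-1)


rgb = gray_cmap(normalized)  # One RGB triple per input element.
```

The output gains a trailing channel axis, so an input of shape (4,) yields an array of shape (4, 3).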

def to_type(array: ArrayLike, dtype: DTypeLike) -> np.ndarray:
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result

Returns media array converted to specified type.

A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.

Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.

Arguments:
  • array: Input array-like object (floating-point, unsigned int, or bool).
  • dtype: Desired output type (floating-point or unsigned int).
Returns:

Array a if it is already of the specified dtype, else a converted array.
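
The scaling rules can be worked through by hand for a few values. The sketch below reproduces the arithmetic with plain NumPy rather than calling to_type, so the variable names are illustrative assumptions.

```python
import numpy as np

# float -> uint8: clamp to [0, 1], scale by 255, round to nearest.
floats = np.array([0.0, 0.5, 1.0, 1.5])
as_uint8 = (np.clip(floats, 0.0, 1.0) * 255 + 0.5).astype(np.uint8)

# uint8 -> float32: divide by the maximum representable value.
as_float = np.array([0, 255], dtype=np.uint8).astype(np.float32) / np.float32(255)

# uint8 -> uint16: scale by 65535 / 255 == 257.
narrow = np.array([0, 128, 255], dtype=np.uint8)
widened = (narrow.astype(np.float64) * 257 + 0.5).astype(np.uint16)

# bool -> uint8: True maps to the maximum value.
from_bool = np.array([False, True]).astype(np.uint8) * np.uint8(255)
```

Adding 0.5 before the truncating integer cast gives round-to-nearest, and the out-of-range input 1.5 is clamped to 1.0 before scaling.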

def to_float01(a: ArrayLike, dtype: DTypeLike = np.float32) -> np.ndarray:
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)

If array has unsigned integers, rescales them to the range [0.0, 1.0].

Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.

Arguments:
  • a: Input array.
  • dtype: Desired floating-point type if rescaling occurs.
Returns:

A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.

def to_uint8(a: ArrayLike) -> np.ndarray:
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)

Returns array converted to uint8 values; see to_type.

def set_output_height(num_pixels: int) -> None:
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Overrides the height of the current output cell, if using Colab.

def set_max_output_height(num_pixels: int) -> None:
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Sets the maximum height of the current output cell, if using Colab.

def color_ramp(shape: tuple[int, int] = (64, 64), *, dtype: DTypeLike = np.float32) -> np.ndarray:
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)

Returns an image of a red-green color gradient.

This is useful for quick experimentation and testing. See also moving_circle to generate a sample video.

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated image.
  • dtype: Type (uint or floating) of resulting pixel values.
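
To see what the gradient contains, the following sketch reproduces only the floating-point path of the source above using plain NumPy. The function name `color_ramp_float` is hypothetical; it skips the shape check and the `to_type` conversion that the real function performs, so treat it as an approximation rather than a replacement.

```python
import numpy as np


def color_ramp_float(shape=(64, 64)):
    """Float32 red-green ramp, mirroring the listed source (hypothetical helper)."""
    # Pixel-center coordinates normalized to the open interval (0, 1):
    # channel 0 (red) follows the row index, channel 1 (green) the column index.
    yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
    # Append a constant zero blue channel to obtain an RGB image.
    return np.insert(yx, 2, 0.0, axis=-1).astype(np.float32)


image = color_ramp_float((4, 8))
print(image.shape)    # (height, width, 3)
print(image[0, 0])    # top-left pixel: darkest red and green
print(image[-1, -1])  # bottom-right pixel: brightest red and green
```

Red therefore increases from top to bottom and green from left to right, while blue stays at zero.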

def moving_circle(shape: tuple[int, int] = (256, 256), num_images: int = 10, *, dtype: DTypeLike = np.float32) -> np.ndarray:
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])

Returns a video of a circle moving in front of a color ramp.

This is useful for quick experimentation and testing. See also color_ramp to generate a sample image.

>>> show_video(moving_circle((480, 640), 60), fps=60)

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated video.
  • num_images: Number of video frames.
  • dtype: Type (uint or floating) of resulting pixel values.
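
The motion comes entirely from the per-frame circle center in `generate_image`. The sketch below isolates that mask computation with plain NumPy so the movement can be inspected directly. The helper name `circle_mask` is hypothetical, and the background ramp and dtype handling are deliberately left out.

```python
import numpy as np


def circle_mask(shape, image_index, num_images):
    """Boolean mask of the circle in one frame (hypothetical helper)."""
    yx = np.moveaxis(np.indices(shape), 0, -1)
    # The row stays fixed at 60% of the height; the column advances each frame.
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    # The radius is 10% of the smaller image dimension.
    radius_squared = (min(shape) * 0.1) ** 2
    return np.sum((yx - center) ** 2, axis=-1) < radius_squared


shape, num_images = (100, 100), 10
for i in (1, 4, 8):
    mask = circle_mask(shape, i, num_images)
    # The centroid of the mask tracks the circle center (row, column).
    print(i, np.argwhere(mask).mean(axis=0))
```

In the first and last frames the circle extends past the image border, so part of it is clipped.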

class set_show_save_dir:
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir

Save all titled output from show_*() calls into files.

If the specified directory is not None, all titled images and videos displayed by show_image, show_images, show_video, and show_videos are also saved as files within the directory.

It can be used either to set the state or as a context manager:

>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)

>>> with set_show_save_dir('/tmp'):
...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
set_show_save_dir(directory: Union[str, os.PathLike[str], NoneType])
  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

def set_ffmpeg(name_or_path: Union[str, os.PathLike[str]]) -> None:
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path

Specifies the name or path for the ffmpeg external program.

The ffmpeg program is required for compressing and decompressing video. (It is used in read_video, write_video, show_video, show_videos, etc.)

Arguments:
  • name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.

def video_is_available() -> bool:
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None

Returns True if the program ffmpeg is found.

See also set_ffmpeg.
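
The helper `_search_for_ffmpeg_path` is not listed in this section, so its exact behavior is not shown here. As a rough sketch of how a "name on `PATH` or explicit filepath" lookup can be done with the standard library alone, the example below uses `shutil.which`; the function name `program_is_available` is hypothetical, and this is an assumption about the general approach, not mediapy's implementation.

```python
import shutil


def program_is_available(name_or_path: str = 'ffmpeg') -> bool:
    """Returns True if an executable is found (hypothetical helper).

    `shutil.which` searches the directories of the PATH environment variable
    for a bare name, and checks the given location directly if the argument
    contains a directory component.
    """
    return shutil.which(name_or_path) is not None


print(program_is_available())  # Depends on whether ffmpeg is installed.
print(program_is_available('no-such-program-for-this-example'))
```

A check of this kind is useful before calling video functions, so that a notebook can skip video cells or report a clear message when `ffmpeg` is missing.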