mediapy

mediapy: Read/write/show images and videos in an IPython/Jupyter notebook.

[GitHub source]   [API docs]   [PyPI package]   [Colab example]

See the example notebook, or better yet, open it in Colab.

Image examples

Display an image (2D or 3D numpy array):

checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
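
As a rough illustration of what that `np.kron` expression builds, here is a dependency-free sketch: a 32x32 grid of alternating cells, each expanded to a 4x4 block, giving a 128x128 image. The function name `checkerboard_pattern` is hypothetical and not part of mediapy; the real array holds floats, whereas this sketch uses ints.

```python
def checkerboard_pattern(cells: int = 32, cell_size: int = 4) -> list[list[int]]:
  """Returns a (cells * cell_size)-square grid of alternating 0/1 blocks."""
  size = cells * cell_size
  return [
      [((y // cell_size) + (x // cell_size)) % 2 for x in range(size)]
      for y in range(size)
  ]


pattern = checkerboard_pattern()
print(len(pattern), len(pattern[0]))  # 128 128
print(pattern[0][:8])  # [0, 0, 0, 0, 1, 1, 1, 1]
```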

Read and display an image (either local or from the Web):

IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))

Read and display an image from a local file:

!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))

Show titled images side-by-side:

images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
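
The explicit `vmin`/`vmax` matter for the single-channel images: `to_rgb` maps values affinely from [vmin, vmax] to [0.0, 1.0], and when the bounds are omitted it uses the minimum and maximum of the data instead. The sketch below mirrors that formula in plain Python, assuming `show_images` applies the same mapping; the helper name `normalize` is hypothetical.

```python
import sys


def normalize(value: float, vmin: float, vmax: float) -> float:
  """Affinely maps `value` from [vmin, vmax] to approximately [0.0, 1.0]."""
  return (value - vmin) / (vmax - vmin + sys.float_info.epsilon)


darkened = [0.0, 0.7]  # The two distinct values in `checkerboard * 0.7`.

# With fixed bounds, the darkened squares stay at 70% brightness.
fixed = [normalize(v, 0.0, 1.0) for v in darkened]

# With bounds taken from the data, 0.7 is stretched to nearly full white.
auto = [normalize(v, min(darkened), max(darkened)) for v in darkened]

print(fixed, auto)
```

Without fixed bounds, the 'original' and 'darkened' images would therefore look the same.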

Compare two images using an interactive slider:

compare_images([checkerboard, np.random.rand(128, 128, 3)])

Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):

video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
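
For intuition about the frames, this plain-Python sketch reproduces the circle geometry used by `moving_circle`: the center sits at 60% of the height and sweeps left to right, and the radius is 10% of the smaller dimension. The helper name `circle_geometry` is hypothetical; the formulas are taken from the function's source.

```python
def circle_geometry(
    shape: tuple[int, int], num_images: int
) -> tuple[float, list[tuple[float, float]]]:
  """Returns the circle radius and the per-frame (y, x) circle centers."""
  height, width = shape
  radius = min(shape) * 0.1
  centers = [
      (height * 0.6, width * (i + 0.5) / num_images)
      for i in range(num_images)
  ]
  return radius, centers


radius, centers = circle_geometry((100, 100), num_images=10)
print(radius, centers[0], centers[-1])
```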

Show the video frames side-by-side:

show_images(video, columns=6, border=True, height=64)

Show the frames with their indices:

show_images({f'{i}': image for i, image in enumerate(video)}, width=32)

Read and display a video (either local or from the Web):

VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))

Create and display a looping two-frame GIF video:

image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')

Darken a video frame-by-frame:

output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
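
To make the arithmetic concrete, here is a per-pixel sketch of the darkening round trip, assuming the writer converts float frames back to `uint8` the way `to_type` does (clamp to [0.0, 1.0], scale by 255, add 0.5, truncate). The helper names are hypothetical, and the real code operates on whole arrays in `float32`.

```python
def uint8_to_float01(value: int) -> float:
  """Maps 0 to 0.0 and 255 to 1.0."""
  return value / 255


def float01_to_uint8(value: float) -> int:
  """Clamps to [0.0, 1.0], then rounds to the nearest uint8 level."""
  clamped = min(max(value, 0.0), 1.0)
  return int(clamped * 255 + 0.5)


def darken_pixel(value: int) -> int:
  """Halves the brightness of a single uint8 pixel value."""
  return float01_to_uint8(uint8_to_float01(value) * 0.5)


print(darken_pixel(255))  # 128
```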
# Copyright 2025 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy)  
[**[API docs]**](https://google.github.io/mediapy/)  
[**[PyPI package]**](https://pypi.org/project/mediapy/)  
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.6'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
from typing import Any
import urllib.request
import warnings

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

# Only `typing` and `typing.Any` are imported above, so the remaining typing
# names are accessed through the module.
if typing.TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = npt.NDArray[Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = typing.TypeVar('_ArrayLike')
  _DTypeLike = typing.TypeVar('_DTypeLike')
  _NDArray = typing.TypeVar('_NDArray')
  _DType = typing.TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 10**10  # Unlimited seems to be OK now.
_T = typing.TypeVar('_T')
_Path = typing.Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int | None, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired from https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    headers = {'User-Agent': 'Chrome'}
    request = urllib.request.Request(path_or_url, headers=headers)
    with urllib.request.urlopen(request) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred from `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.17.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = typing.cast(_NDArray, rgb_from_scalar(a))
 852  # If there is a fully opaque alpha channel, remove it.
 853  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
 854    a = a[..., :3]
 855  return a
 856
 857
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.Image.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()


def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)


def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s
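

# Illustrative sketch, not part of mediapy: the core of the function above is
# a base64 "data:" URI placed in the `src` attribute of an <img> tag.  The
# helper name `img_tag_from_bytes` is hypothetical, and the 4-byte payload is
# only the start of a PNG signature, not a displayable image.

```python
import base64


def img_tag_from_bytes(data, width, height, fmt='png'):
  """Returns an <img> tag whose source is `data` encoded as a data URI."""
  b64 = base64.b64encode(data).decode('utf-8')
  return (
      f'<img width="{width}" height="{height}"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


tag = img_tag_from_bytes(b'\x89PNG', 4, 4)
print(tag)
```

# Because the bytes travel inside the HTML itself, the notebook output needs
# no separate image file, at the cost of roughly 4/3 size growth from base64.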


def _get_width_height(
    width: int | None, height: int | None, shape: tuple[int, int]
) -> tuple[int, int]:
  """Returns (width, height) given optional parameters and image shape."""
  assert len(shape) == 2, shape
  if width and height:
    return width, height
  if width and not height:
    return width, int(width * (shape[0] / shape[1]) + 0.5)
  if height and not width:
    return int(height * (shape[1] / shape[0]) + 0.5), height
  return shape[::-1]
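

# Illustrative sketch, not part of mediapy: a standalone copy of the rule
# above, showing how a missing dimension is derived from the image aspect
# ratio with round-half-up.  The name `fit_width_height` is hypothetical.

```python
def fit_width_height(width, height, shape):
  """Returns (width, height) for an image with shape (rows, cols)."""
  rows, cols = shape
  if width and height:
    return width, height  # Both given: the aspect ratio is not preserved.
  if width:
    return width, int(width * (rows / cols) + 0.5)
  if height:
    return int(height * (cols / rows) + 0.5), height
  return cols, rows  # Neither given: use the native image size.


print(fit_width_height(None, 64, (100, 200)))  # (128, 64)
print(fit_width_height(300, None, (100, 200)))  # (300, 150)
print(fit_width_height(None, None, (100, 200)))  # (200, 100)
```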


def _ensure_mapped_to_rgb(
    image: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Returns the image, mapping any single-channel input to RGB."""
  image = _as_valid_media_array(image)
  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
    raise ValueError(
        f'Image with shape {image.shape} is neither a 2D array'
        ' nor a 3D array with 1, 3, or 4 channels.'
    )
  if image.ndim == 3 and image.shape[2] == 1:
    image = image[:, :, 0]
  if image.ndim == 2:
    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
  return image


def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)


def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`.  Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape = image.shape[0], image.shape[1]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    warnings.warn('mediapy: subsampling images to reduce HTML size')
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None


def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compares two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images.  Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels.  There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)


# ** Video I/O.


def _filename_suffix_from_codec(codec: str) -> str:
  if codec == 'gif':
    return '.gif'
  if codec == 'vp9':
    return '.webm'

  return '.mp4'


def _get_ffmpeg_path() -> str:
  path = _search_for_ffmpeg_path()
  if not path:
    raise RuntimeError(
        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
        " perhaps install ffmpeg using 'apt install ffmpeg'."
    )
  return path


@typing.overload
def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: None = None,  # No encoding -> bytes
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[bytes]:
  ...


@typing.overload
def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: str = ...,  # Encoding -> str
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[str]:
  ...


def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: str | None = None,
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[bytes] | subprocess.Popen[str]:
  """Runs ffmpeg with the given args.

  Args:
    ffmpeg_args: The args to pass to ffmpeg.
    stdin: Same as in `subprocess.Popen`.
    stdout: Same as in `subprocess.Popen`.
    stderr: Same as in `subprocess.Popen`.
    encoding: Same as in `subprocess.Popen`.
    allowed_input_files: The input files to allow for ffmpeg.
    allowed_output_files: The output files to allow for ffmpeg.

  Returns:
    The subprocess.Popen object with running ffmpeg process.
  """
  argv = []
  # In open source, keep env=None to preserve default behavior.
  # Context: https://github.com/google/mediapy/pull/62
  env: Any = None  # pylint: disable=unused-variable
  ffmpeg_path = _get_ffmpeg_path()

  # Allowed input and output files are not supported in open source.
  del allowed_input_files
  del allowed_output_files

  argv.append(ffmpeg_path)
  argv.extend(ffmpeg_args)

  return subprocess.Popen(
      argv,
      stdin=stdin,
      stdout=stdout,
      stderr=stderr,
      encoding=encoding,
      env=env,
  )


def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None


class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames expected from the video stream.  This is
      estimated from the framerate and the duration stored in the video
      header, so it might be inexact.  The value is -1 if the number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None


def _get_video_metadata(path: _Path) -> VideoMetadata:
  """Returns attributes of video stored in the specified local file."""
  if not pathlib.Path(path).is_file():
    raise RuntimeError(f"Video file '{path}' is not found.")

  command = [
      '-nostdin',
      '-i',
      str(path),
      '-acodec',
      'copy',
      # Necessary to get "frame= *(\d+)" using newer ffmpeg versions.
      # Previously, was `'-vcodec', 'copy'`
      '-vf',
      'select=1',
      '-vsync',
      '0',
      '-f',
      'null',
      '-',
  ]
  with _run_ffmpeg(
      command,
      allowed_input_files=[str(path)],
      stderr=subprocess.PIPE,
      encoding='utf-8',
  ) as proc:
    _, err = proc.communicate()
  bps = fps = num_images = width = height = rotation = None
  before_output_info = True
  for line in err.split('\n'):
    if line.startswith('Output '):
      before_output_info = False
    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
      # The pattern admits a fractional value, so parse it as a float first.
      bps = int(float(match.group(1)) * 1000)
    if matches := re.findall(r'frame= *(\d+) ', line):
      num_images = int(matches[-1])
    if 'Stream #0:' in line and ': Video:' in line and before_output_info:
      if not (match := re.search(r', (\d+)x(\d+)', line)):
        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
      width, height = int(match.group(1)), int(match.group(2))
      if match := re.search(r', ([\d.]+) fps', line):
        fps = float(match.group(1))
      elif str(path).endswith('.gif'):
        # Some GIF files lack a framerate attribute; use a reasonable default.
        fps = 10
      else:
        raise RuntimeError(f'Unable to parse video framerate in line {line}')
    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
      rotation = int(match.group(1))
    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
      rotation = int(match.group(1))
  if not num_images:
    num_images = -1
  if not width:
    raise RuntimeError(f'Unable to parse video header: {err}')
  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
  if rotation in (90, 270, -90, -270):
    width, height = height, width
  assert height is not None and width is not None
  shape = height, width
  assert fps is not None
  return VideoMetadata(num_images, shape, fps, bps)
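

# Illustrative sketch, not part of mediapy: the same regular expressions as
# above, applied to a few made-up lines that merely resemble an ffmpeg stderr
# report (real output varies across ffmpeg versions).  The names
# `SAMPLE_STDERR` and `parse_sample` are hypothetical, and rotation handling
# is omitted for brevity.

```python
import re

# Hand-written text for illustration only; not captured from a real run.
SAMPLE_STDERR = '\n'.join([
    '  Duration: 00:00:01.00, start: 0.000000, bitrate: 1234 kb/s',
    '  Stream #0:0: Video: h264, yuv420p, 640x480, 1200 kb/s, 30 fps, 30 tbr',
    'frame=   30 fps=0.0 q=-0.0 Lsize=N/A time=00:00:01.00',
])


def parse_sample(text):
  """Returns (num_images, (height, width), fps, bps) parsed from `text`."""
  bps = fps = num_images = shape = None
  for line in text.split('\n'):
    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
      bps = int(float(match.group(1)) * 1000)  # Convert kb/s to bits/s.
    if matches := re.findall(r'frame= *(\d+) ', line):
      num_images = int(matches[-1])  # The last progress report wins.
    if 'Stream #0:' in line and ': Video:' in line:
      if match := re.search(r', (\d+)x(\d+)', line):
        shape = int(match.group(2)), int(match.group(1))  # (height, width).
      if match := re.search(r', ([\d.]+) fps', line):
        fps = float(match.group(1))
  return num_images, shape, fps, bps


print(parse_sample(SAMPLE_STDERR))  # (30, (480, 640), 30.0, 1234000)
```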


class _VideoIO:
  """Base class for `VideoReader` and `VideoWriter`."""

  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
    """Returns ffmpeg pix_fmt given data type and image format."""
    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
    return {
        np.uint8: {
            'rgb': 'rgb24',
            'yuv': 'yuv444p',
            'gray': 'gray',
        },
        np.uint16: {
            'rgb': 'rgb48' + native_endian_suffix,
            'yuv': 'yuv444p16' + native_endian_suffix,
            'gray': 'gray16' + native_endian_suffix,
        },
    }[dtype.type][image_format]


class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames expected from the video stream.  This is
      estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from. The default is 0.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None
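

# Illustrative sketch, not part of mediapy: the reader above pulls exactly one
# raw frame per `read()` call, so the chunk size is pixels * channels * bytes
# per channel.  The helper name `num_bytes_per_image` and its `itemsize`
# parameter (1 for uint8, 2 for uint16) are hypothetical stand-ins.

```python
import math


def num_bytes_per_image(shape, output_format='rgb', itemsize=1):
  """Returns the size in bytes of one raw frame with shape (height, width)."""
  num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[output_format]
  return math.prod(shape) * num_channels * itemsize


print(num_bytes_per_image((480, 640)))  # 921600
print(num_bytes_per_image((480, 640), 'gray', itemsize=2))  # 614400
```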


class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))

  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none is specified and `ffmpeg_args` is empty, `qp` defaults to 20 for
  frames of at most 640x480 pixels and to 28 for larger frames.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
      format.  The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames.  The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb').  If 'rgb', each image
      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frames-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} must be a non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
1700          ]
1701          + self._bitrate_args
1702          + self.ffmpeg_args
1703          + ['-y', tmp_name]
1704      )
1705      self._popen = _run_ffmpeg(
1706          command,
1707          stdin=subprocess.PIPE,
1708          stderr=subprocess.PIPE,
1709          allowed_output_files=[tmp_name],
1710      )
1711      self._proc = self._popen.__enter__()
1712    except Exception:
1713      self.__exit__(None, None, None)
1714      raise
1715    return self
1716
1717  def __exit__(self, *_: Any) -> None:
1718    self.close()
1719
1720  def add_image(self, image: _NDArray) -> None:
1721    """Writes a video frame.
1722
1723    Args:
1724      image: Array whose dtype and first two dimensions must match the `dtype`
1725        and `shape` specified in `VideoWriter` initialization.  If
1726        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
1727        input_format, the image may be either 2D (interpreted as grayscale) or
1728        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
1729        must be 3D with three (Y, U, V) channels.
1730
1731    Raises:
1732      RuntimeError: If there is an error writing to the output file.
1733    """
1734    assert self._proc, 'Error: writing to an already closed context.'
1735    if issubclass(image.dtype.type, (np.floating, np.bool_)):
1736      image = to_type(image, self.dtype)
1737    if image.dtype != self.dtype:
1738      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
1739    if self.input_format == 'gray':
1740      if image.ndim != 2:
1741        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
1742    else:
1743      if image.ndim == 2 and self.input_format == 'rgb':
1744        image = np.dstack((image, image, image))
1745      if not (image.ndim == 3 and image.shape[2] == 3):
1746        raise ValueError(f'Image dimensions {image.shape} are invalid.')
1747    if image.shape[:2] != self.shape:
1748      raise ValueError(
1749          f'Image dimensions {image.shape[:2]} do not match'
1750          f' those of the initialized video {self.shape}.'
1751      )
1752    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1753      image = np.moveaxis(image, 2, 0)
1754    data = image.tobytes()
1755    stdin = self._proc.stdin
1756    assert stdin is not None
1757    if stdin.write(data) != len(data):
1758      self._proc.wait()
1759      stderr = self._proc.stderr
1760      assert stderr is not None
1761      s = stderr.read().decode('utf-8')
1762      raise RuntimeError(f"Error writing '{self.path}': {s}")
1763
1764  def close(self) -> None:
1765    """Finishes writing the video.  (Called automatically at end of context.)"""
1766    if self._popen:
1767      assert self._proc, 'Error: closing an already closed context.'
1768      stdin = self._proc.stdin
1769      assert stdin is not None
1770      stdin.close()
1771      if self._proc.wait():
1772        stderr = self._proc.stderr
1773        assert stderr is not None
1774        s = stderr.read().decode('utf-8')
1775        raise RuntimeError(f"Error writing '{self.path}': {s}")
1776      self._popen.__exit__(None, None, None)
1777      self._popen = None
1778      self._proc = None
1779    if self._write_via_local_file:
1780      # pylint: disable-next=no-member
1781      self._write_via_local_file.__exit__(None, None, None)
1782      self._write_via_local_file = None
1783
1784
1785class _VideoArray(npt.NDArray[Any]):
1786  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""
1787
1788  metadata: VideoMetadata | None
1789
1790  def __new__(
1791      cls: Type['_VideoArray'],
1792      input_array: _NDArray,
1793      metadata: VideoMetadata | None = None,
1794  ) -> '_VideoArray':
1795    obj: _VideoArray = np.asarray(input_array).view(cls)
1796    obj.metadata = metadata
1797    return obj
1798
1799  def __array_finalize__(self, obj: Any) -> None:
1800    if obj is None:
1801      return
1802    self.metadata = getattr(obj, 'metadata', None)
1803
1804
1805def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
1806  """Returns an array containing all images read from a compressed video file.
1807
1808  >>> video = read_video('/tmp/river.mp4')
1809  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
1810  >>> show_video(video)
1811
1812  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
1813  >>> show_video(read_video(url))
1814
1815  Args:
1816    path_or_url: Input video file.
1817    **kwargs: Additional parameters for `VideoReader`.
1818
1819  Returns:
1820    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
1821    array if `output_format` is specified as 'gray'.  The returned array has an
1822    attribute `metadata` containing `VideoMetadata` information.  This enables
1823    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
1824    metadata attribute is lost in most subsequent `numpy` operations.
1825  """
1826  with VideoReader(path_or_url, **kwargs) as reader:
1827    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
1828
1829
1830def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
1831  """Writes images to a compressed video file.
1832
1833  >>> video = moving_circle((480, 640), num_images=60)
1834  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
1835  >>> show_video(read_video('/tmp/v.mp4'))
1836
1837  Args:
1838    path: Output video file.
1839    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
1840      arrays.
1841    **kwargs: Additional parameters for `VideoWriter`.
1842  """
1843  first_image, images = _peek_first(images)
1844  shape = first_image.shape[0], first_image.shape[1]
1845  dtype = first_image.dtype
1846  if dtype == bool:
1847    dtype = np.dtype(np.uint8)
1848  elif np.issubdtype(dtype, np.floating):
1849    dtype = np.dtype(np.uint16)
1850  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
1851  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
1852    for image in images:
1853      writer.add_image(image)
1854
1855
1856def compress_video(
1857    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
1858) -> bytes:
1859  """Returns a buffer containing a compressed video.
1860
1861  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
1862  and mp4 otherwise.
1863
1864  >>> video = read_video('/tmp/river.mp4')
1865  >>> data = compress_video(video, bps=10_000_000)
1866  >>> print(len(data))
1867
1868  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
1869
1870  Args:
1871    images: Iterable over video frames.
1872    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
1873      'hevc', 'vp9', or 'gif').
1874    **kwargs: Additional parameters for `VideoWriter`.
1875
1876  Returns:
1877    A bytes buffer containing the compressed video.
1878  """
1879  suffix = _filename_suffix_from_codec(codec)
1880  with tempfile.TemporaryDirectory() as directory_name:
1881    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
1882    write_video(tmp_path, images, codec=codec, **kwargs)
1883    return tmp_path.read_bytes()
1884
1885
1886def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
1887  """Returns video images from an MP4-compressed data buffer."""
1888  with tempfile.TemporaryDirectory() as directory_name:
1889    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
1890    tmp_path.write_bytes(data)
1891    return read_video(tmp_path, **kwargs)
1892
1893
1894def html_from_compressed_video(
1895    data: bytes,
1896    width: int,
1897    height: int,
1898    *,
1899    title: str | None = None,
1900    border: bool | str = False,
1901    loop: bool = True,
1902    autoplay: bool = True,
1903) -> str:
1904  """Returns an HTML string with a video tag containing H264-encoded data.
1905
1906  Args:
1907    data: MP4-compressed video bytes.
1908    width: Width of HTML video in pixels.
1909    height: Height of HTML video in pixels.
1910    title: Optional text shown centered above the video.
1911    border: If `bool`, whether to place a black boundary around the image, or if
1912      `str`, the boundary CSS style.
1913    loop: If True, the playback repeats forever.
1914    autoplay: If True, video playback starts without having to click.
1915  """
1916  b64 = base64.b64encode(data).decode('utf-8')
1917  if isinstance(border, str):
1918    border = f'{border}; '
1919  elif border:
1920    border = 'border:1px solid black; '
1921  else:
1922    border = ''
1923  options = (
1924      f'controls width="{width}" height="{height}"'
1925      f' style="{border}object-fit:cover;"'
1926      f'{" loop" if loop else ""}'
1927      f'{" autoplay muted" if autoplay else ""}'
1928  )
1929  s = f"""<video {options}>
1930      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
1931      This browser does not support the video tag.
1932      </video>"""
1933  if title is not None:
1934    s = f"""<div style="display:flex; align-items:left;">
1935      <div style="display:flex; flex-direction:column; align-items:center;">
1936      <div>{title}</div><div>{s}</div></div></div>"""
1937  return s
1938
1939
1940def show_video(
1941    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
1942) -> str | None:
1943  """Displays a video in the IPython notebook and optionally saves it to a file.
1944
1945  See `show_videos`.
1946
1947  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
1948  >>> show_video(video, title='River video')
1949
1950  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
1951
1952  >>> show_video(read_video('/tmp/river.mp4'))
1953
1954  Args:
1955    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
1956      arrays).
1957    title: Optional text shown centered above the video.
1958    **kwargs: See `show_videos`.
1959
1960  Returns:
1961    html string if `return_html` is `True`.
1962  """
1963  return show_videos([images], [title], **kwargs)
1964
1965
1966def show_videos(
1967    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
1968    titles: Iterable[str | None] | None = None,
1969    *,
1970    width: int | None = None,
1971    height: int | None = None,
1972    downsample: bool = True,
1973    columns: int | None = None,
1974    fps: float | None = None,
1975    bps: int | None = None,
1976    qp: int | None = None,
1977    codec: str = 'h264',
1978    ylabel: str = '',
1979    html_class: str = 'show_videos',
1980    return_html: bool = False,
1981    **kwargs: Any,
1982) -> str | None:
1983  """Displays a row of videos in the IPython notebook.
1984
1985  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
1986  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
1987  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
1988  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
1989  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.
1990
1991  If a directory has been specified using `set_show_save_dir`, also saves each
1992  titled video to a file in that directory based on its title.
1993
1994  Args:
1995    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
1996      must be an iterable of images.  If a video object has a `metadata`
1997      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
1998    titles: Optional strings shown above the corresponding videos.
1999    width: Optional, overrides displayed width (in pixels).
2000    height: Optional, overrides displayed height (in pixels).
2001    downsample: If True, each video whose width or height is greater than the
2002      specified `width` or `height` is resampled to the display resolution. This
2003      improves antialiasing and reduces the size of the notebook.
2004    columns: Optional, maximum number of videos per row.
2005    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
2006    bps: Bits-per-second bitrate (default None).
2007    qp: Quantization parameter for video compression quality (default None).
2008    codec: Compression algorithm; must be either 'h264' or 'gif'.
2009    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
2010    html_class: CSS class name used in definition of HTML element.
2011    return_html: If `True` return the raw HTML `str` instead of displaying.
2012    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
2013      `html_from_compressed_video`.
2014
2015  Returns:
2016    html string if `return_html` is `True`.
2017  """
2018  if isinstance(videos, Mapping):
2019    if titles is not None:
2020      raise ValueError(
2021          'Cannot have both a video dictionary and a titles parameter.'
2022      )
2023    list_titles = list(videos.keys())
2024    list_videos = list(videos.values())
2025  else:
2026    list_videos = list(cast('Iterable[_NDArray]', videos))
2027    list_titles = [None] * len(list_videos) if titles is None else list(titles)
2028    if len(list_videos) != len(list_titles):
2029      raise ValueError(
2030          'Number of videos does not match number of titles'
2031          f' ({len(list_videos)} vs {len(list_titles)}).'
2032      )
2033  if codec not in {'h264', 'gif'}:
2034    raise ValueError(f'Codec {codec} is neither h264 nor gif.')
2035
2036  html_strings = []
2037  for video, title in zip(list_videos, list_titles):
2038    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
2039    first_image, video = _peek_first(video)
2040    w, h = _get_width_height(width, height, first_image.shape[:2])
2041    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
2042      # Not resize_video() because each image may have different depth and type.
2043      video = [resize_image(image, (h, w)) for image in video]
2044      first_image = video[0]
2045    data = compress_video(
2046        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
2047    )
2048    if title is not None and _config.show_save_dir:
2049      suffix = _filename_suffix_from_codec(codec)
2050      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
2051      with _open(path, mode='wb') as f:
2052        f.write(data)
2053    if codec == 'gif':
2054      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
2055      html_string = html_from_compressed_image(
2056          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
2057      )
2058    else:
2059      html_string = html_from_compressed_video(
2060          data, w, h, title=title, **kwargs
2061      )
2062    html_strings.append(html_string)
2063
2064  # Create single-row tables each with no more than 'columns' elements.
2065  table_strings = []
2066  for row_html_strings in _chunked(html_strings, columns):
2067    td = '<td style="padding:1px;">'
2068    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
2069    if ylabel:
2070      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
2071      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
2072    table_strings.append(
2073        f'<table class="{html_class}"'
2074        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
2075    )
2076  s = ''.join(table_strings)
2077  if return_html:
2078    return s
2079  _display_html(s)
2080  return None
2081
2082
2083# Local Variables:
2084# fill-column: 80
2085# End:
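
The rate-control handling near the top of the listing above (at most one of bps, qp, or crf, with a default qp when none is given and no extra ffmpeg arguments are supplied) can be sketched as a standalone function. This is a minimal illustration and not part of mediapy's API: the name `rate_control_args` and its signature are hypothetical, and only the flag layout and the default thresholds are taken from the listing.

```python
import math


def rate_control_args(shape, bps=None, qp=None, crf=None, extra_args=()):
  """Hypothetical helper mirroring the bitrate logic in the listing above."""
  if sum(x is not None for x in (bps, qp, crf)) > 1:
    raise ValueError('Must specify at most one of bps, qp, or crf.')
  if bps is None and qp is None and crf is None and not extra_args:
    # Default from the listing: qp 20 up to 640x480 pixels, qp 28 above that.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  return (
      (['-vb', f'{bps}'] if bps is not None else [])
      + (['-qp', f'{qp}'] if qp is not None else [])
      + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
  )


print(rate_control_args((480, 640)))          # ['-qp', '20']
print(rate_control_args((1080, 1920)))        # ['-qp', '28']
print(rate_control_args((480, 640), crf=23))  # ['-vb', '0', '-crf', '23']
```

Note that passing any extra ffmpeg arguments suppresses the default qp, so in that case the caller is assumed to control quality explicitly.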
def show_image(image: ArrayLike, *, title: str | None = None, **kwargs: Any) -> str | None:
def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)

Displays an image in the notebook and optionally saves it to a file.

See show_images.

>>> show_image(np.random.rand(100, 100))
>>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
>>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
>>> show_image(read_image('/tmp/image.png'))
>>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
>>> show_image(read_image(url))
Arguments:
  • image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
  • title: Optional text shown centered above the image.
  • **kwargs: See show_images.
Returns:

The HTML string if return_html is True; otherwise None.
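
As a rough illustration of the shape requirement on `image`, the following standalone check accepts 2D shapes and 3D shapes with 1, 3, or 4 channels. The helper `is_displayable_shape` is hypothetical and not part of mediapy; the library's own validation may differ.

```python
def is_displayable_shape(shape):
  """Returns True for (H, W) or (H, W, C) with C in {1, 3, 4}.

  Hypothetical helper for illustration only.
  """
  if len(shape) == 2:
    return True
  return len(shape) == 3 and shape[2] in (1, 3, 4)


print(is_displayable_shape((100, 100)))   # True
print(is_displayable_shape((80, 80, 3)))  # True
print(is_displayable_shape((80, 80, 2)))  # False
```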

def show_images(images: Iterable[ArrayLike] | Mapping[str, ArrayLike], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray', border: bool | str = False, ylabel: str = '', html_class: str = 'show_images', pixelated: bool | None = None, return_html: bool = False) -> str | None:
def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`.  Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape = image.shape[0], image.shape[1]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    warnings.warn('mediapy: subsampling images to reduce HTML size')
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None

Displays a row of images in the IPython/Jupyter notebook.

If a directory has been specified using set_show_save_dir, also saves each titled image to a file in that directory based on its title.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> show_images([image1, image2])
>>> show_images({'random image': image1, 'color ramp': image2}, height=128)
>>> show_images([image1, image2] * 5, columns=4, border=True)
Arguments:
  • images: Iterable of images, or dictionary of {title: image}. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels.
  • titles: Optional strings shown above the corresponding images.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each image whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of images per row.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable that maps single-channel values to 3-channel color.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if False, sets 'image-rendering: auto'; if None, uses pixelated rendering only on images for which width or height introduces magnification.
  • return_html: If True, return the raw HTML str instead of displaying it.
Returns:

The HTML string if return_html is True; otherwise None.
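
The `columns` and `pixelated` parameters can be illustrated without the library. In this sketch, `chunked` and `resolve_pixelated` are hypothetical stand-ins (mediapy's internal `_chunked` is not shown here and may behave differently); the magnification test follows the `magnified` expression in the `show_images` source.

```python
def chunked(items, n=None):
  """Splits items into rows of at most n elements (a single row if n is None)."""
  items = list(items)
  if n is None:
    return [items] if items else []
  return [items[i:i + n] for i in range(0, len(items), n)]


def resolve_pixelated(pixelated, image_hw, display_wh):
  """An explicit bool wins; None means 'pixelated only if magnified'."""
  if pixelated is not None:
    return pixelated
  (image_h, image_w), (w, h) = image_hw, display_wh
  return h > image_h or w > image_w


# Ten images with columns=4 are laid out as rows of 4, 4, and 2.
print([len(row) for row in chunked(range(10), 4)])  # [4, 4, 2]

# A 32x32 image shown at 64x64 is magnified, so it is rendered pixelated.
print(resolve_pixelated(None, (32, 32), (64, 64)))  # True
print(resolve_pixelated(None, (64, 64), (64, 64)))  # False
```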

def compare_images(images: Iterable[ArrayLike], *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> None:
def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images.  Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels.  There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)

Compare two images using an interactive slider.

Displays an HTML slider component for interactively swiping between two images. The slider requires that the web browser have Internet access. For more information, see https://github.com/sneas/img-comparison-slider.

>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
  • images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
  • vmin: For single-channel image, explicit min value for display.
  • vmax: For single-channel image, explicit max value for display.
  • cmap: For single-channel image, pyplot color map or callable that maps single-channel values to 3-channel color.
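
The embedding step in `compare_images` can be sketched with the standard library alone. The template below is a made-up stand-in for `_IMAGE_COMPARISON_HTML` (whose real contents are not shown here), and the byte strings are placeholders rather than real PNG data. Using `str.replace` instead of `str.format` is convenient when the template contains literal braces, as CSS does; whether that is the author's reason is an assumption.

```python
import base64

# Hypothetical template. The literal CSS braces would break str.format().
TEMPLATE = (
    '<style>img { width: 100%; }</style>'
    '<img src="data:image/png;base64,{b64_1}"/>'
    '<img src="data:image/png;base64,{b64_2}"/>'
)


def embed_two_images(png_1: bytes, png_2: bytes) -> str:
  """Base64-encodes both payloads and substitutes them into the template."""
  b64_1, b64_2 = (
      base64.b64encode(data).decode('utf-8') for data in (png_1, png_2)
  )
  return TEMPLATE.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)


html = embed_two_images(b'first', b'second')
print(html)
```

Because the image bytes travel inside the HTML as data URIs, the displayed cell needs no separate image files, only the slider script that the browser fetches.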
def show_video(images: Iterable[np.ndarray], *, title: str | None = None, **kwargs: Any) -> str | None:
def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)

Displays a video in the IPython notebook and optionally saves it to a file.

See show_videos.

>>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
>>> show_video(video, title='River video')
>>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
>>> show_video(read_video('/tmp/river.mp4'))
Arguments:
  • images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D arrays).
  • title: Optional text shown centered above the video.
  • **kwargs: See show_videos.
Returns:

The HTML string if return_html is True; otherwise None.
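
Because `show_video` forwards keyword arguments such as `codec='gif'` to `show_videos`, the GIF frame-delay caveat in the `show_videos` docstring applies here too: GIF stores frame delays in units of 10 ms, so only frame rates whose period is a whole number of such units play back exactly. The check below is a hypothetical helper for illustration, not part of mediapy.

```python
from fractions import Fraction


def gif_delay_is_exact(fps):
  """Returns True if the frame period 1/fps is a whole multiple of 10 ms."""
  # The period is 1/fps seconds, i.e. 100/fps units of 10 ms.
  period_in_10ms_units = Fraction(100) / Fraction(fps)
  return period_in_10ms_units.denominator == 1


for fps in (20, 25, 30, 50, 60):
  print(fps, gif_delay_is_exact(fps))
# 20 True
# 25 True
# 30 False
# 50 True
# 60 False
```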

def show_videos(videos: Iterable[Iterable[np.ndarray]] | Mapping[str, Iterable[np.ndarray]], titles: Iterable[str | None] | None = None, *, width: int | None = None, height: int | None = None, downsample: bool = True, columns: int | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, codec: str = 'h264', ylabel: str = '', html_class: str = 'show_videos', return_html: bool = False, **kwargs: Any) -> str | None:
def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings.  Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units).  Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`.  Each video
      must be an iterable of images.  If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 or gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None

Displays a row of videos in the IPython notebook.

Creates HTML with <video> tags containing embedded H264-encoded bytestrings. If codec is set to 'gif', we instead use <img> tags containing embedded GIF-encoded bytestrings. Note that the resulting GIF animations skip frames when the frame period (1/fps) is not a multiple of 10 ms, the unit in which GIF frame delays are stored. Encoding at fps = 20.0, 25.0, or 50.0 works fine.
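The GIF framerate constraint can be checked with a little arithmetic. The following is a minimal sketch, not part of mediapy: the helper name `gif_frame_delay_is_exact` is hypothetical, and it only tests whether the frame period 1/fps is a whole number of 10 ms units.

```python
from fractions import Fraction


def gif_frame_delay_is_exact(fps) -> bool:
    """Returns True if the frame period 1/fps is a whole number of 10 ms units.

    Hypothetical helper for illustration; it is not a mediapy function.
    """
    # One second is 100 units of 10 ms, so the period in those units is 100/fps.
    period_in_10ms_units = Fraction(100) / Fraction(fps)
    return period_in_10ms_units.denominator == 1


for fps in (20, 25, 30, 50, 60):
    status = 'exact' if gif_frame_delay_is_exact(fps) else 'inexact'
    print(f'fps={fps}: frame delay is {status}')
```

Framerates such as 30 or 60 have periods (33.3 ms, 16.7 ms) that cannot be stored exactly, which is why the text recommends 20, 25, or 50 for GIF output.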

If a directory has been specified using set_show_save_dir, also saves each titled video to a file in that directory based on its title.

Arguments:
  • videos: Iterable of videos, or dictionary of {title: video}. Each video must be an iterable of images. If a video object has a metadata (VideoMetadata) attribute, its fps field provides a default framerate.
  • titles: Optional strings shown above the corresponding videos.
  • width: Optional, overrides displayed width (in pixels).
  • height: Optional, overrides displayed height (in pixels).
  • downsample: If True, each video whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
  • columns: Optional, maximum number of videos per row.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
  • bps: Bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • codec: Compression algorithm; must be either 'h264' or 'gif'.
  • ylabel: Text (rotated by 90 degrees) shown on the left of each row.
  • html_class: CSS class name used in definition of HTML element.
  • return_html: If True, return the raw HTML str instead of displaying it.
  • **kwargs: Additional parameters (border, loop, autoplay) for html_from_compressed_video.
Returns:

HTML string if return_html is True; otherwise None.
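The way show_videos pairs videos with titles and then splits them into rows can be sketched in isolation. This is a simplified illustration based on the source listing: `normalize_titles` and `chunked` are hypothetical names (the library uses inline code and a private `_chunked` helper whose exact behavior is not shown here), and treating `columns=None` as "one row" is an assumption.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator, Mapping
from typing import Any


def normalize_titles(
    videos: Iterable[Any] | Mapping[str, Any],
    titles: Iterable[str | None] | None = None,
) -> tuple[list[Any], list[str | None]]:
    """Returns parallel lists of videos and titles (hypothetical helper)."""
    if isinstance(videos, Mapping):
        if titles is not None:
            raise ValueError('Cannot have both a video dictionary and titles.')
        return list(videos.values()), list(videos.keys())
    list_videos = list(videos)
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
        raise ValueError('Number of videos does not match number of titles.')
    return list_videos, list_titles


def chunked(items: list[Any], size: int | None) -> Iterator[list[Any]]:
    """Yields rows of at most `size` items; None is assumed to mean one row."""
    if size is None:
        size = max(len(items), 1)
    for start in range(0, len(items), size):
        yield items[start:start + size]


videos, titles = normalize_titles({'a': 'video_a', 'b': 'video_b', 'c': 'video_c'})
rows = list(chunked(titles, 2))  # Two titles per row, as with columns=2.
```

Passing a dictionary together with `titles` raises an error in both the sketch and the library, since the dictionary keys already serve as titles.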

def read_image( path_or_url: str | os.PathLike[str], *, apply_exif_transpose: bool = True, dtype: DTypeLike = None) -> np.ndarray:
def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)

Returns an image read from a file path or URL.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • path_or_url: Path or URL of the input file.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
def write_image( path: str | os.PathLike[str], image: ArrayLike, fmt: str = 'png', **kwargs: Any) -> None:
def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)

Writes an image to a file.

Encoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

File format is explicitly provided by fmt and not inferred by path.

Arguments:
  • path: Path of output file.
  • image: Array-like object. If its type is float, it is converted to np.uint8 using to_uint8 (thus clamping the input to the range [0.0, 1.0]). Otherwise it must be np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Additional parameters for PIL.Image.save().
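The effect of the float conversion on individual pixel values can be sketched as follows. This is an illustration only: `float_to_uint8` is a hypothetical scalar helper, not the library's `to_uint8`, and the round-to-nearest scaling is an assumption; the text above confirms only the clamping to [0.0, 1.0] and the `np.uint8` target.

```python
def float_to_uint8(value: float) -> int:
    """Maps a float pixel value to the range 0..255 (hypothetical helper).

    Assumption: values are clamped to [0.0, 1.0] and then scaled with
    round-to-nearest; the library's exact rounding may differ.
    """
    clamped = min(max(value, 0.0), 1.0)
    return int(clamped * 255.0 + 0.5)


# Out-of-range inputs saturate rather than wrap around.
samples = [-0.2, 0.0, 0.5, 1.0, 1.7]
converted = [float_to_uint8(v) for v in samples]
```

In practice this means that float images intended for write_image should already be normalized to [0.0, 1.0]; anything outside that range is saturated to black or white.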
def read_video( path_or_url: str | os.PathLike[str], **kwargs: Any) -> mediapy._VideoArray:
def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.  Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)

Returns an array containing all images read from a compressed video file.

>>> video = read_video('/tmp/river.mp4')
>>> print(f'The framerate is {video.metadata.fps} frames/s.')
>>> show_video(video)
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> show_video(read_video(url))
Arguments:
  • path_or_url: Input video file.
  • **kwargs: Additional parameters for VideoReader.
Returns:

A 4D numpy array with dimensions (frame, height, width, channel), or a 3D array if output_format is specified as 'gray'. The returned array has an attribute metadata containing VideoMetadata information. This enables show_video to retrieve the framerate in metadata.fps. Note that the metadata attribute is lost in most subsequent numpy operations.
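The caveat about losing the metadata attribute is general NumPy behavior for attributes attached to array subclasses. The sketch below uses a hypothetical `TaggedArray` class as a stand-in; it is not mediapy's private array type, and the dictionary used as metadata is likewise only illustrative.

```python
import numpy as np


class TaggedArray(np.ndarray):
    """Hypothetical ndarray subclass carrying an extra `metadata` attribute."""

    def __new__(cls, input_array, metadata=None):
        obj = np.asarray(input_array).view(cls)
        obj.metadata = metadata
        return obj


video = TaggedArray(
    np.zeros((4, 8, 8, 3), dtype=np.uint8), metadata={'fps': 30.0}
)

# Capture the framerate before processing, while the attribute still exists.
fps = video.metadata['fps']

# Converting to a plain ndarray and processing it drops the attribute.
darker = np.asarray(video) // 2
has_metadata = getattr(darker, 'metadata', None) is not None
```

With mediapy the same pattern would presumably be to read `video.metadata.fps` right after read_video and pass it explicitly (e.g., `fps=...`) when showing or writing the processed frames.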

def write_video( path: str | os.PathLike[str], images: Iterable[np.ndarray], **kwargs: Any) -> None:
def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape = first_image.shape[0], first_image.shape[1]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)

Writes images to a compressed video file.

>>> video = moving_circle((480, 640), num_images=60)
>>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
>>> show_video(read_video('/tmp/v.mp4'))
Arguments:
  • path: Output video file.
  • images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D arrays.
  • **kwargs: Additional parameters for VideoWriter.
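The source listing for write_video chooses the writer's dtype from the first frame. The rule can be sketched on its own as below; `writer_dtype_for` is a hypothetical name used only for this illustration, and the logic mirrors the three branches in the listing.

```python
import numpy as np


def writer_dtype_for(first_image: np.ndarray) -> np.dtype:
    """Returns the dtype used for encoding (hypothetical helper)."""
    dtype = first_image.dtype
    if dtype == bool:
        return np.dtype(np.uint8)  # Boolean frames are encoded as 8-bit.
    if np.issubdtype(dtype, np.floating):
        return np.dtype(np.uint16)  # Float frames are encoded as 16-bit.
    return dtype  # Integer frames keep their own dtype.


mask_dtype = writer_dtype_for(np.zeros((4, 4), dtype=bool))
float_dtype = writer_dtype_for(np.zeros((4, 4, 3), dtype=np.float32))
byte_dtype = writer_dtype_for(np.zeros((4, 4, 3), dtype=np.uint8))
```

One consequence is that float videos are passed to the writer with 16-bit precision, whereas an explicit `dtype=np.uint8` in `**kwargs` would conflict with the `dtype` argument that write_video already supplies.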
class VideoReader(_VideoIO):
class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from. The default is 0.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None

Context to read a compressed video as an iterable over its images.

>>> with VideoReader('/tmp/river.mp4') as reader:
...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
...   for image in reader:
...     print(image.shape)
>>> with VideoReader('/tmp/river.mp4') as reader:
...   video = np.array(tuple(reader))
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> with VideoReader(url) as reader:
...   show_video(reader)
Attributes:
  • path_or_url: Location of input video.
  • output_format: Format of output images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) with R, G, B values. If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Data type for output images. The default is np.uint8. Use of np.uint16 allows reading 10-bit or 12-bit data without precision loss.
  • metadata: Object storing the information retrieved from the video header. Its attributes are copied as attributes in this class.
  • num_images: Number of frames expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
  • stream_index: The stream index to read from. The default is 0.
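The reader pulls raw frames from the decoder in fixed-size chunks, so the size of one frame in bytes follows directly from `shape`, `output_format`, and `dtype`. A minimal sketch of that arithmetic, using only the standard library (the function name and the string-keyed tables are hypothetical conveniences; the channel counts match the source listing):

```python
import math

# Channels per pixel for each output format, as in the source listing.
CHANNELS = {'rgb': 3, 'yuv': 3, 'gray': 1}
# Bytes per channel for the two supported dtypes.
BYTES_PER_CHANNEL = {'uint8': 1, 'uint16': 2}


def frame_num_bytes(
    shape: tuple, output_format: str = 'rgb', dtype_name: str = 'uint8'
) -> int:
    """Returns the size in bytes of one raw decoded frame (hypothetical helper)."""
    height_times_width = math.prod(shape)
    return (
        height_times_width
        * CHANNELS[output_format]
        * BYTES_PER_CHANNEL[dtype_name]
    )


rgb_bytes = frame_num_bytes((480, 640))  # 480 * 640 * 3 * 1
gray16_bytes = frame_num_bytes((480, 640), 'gray', 'uint16')  # 480 * 640 * 1 * 2
```

This also gives a quick memory estimate: multiplying the per-frame size by `num_images` approximates the footprint of loading an entire video into one array.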
VideoReader( path_or_url: str | os.PathLike[str], *, stream_index: int = 0, output_format: str = 'rgb', dtype: DTypeLike = np.uint8)
path_or_url: str | os.PathLike[str]
output_format: str
dtype: ~_DType
metadata: VideoMetadata
num_images: int
shape: tuple[int, int]
fps: float
bps: int | None
stream_index: int
def read(self) -> np.ndarray | None:

Reads a video image frame (or None if at end of file).

Returns:

A numpy array in the format specified by output_format, i.e., a 3D array with 3 color channels, except for format 'gray' which is 2D.
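For the 'yuv' output format, the source listing converts the decoder's planar layout (all Y values, then all U, then all V) into a per-pixel layout. The following small sketch reproduces that reshaping on synthetic bytes; the 2x3 frame size and the byte values are made up for illustration.

```python
import numpy as np

height, width = 2, 3
# Synthetic planar data: 6 Y bytes, then 6 U bytes, then 6 V bytes.
data = bytes(range(3 * height * width))

planar = np.frombuffer(data, dtype=np.uint8).reshape(3, height, width)
# Move the plane axis to the end so that each pixel holds its (Y, U, V) triple.
image = np.moveaxis(planar, 0, 2)

first_pixel = image[0, 0].tolist()
last_pixel = image[height - 1, width - 1].tolist()
```

`np.moveaxis` returns a view, so the conversion does not copy the frame data; the resulting array has shape (height, width, 3) as stated for the 'yuv' format.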

def close(self) -> None:

Terminates video reader. (Called automatically at end of context.)

class VideoWriter(_VideoIO):
class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
      format.  The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames.  The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb').  If 'rgb', each image
      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = _run_ffmpeg(
          command,
          stdin=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_output_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization.  If
        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
1751          f' those of the initialized video {self.shape}.'
1752      )
1753    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
1754      image = np.moveaxis(image, 2, 0)
1755    data = image.tobytes()
1756    stdin = self._proc.stdin
1757    assert stdin is not None
1758    if stdin.write(data) != len(data):
1759      self._proc.wait()
1760      stderr = self._proc.stderr
1761      assert stderr is not None
1762      s = stderr.read().decode('utf-8')
1763      raise RuntimeError(f"Error writing '{self.path}': {s}")
1764
1765  def close(self) -> None:
1766    """Finishes writing the video.  (Called automatically at end of context.)"""
1767    if self._popen:
1768      assert self._proc, 'Error: closing an already closed context.'
1769      stdin = self._proc.stdin
1770      assert stdin is not None
1771      stdin.close()
1772      if self._proc.wait():
1773        stderr = self._proc.stderr
1774        assert stderr is not None
1775        s = stderr.read().decode('utf-8')
1776        raise RuntimeError(f"Error writing '{self.path}': {s}")
1777      self._popen.__exit__(None, None, None)
1778      self._popen = None
1779      self._proc = None
1780    if self._write_via_local_file:
1781      # pylint: disable-next=no-member
1782      self._write_via_local_file.__exit__(None, None, None)
1783      self._write_via_local_file = None
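
As a rough illustration of the command assembled in __enter__, the sketch below builds a comparable argument list for piping raw frames into ffmpeg. The helper name build_encode_args and its parameters are hypothetical and not part of mediapy, and the 'rgb24' input pixel format is an assumed example value. The flags are the ones that appear in the listing above, and no process is launched.

```python
from __future__ import annotations


def build_encode_args(
    shape: tuple[int, int],
    fps: float,
    codec: str,
    input_pix_fmt: str,
    encoded_format: str,
    output_path: str,
    extra_args: list[str] | None = None,
) -> list[str]:
  """Returns ffmpeg arguments that read raw frames from stdin (hypothetical)."""
  height, width = shape
  return [
      '-v', 'error',
      '-f', 'rawvideo', '-vcodec', 'rawvideo',
      '-pix_fmt', input_pix_fmt,
      '-s', f'{width}x{height}',  # The size is given as WIDTHxHEIGHT.
      '-r', f'{fps}',
      '-i', '-',  # Read the raw frames from standard input.
      '-an',  # The output has no audio stream.
      '-vcodec', codec,
      '-pix_fmt', encoded_format,
      *(extra_args or []),
      '-y', output_path,  # Overwrite the output file if it exists.
  ]


args = build_encode_args(
    (480, 640), 60.0, 'h264', 'rgb24', 'yuv420p', '/tmp/v.mp4'
)
print(' '.join(args))
```

Note that the shape is stored as (height, width) while the '-s' option takes the width first, which is why the two values are swapped when the string is formatted.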

Context to write a compressed video.

>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))

Bitrate control may be specified using at most one of: bps, qp, or crf. If none is specified and no ffmpeg_args are given, qp defaults to 20 for frames of at most 640 x 480 pixels and to 28 for larger frames. See https://slhck.info/video/2017/03/01/rate-control.html

If codec is 'gif', the args bps, qp, crf, and encoded_format are ignored.
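
The rate-control rule can be sketched as a small standalone function. This is a simplified illustration modeled on the constructor logic, not mediapy's own code; the name select_bitrate_args and the has_extra_args flag are hypothetical.

```python
from __future__ import annotations

import math


def select_bitrate_args(
    shape: tuple[int, int],
    *,
    bps: int | None = None,
    qp: int | None = None,
    crf: float | None = None,
    has_extra_args: bool = False,
) -> list[str]:
  """Returns ffmpeg rate-control arguments (hypothetical helper)."""
  if sum(x is not None for x in (bps, qp, crf)) > 1:
    raise ValueError('Specify at most one of bps, qp, or crf.')
  if bps is None and qp is None and crf is None and not has_extra_args:
    # Default quality: a lower qp (higher quality) for small frames.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  if bps is not None:
    return ['-vb', f'{bps}']
  if qp is not None:
    return ['-qp', f'{qp}']
  if crf is not None:
    return ['-vb', '0', '-crf', f'{crf}']
  return []


print(select_bitrate_args((480, 640)))
print(select_bitrate_args((1080, 1920), crf=23))
```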

Attributes:
  • path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
  • shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
  • codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
  • fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
  • bps: Requested average bits-per-second bitrate (default None).
  • qp: Quantization parameter for video compression quality (default None).
  • crf: Constant rate factor for video compression quality (default None).
  • ffmpeg_args: Additional arguments for ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
  • input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
  • dtype: Expected data type for input images (any float input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
  • encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
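
The interplay between shape and encoded_format can be sketched as follows. The function name resolve_encoded_format is hypothetical; the rule itself (default to 'yuv420p' for even dimensions, otherwise 'yuv444p', and reject subsampled formats for odd dimensions) follows the attribute descriptions above.

```python
from __future__ import annotations


def resolve_encoded_format(
    shape: tuple[int, int], encoded_format: str | None = None
) -> str:
  """Returns the pixel format to encode with (hypothetical helper)."""
  all_even = all(dim % 2 == 0 for dim in shape)
  if encoded_format is None:
    # Chroma subsampling needs even dimensions, so fall back to full-res chroma.
    return 'yuv420p' if all_even else 'yuv444p'
  if not all_even and encoded_format.startswith(('yuv42', 'yuvj42')):
    raise ValueError(
        f'With encoded_format {encoded_format}, video dimensions must be'
        f' even, but shape is {shape}.'
    )
  return encoded_format


print(resolve_encoded_format((480, 640)))
print(resolve_encoded_format((481, 640)))
```
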
VideoWriter(path: str | os.PathLike[str], shape: tuple[int, int], *, codec: str = 'h264', metadata: VideoMetadata | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, crf: float | None = None, ffmpeg_args: str | Sequence[str] = '', input_format: str = 'rgb', dtype: DTypeLike = np.uint8, encoded_format: str | None = None)
  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None
def add_image(self, image: np.ndarray) -> None:

Writes a video frame.

Arguments:
  • image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
Raises:
  • RuntimeError: If there is an error writing to the output file.
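
The shape rules for each input_format can be illustrated with NumPy alone. The sketch below is a simplified stand-in for the checks described above, not mediapy's code; prepare_frame is a hypothetical name, and it returns the raw bytes instead of writing them to ffmpeg.

```python
from __future__ import annotations

import numpy as np


def prepare_frame(
    image: np.ndarray, shape: tuple[int, int], input_format: str = 'rgb'
) -> bytes:
  """Validates a frame and returns its raw bytes (hypothetical helper)."""
  if input_format == 'gray':
    if image.ndim != 2:
      raise ValueError(f'Image dimensions {image.shape} are not 2D.')
  else:
    if image.ndim == 2 and input_format == 'rgb':
      # A 2D image is treated as grayscale and replicated into R, G, B.
      image = np.dstack((image, image, image))
    if not (image.ndim == 3 and image.shape[2] == 3):
      raise ValueError(f'Image dimensions {image.shape} are invalid.')
  if image.shape[:2] != shape:
    raise ValueError(f'Image dimensions {image.shape[:2]} != {shape}.')
  if input_format == 'yuv':
    # Convert from per-pixel (interleaved) YUV to planar YUV.
    image = np.moveaxis(image, 2, 0)
  return image.tobytes()


gray = np.arange(6, dtype=np.uint8).reshape(2, 3)
rgb_bytes = prepare_frame(gray, (2, 3), 'rgb')

yuv = np.zeros((2, 2, 3), dtype=np.uint8)
yuv[..., 0], yuv[..., 1], yuv[..., 2] = 1, 2, 3
planar_bytes = prepare_frame(yuv, (2, 2), 'yuv')
print(len(rgb_bytes), list(planar_bytes))
```
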
def close(self) -> None:

Finishes writing the video. (Called automatically at end of context.)

class VideoMetadata(typing.NamedTuple):
class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.  We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None

Represents the data stored in a video container header.

Attributes:
  • num_images: Number of frames expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. The value is -1 if the number of frames is not found in the header.
  • shape: The dimensions (height, width) of each video frame.
  • fps: The framerate in frames per second.
  • bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
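
One use of these fields is as fallback values when a writer is configured without an explicit framerate. The sketch below mimics that fallback with a stand-in tuple; HeaderInfo and effective_fps are hypothetical names and are not mediapy objects.

```python
from __future__ import annotations

from typing import NamedTuple


class HeaderInfo(NamedTuple):
  """Stand-in with the same four fields as VideoMetadata (hypothetical)."""

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None


def effective_fps(
    fps: float | None, metadata: HeaderInfo | None, codec: str = 'h264'
) -> float:
  """Returns the explicit fps, else the metadata fps, else a codec default."""
  if fps is None and metadata:
    fps = metadata.fps
  if fps is None:
    fps = 25.0 if codec == 'gif' else 60.0
  return fps


info = HeaderInfo(num_images=-1, shape=(480, 640), fps=30.0, bps=None)
print(effective_fps(None, info), effective_fps(None, None, codec='gif'))
```
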
def compress_image(image: ArrayLike, *, fmt: str = 'png', **kwargs: Any) -> bytes:
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()

Returns a buffer containing a compressed image.

Arguments:
  • image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
  • fmt: Desired compression encoding, e.g. 'png'.
  • **kwargs: Options for PIL.save(), e.g. optimize=True for greater compression.
def decompress_image(data: bytes, dtype: DTypeLike = None, apply_exif_transpose: bool = True) -> np.ndarray:
def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)

Returns an image from a compressed data buffer.

Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.

Arguments:
  • data: Buffer containing compressed image.
  • dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
  • apply_exif_transpose: If True, rotate image according to EXIF orientation.
def compress_video(images: Iterable[np.ndarray], *, codec: str = 'h264', **kwargs: Any) -> bytes:
def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
  and mp4 otherwise.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()

Returns a buffer containing a compressed video.

The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec, and mp4 otherwise.

>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))

>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

Arguments:
  • images: Iterable over video frames.
  • codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
  • **kwargs: Additional parameters for VideoWriter.
Returns:

A bytes buffer containing the compressed video.
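
The container rule stated above (gif for 'gif', webm for 'vp9', mp4 otherwise) amounts to a small lookup. The sketch below uses a hypothetical function name, container_suffix, and is not the library's internal helper.

```python
def container_suffix(codec: str) -> str:
  """Returns the file suffix of the container used for a codec (hypothetical)."""
  # Codecs without a special case are stored in an MP4 container.
  return {'gif': '.gif', 'vp9': '.webm'}.get(codec, '.mp4')


for codec in ('h264', 'hevc', 'vp9', 'gif'):
  print(codec, container_suffix(codec))
```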

def decompress_video(data: bytes, **kwargs: Any) -> np.ndarray:
def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)

Returns video images from an MP4-compressed data buffer.

def html_from_compressed_image(data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, pixelated: bool = True, fmt: str = 'png') -> str:
def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with an image tag containing encoded data.

Arguments:
  • data: Compressed image bytes.
  • width: Width of HTML image in pixels.
  • height: Height of HTML image in pixels.
  • title: Optional text shown centered above image.
  • border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
  • pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
  • fmt: Compression encoding.
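
The core idea, embedding the compressed bytes in the page as a base64 data URI, can be shown with the standard library alone. The sketch below is a reduced variant without the title and border options; img_tag is a hypothetical name, and the four payload bytes are placeholder data rather than a complete PNG file.

```python
import base64


def img_tag(
    data: bytes, width: int, height: int, *, fmt: str = 'png',
    pixelated: bool = True,
) -> str:
  """Returns an <img> tag whose source is a base64 data URI (hypothetical)."""
  b64 = base64.b64encode(data).decode('utf-8')
  rendering = 'pixelated' if pixelated else 'auto'
  return (
      f'<img width="{width}" height="{height}"'
      f' style="image-rendering:{rendering}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


html = img_tag(b'\x89PNG', 64, 64)  # Placeholder bytes, not a valid image.
print(html)
```

Because the image travels inside the HTML string itself, the output cell remains viewable without access to the original file, at the cost of roughly a one-third size increase from base64 encoding.
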
def html_from_compressed_video(data: bytes, width: int, height: int, *, title: str | None = None, border: bool | str = False, loop: bool = True, autoplay: bool = True) -> str:
def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s

Returns an HTML string with a video tag containing H264-encoded data.

Arguments:
  • data: MP4-compressed video bytes.
  • width: Width of HTML video in pixels.
  • height: Height of HTML video in pixels.
  • title: Optional text shown centered above the video.
  • border: If bool, whether to place a black boundary around the video, or if str, the boundary CSS style.
  • loop: If True, the playback repeats forever.
  • autoplay: If True, video playback starts without having to click.
def resize_image(image: ArrayLike, shape: tuple[int, int]) -> np.ndarray:
def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )

Resizes image to specified spatial dimensions using a Lanczos filter.

Arguments:
  • image: Array-like 2D or 3D object, where dtype is uint or floating-point.
  • shape: 2D spatial dimensions (height, width) of output image.
Returns:

A resampled image whose spatial dimensions match shape.
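
Which of the three code paths a given image takes depends only on its dtype and dimensionality. The sketch below reproduces just that decision, without performing any resampling; resize_strategy and its return labels are hypothetical names used for illustration.

```python
import numpy as np


def resize_strategy(image: np.ndarray) -> str:
  """Names the path a resize would take for this image (hypothetical)."""
  single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if single_channel or multichannel:
    return 'direct'  # The image can be resized in a single call.
  if image.ndim == 2:
    return 'via_float'  # Convert to float, resize, then convert back.
  return 'per_channel'  # Resize each channel separately and restack.


print(resize_strategy(np.zeros((4, 4, 3), dtype=np.uint8)))
print(resize_strategy(np.zeros((4, 4), dtype=np.uint16)))
print(resize_strategy(np.zeros((4, 4, 3), dtype=np.float32)))
```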

def resize_video(video: Iterable[np.ndarray], shape: tuple[int, int]) -> np.ndarray:
def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])

Resizes video to specified spatial dimensions using a Lanczos filter.

Arguments:
  • video: Iterable of images.
  • shape: 2D spatial dimensions (height, width) of output video.
Returns:

A resampled video whose spatial dimensions match shape.

def to_rgb(array: ArrayLike, *, vmin: float | None = None, vmax: float | None = None, cmap: str | Callable[[ArrayLike], np.ndarray] = 'gray') -> np.ndarray:
def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.17.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = cast(_NDArray, rgb_from_scalar(a))
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
    a = a[..., :3]
  return a

Maps scalar values to RGB using value bounds and a color map.

Arguments:
  • array: Scalar values, with arbitrary shape.
  • vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
  • vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
  • cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:

A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
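
The affine remapping with finite-value bounds can be tried out with NumPy alone. In the sketch below, normalize_finite and gray_rgb are hypothetical helpers; gray_rgb is a minimal stand-in for a grayscale color map and does not use matplotlib.

```python
import numpy as np


def normalize_finite(array, vmin=None, vmax=None) -> np.ndarray:
  """Affinely maps [vmin, vmax] to [0.0, 1.0], ignoring non-finite bounds."""
  a = np.asarray(array, dtype=float)
  finite = np.isfinite(a)
  if vmin is None:
    vmin = np.amin(np.where(finite, a, np.inf))
  if vmax is None:
    vmax = np.amax(np.where(finite, a, -np.inf))
  # The epsilon avoids a division by zero when vmin == vmax.
  return (a - vmin) / (vmax - vmin + np.finfo(float).eps)


def gray_rgb(normalized: np.ndarray) -> np.ndarray:
  """Replicates clamped values into three channels (stand-in color map)."""
  v = np.clip(normalized, 0.0, 1.0)
  return np.stack([v, v, v], axis=-1)


values = np.array([2.0, 4.0, np.inf, 6.0])
normalized = normalize_finite(values)
rgb = gray_rgb(normalized)
print(normalized, rgb.shape)
```

The infinite entry does not influence the bounds, so the finite values 2.0 and 6.0 still span the full output range.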

def to_type(array: ArrayLike, dtype: DTypeLike) -> np.ndarray:
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result

Returns media array converted to specified type.

A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.

Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.

Arguments:
  • a: Input array-like object (floating-point, unsigned int, or bool).
  • dtype: Desired output type (floating-point or unsigned int).
Returns:

Array a if it is already of the specified dtype, else a converted array.
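
As a rough illustration of the mapping described above, the following sketch reimplements only two cases with plain NumPy: float to uint8 (clamp, scale, round half up) and uint8 to float32. The helper names float01_to_uint8 and uint8_to_float32 are hypothetical and are not part of mediapy; the real to_type covers more dtypes and the bool case.

```python
import numpy as np


def float01_to_uint8(a):
    """Hypothetical helper: mirrors the float -> uint8 branch of the listing."""
    a = np.clip(np.asarray(a, dtype=np.float32), 0.0, 1.0)  # Clamp to [0, 1].
    scale = np.float32(255.0)  # uint8 MAX divided by the float maximum 1.0.
    return (a * scale + 0.5).astype(np.uint8)  # +0.5 then truncate = rounding.


def uint8_to_float32(a):
    """Hypothetical helper: mirrors the uint -> float branch of the listing."""
    a = np.asarray(a, dtype=np.uint8)
    return a.astype(np.float32) / np.float32(255)  # uint(MAX) maps to 1.0.


values = np.array([-0.5, 0.0, 0.5, 1.0, 2.0])
as_uint8 = float01_to_uint8(values)       # Out-of-range inputs are clamped.
back_to_float = uint8_to_float32([0, 255])
print(as_uint8, back_to_float)
```

Values below 0.0 or above 1.0 end up at 0 and 255 respectively, which is the clamping the description refers to.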

def to_float01(a: ArrayLike, dtype: DTypeLike = np.float32) -> np.ndarray:
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)

If array has unsigned integers, rescales them to the range [0.0, 1.0].

Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.

Arguments:
  • a: Input array.
  • dtype: Desired floating-point type if rescaling occurs.
Returns:

A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.
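
One consequence worth seeing concretely is that dtype only takes effect when rescaling occurs: a float64 input stays float64 even though the default dtype is float32. The sketch below is a simplified stand-in named to_float01_sketch (a hypothetical name); it assumes unsigned-integer or floating-point input and skips the bool handling and validation that to_type performs.

```python
import numpy as np


def to_float01_sketch(a, dtype=np.float32):
    """Hypothetical simplified stand-in; assumes uint or float input only."""
    a = np.asarray(a)
    dtype = np.dtype(dtype)
    if not np.issubdtype(dtype, np.floating):
        raise ValueError(f'Type {dtype} is not floating-point.')
    if np.issubdtype(a.dtype, np.floating):
        return a  # Floating-point input is returned as-is, dtype untouched.
    # Unsigned-integer input: divide by the type's maximum value.
    return a.astype(dtype) / dtype.type(np.iinfo(a.dtype).max)


f64 = np.array([0.25, 0.75])                 # float64 input.
unchanged = to_float01_sketch(f64)           # Same object, still float64.
u16 = np.array([0, 65535], dtype=np.uint16)
rescaled = to_float01_sketch(u16)            # float32 values 0.0 and 1.0.
print(unchanged.dtype, rescaled.dtype, rescaled)
```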

def to_uint8(a: ArrayLike) -> np.ndarray:
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)

Returns array converted to uint8 values; see to_type.

def set_output_height(num_pixels: int) -> None:
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Overrides the height of the current output cell, if using Colab.

def set_max_output_height(num_pixels: int) -> None:
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass

Sets the maximum height of the current output cell, if using Colab.
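
To make the two JavaScript snippets easier to inspect outside Colab, this small sketch builds the same strings without evaluating them. The function names height_js and max_height_js are hypothetical; only the string formatting is taken from the listings above, including the doubled braces that produce literal braces in an f-string.

```python
def height_js(num_pixels: int) -> str:
    """Hypothetical helper: string built by set_output_height."""
    return f'google.colab.output.setIframeHeight("{num_pixels}px")'


def max_height_js(num_pixels: int) -> str:
    """Hypothetical helper: string built by set_max_output_height."""
    return (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'  # '{{' and '}}' are literal.
    )


fixed = height_js(400)
capped = max_height_js(300)
print(fixed)
print(capped)
```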

def color_ramp(shape: tuple[int, int] = (64, 64), *, dtype: DTypeLike = np.float32) -> np.ndarray:
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)

Returns an image of a red-green color gradient.

This is useful for quick experimentation and testing. See also moving_circle to generate a sample video.

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated image.
  • dtype: Type (uint or floating) of resulting pixel values.
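
The gradient construction can be checked on a tiny image. This sketch repeats the two NumPy lines from the listing for a 2x4 shape and omits the shape validation and final to_type conversion; the function name ramp_sketch is hypothetical. Red follows the row (y) coordinate, green follows the column (x) coordinate, and blue is zero.

```python
import numpy as np


def ramp_sketch(shape):
    """Hypothetical helper: float64 red-green ramp, without dtype conversion."""
    # Pixel-center coordinates normalized to (0, 1) along each axis.
    yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
    # Append a zero blue channel after the (y, x) = (red, green) channels.
    return np.insert(yx, 2, 0.0, axis=-1)


image = ramp_sketch((2, 4))
top_left = image[0, 0]       # Small red, small green.
bottom_right = image[1, 3]   # Large red, large green.
print(image.shape, top_left, bottom_right)
```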

def moving_circle(shape: tuple[int, int] = (256, 256), num_images: int = 10, *, dtype: DTypeLike = np.float32) -> np.ndarray:
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])

Returns a video of a circle moving in front of a color ramp.

This is useful for quick experimentation and testing. See also color_ramp to generate a sample image.

>>> show_video(moving_circle((480, 640), 60), fps=60)

Arguments:
  • shape: 2D spatial dimensions (height, width) of generated video.
  • num_images: Number of video frames.
  • dtype: Type (uint or floating) of resulting pixel values.
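
The motion of the circle comes entirely from the per-frame center formula. The sketch below isolates the mask computation from the listing (no color ramp, no dtype handling) under the hypothetical name circle_mask, and stacks the masks for a 100x100, 10-frame video to show the circle sweeping from left to right at a fixed height.

```python
import numpy as np


def circle_mask(shape, image_index, num_images):
    """Hypothetical helper: boolean mask of the circle in one frame."""
    yx = np.moveaxis(np.indices(shape), 0, -1)  # (height, width, 2) coords.
    # Row is fixed at 60% of the height; column advances with the frame index.
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    return np.sum((yx - center) ** 2, axis=-1) < radius_squared


masks = np.array([circle_mask((100, 100), i, 10) for i in range(10)])
first_center_hit = bool(masks[0, 60, 5])    # Frame 0 is centered near x=5.
last_center_hit = bool(masks[9, 60, 95])    # Frame 9 is centered near x=95.
print(masks.shape, first_center_hit, last_center_hit)
```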

class set_show_save_dir:
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir

Save all titled output from show_*() calls into files.

If the specified directory is not None, all titled images and videos displayed by show_image, show_images, show_video, and show_videos are also saved as files within the directory.

It can be used either to set the state or as a context manager:

>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)

>>> with set_show_save_dir('/tmp'):
...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.

set_show_save_dir(directory: str | os.PathLike[str] | None)
  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory
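
The dual use as a plain call and as a context manager works because the state change happens in __init__, while __exit__ only restores the previous value. The following self-contained sketch shows the same pattern with hypothetical stand-ins (_Config, _config, and save_dir_sketch are illustrative names, not mediapy's objects).

```python
from typing import Any, Optional


class _Config:
    """Hypothetical stand-in for a module-level configuration object."""

    show_save_dir: Optional[str] = None


_config = _Config()


class save_dir_sketch:
    """Hypothetical setter usable as a plain call or a context manager."""

    def __init__(self, directory: Optional[str]):
        self._old = _config.show_save_dir   # Remember the previous setting.
        _config.show_save_dir = directory   # Apply the new one immediately.

    def __enter__(self) -> None:
        pass

    def __exit__(self, *_: Any) -> None:
        _config.show_save_dir = self._old   # Restore on leaving the block.


save_dir_sketch('/tmp/a')                   # Plain call: setting persists.
persisted = _config.show_save_dir
with save_dir_sketch('/tmp/b'):
    inside = _config.show_save_dir          # Temporarily overridden.
restored = _config.show_save_dir            # Previous value is back.
save_dir_sketch(None)                       # Explicitly clear the setting.
cleared = _config.show_save_dir
print(persisted, inside, restored, cleared)
```

A plain call never restores anything on its own, which is why the docstring example ends with an explicit call passing None.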

def set_ffmpeg(name_or_path: str | os.PathLike[str]) -> None:
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path

Specifies the name or path for the ffmpeg external program.

The ffmpeg program is required for compressing and decompressing video. (It is used in read_video, write_video, show_video, show_videos, etc.)

Arguments:
  • name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.

def video_is_available() -> bool:
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None

Returns True if the program ffmpeg is found.

See also set_ffmpeg.
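
The body of _search_for_ffmpeg_path is not shown in this section, so the following is only a comparable check and not mediapy's implementation. It assumes that resolving a name against PATH (or accepting an explicit file path) is what matters, and uses the standard library's shutil.which for that; program_is_available is a hypothetical name.

```python
import shutil


def program_is_available(name_or_path: str = 'ffmpeg') -> bool:
    """Hypothetical check: True if the program resolves to an executable."""
    # shutil.which searches os.environ['PATH'] for bare names and also
    # accepts a path to an executable file; it returns None if not found.
    return shutil.which(name_or_path) is not None


missing = program_is_available('surely-not-an-installed-program-12345')
ffmpeg_found = program_is_available()   # Depends on the local machine.
print(missing, ffmpeg_found)
```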