mediapy
mediapy: Read/write/show images and videos in an IPython/Jupyter notebook.
[GitHub source] [API docs] [PyPI package] [Colab example]
See the example notebook, or better yet, open it in Colab.
Image examples
Display an image (2D or 3D numpy array):
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
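To make the `np.kron` call above less opaque, here is a dependency-free sketch of the pattern it builds: a 32x32 grid of alternating 0/1 cells, each cell expanded to a 4x4 block, giving a 128x128 image. The helper name `make_checkerboard` is hypothetical and is not part of mediapy.

```python
def make_checkerboard(cells: int = 32, cell_size: int = 4) -> list[list[int]]:
  """Returns a square grid of 0/1 values arranged in a checker pattern."""
  size = cells * cell_size
  # The pixel at (y, x) belongs to cell (y // cell_size, x // cell_size);
  # cells alternate between 0 and 1 along both axes.
  return [
      [((y // cell_size) + (x // cell_size)) % 2 for x in range(size)]
      for y in range(size)
  ]


board = make_checkerboard()
print(len(board), len(board[0]))  # 128 128
```

A nested list like this can be passed to `np.asarray` to obtain an array of the same shape as `checkerboard`.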
Read and display an image (either local or from the Web):
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
Read and display an image from a local file:
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
Show titled images side-by-side:
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
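Why pass `vmin` and `vmax` explicitly? The `to_rgb` docstring in the source says that missing bounds are taken from the minimum and maximum finite values of the array. Assuming that this happens per image, the darkened checkerboard would be stretched back to full contrast. The scalar sketch below mirrors the affine remapping in `to_rgb`; the function name `normalize` is hypothetical.

```python
import sys


def normalize(value: float, vmin: float, vmax: float) -> float:
  """Affinely maps `value` from [vmin, vmax] to approximately [0.0, 1.0]."""
  # A tiny epsilon avoids division by zero when vmax == vmin.
  eps = sys.float_info.epsilon
  return (value - vmin) / (vmax - vmin + eps)


# With explicit bounds, a darkened pixel of 0.7 stays at about 0.7.
with_bounds = normalize(0.7, vmin=0.0, vmax=1.0)
# With bounds inferred from a darkened image (min 0.0, max 0.7), the same
# pixel maps to about 1.0, so the darkening would no longer be visible.
inferred_bounds = normalize(0.7, vmin=0.0, vmax=0.7)
print(round(with_bounds, 3), round(inferred_bounds, 3))
```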
Compare two images using an interactive slider:
compare_images([checkerboard, np.random.rand(128, 128, 3)])
Video examples
Display a video (an iterable of images, e.g., a 3D or 4D array):
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
Show the video frames side-by-side:
show_images(video, columns=6, border=True, height=64)
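The `columns=6` argument presumably splits the frames into rows of at most six images. The source contains a private helper, `_chunked`, that groups an iterable into such tuples. The sketch below reimplements the same idea with the standard library only; the name `chunked` and the string stand-ins for frames are illustrative assumptions.

```python
from __future__ import annotations

from collections.abc import Iterable, Iterator
import itertools


def chunked(items: Iterable[str], n: int | None = None) -> Iterator[tuple[str, ...]]:
  """Yields tuples of at most `n` consecutive items (all items if n is None)."""
  iterator = iter(items)
  # islice() consumes up to n items per call; an empty tuple ends the loop.
  while chunk := tuple(itertools.islice(iterator, n)):
    yield chunk


frames = [f'frame{i}' for i in range(10)]
rows = list(chunked(frames, 6))
print([len(row) for row in rows])  # [6, 4]
```

With ten frames and six columns, the layout is one full row followed by a row of four.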
Show the frames with their indices:
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
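When only `width` (or only `height`) is given, the source computes the missing dimension from the aspect ratio of the image in the private helper `_get_width_height`. The following sketch restates that logic without dependencies; `get_width_height` is a hypothetical stand-in that follows the same rounding rule.

```python
from __future__ import annotations


def get_width_height(
    width: int | None, height: int | None, shape: tuple[int, int]
) -> tuple[int, int]:
  """Returns the display (width, height) for an image of shape (rows, cols)."""
  rows, cols = shape
  if width and height:
    return width, height
  if width:
    # Preserve the aspect ratio, rounding to the nearest pixel.
    return width, int(width * (rows / cols) + 0.5)
  if height:
    return int(height * (cols / rows) + 0.5), height
  return cols, rows


print(get_width_height(32, None, (100, 100)))    # (32, 32)
print(get_width_height(None, 64, (480, 640)))    # (85, 64)
print(get_width_height(None, None, (480, 640)))  # (640, 480)
```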
Read and display a video (either local or from the Web):
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
Create and display a looping two-frame GIF video:
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
Darken a video frame-by-frame:
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
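The darkening step relies on the value convention described in the `to_type` docstring: uint(0) maps to 0.0, uint(MAX) maps to 1.0, and floats are clamped to [0.0, 1.0] and rounded when converted back. Below is a scalar sketch of that round trip for `uint8`, assuming the writer converts float frames in this way. The helper names are hypothetical, and the real implementation works on arrays (using `float32` scaling for small integer types).

```python
def uint8_to_float01(value: int) -> float:
  """Maps 0..255 to 0.0..1.0."""
  return value / 255


def float01_to_uint8(value: float) -> int:
  """Clamps to [0.0, 1.0], scales to 0..255, and rounds to nearest."""
  clamped = min(max(value, 0.0), 1.0)
  return int(clamped * 255 + 0.5)


# Darken a single pixel value by half, as the lambda above does per frame.
darkened = float01_to_uint8(uint8_to_float01(200) * 0.5)
print(darkened)  # 100
```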
# Copyright 2024 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy)
[**[API docs]**](https://google.github.io/mediapy/)
[**[PyPI package]**](https://pypi.org/project/mediapy/)
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.2'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
from typing import Any
import urllib.request

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

if typing.TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = np.ndarray[Any, Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = typing.TypeVar('_ArrayLike')
  _DTypeLike = typing.TypeVar('_DTypeLike')
  _NDArray = typing.TypeVar('_NDArray')
  _DType = typing.TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 20_000_000
_T = typing.TypeVar('_T')
_Path = typing.Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int | None, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired from https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)  # type: ignore[no-untyped-call]
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    with urllib.request.urlopen(path_or_url) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred from `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.7.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # type: ignore # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = rgb_from_scalar(a)
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3]) == 1.0):
    a = a[..., :3]
  return a


def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()


def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)


def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s


def _get_width_height(
    width: int | None, height: int | None, shape: tuple[int, int]
) -> tuple[int, int]:
  """Returns (width, height) given optional parameters and image shape."""
  assert len(shape) == 2, shape
  if width and height:
    return width, height
  if width and not height:
    return width, int(width * (shape[0] / shape[1]) + 0.5)
  if height and not width:
    return int(height * (shape[1] / shape[0]) + 0.5), height
  return shape[::-1]


def _ensure_mapped_to_rgb(
    image: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Ensure image is mapped to RGB."""
  image = _as_valid_media_array(image)
  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
    raise ValueError(
        f'Image with shape {image.shape} is neither a 2D array'
        ' nor a 3D array with 1, 3, or 4 channels.'
    )
  if image.ndim == 3 and image.shape[2] == 1:
    image = image[:, :, 0]
  if image.ndim == 2:
    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
  return image


def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)


def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`. Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape: tuple[int, int] = image.shape[:2]  # type: ignore[assignment]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None


def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images. Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels. There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)


# ** Video I/O.


def _filename_suffix_from_codec(codec: str) -> str:
  return '.gif' if codec == 'gif' else '.mp4'


def _get_ffmpeg_path() -> str:
  path = _search_for_ffmpeg_path()
  if not path:
    raise RuntimeError(
        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
        " perhaps install ffmpeg using 'apt install ffmpeg'."
    )
  return path


def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None


class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact. We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None


def _get_video_metadata(path: _Path) -> VideoMetadata:
  """Returns attributes of video stored in the specified local file."""
  if not pathlib.Path(path).is_file():
    raise RuntimeError(f"Video file '{path}' is not found.")
  command = [
      _get_ffmpeg_path(),
      '-nostdin',
      '-i',
      str(path),
      '-acodec',
      'copy',
      '-vcodec',
      'copy',
      '-f',
      'null',
      '-',
  ]
  with subprocess.Popen(
      command, stderr=subprocess.PIPE, encoding='utf-8'
  ) as proc:
    _, err = proc.communicate()
  bps = fps = num_images = width = height = rotation = None
  for line in err.split('\n'):
    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
      # The pattern admits a decimal point, so parse as float before scaling.
      bps = int(float(match.group(1)) * 1000)
    if matches := re.findall(r'frame= *(\d+) ', line):
      num_images = int(matches[-1])
    if 'Stream #0:' in line and ': Video:' in line:
      if not (match := re.search(r', (\d+)x(\d+)', line)):
        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
      width, height = int(match.group(1)), int(match.group(2))
      if match := re.search(r', ([\d.]+) fps', line):
        fps = float(match.group(1))
      elif str(path).endswith('.gif'):
        # Some GIF files lack a framerate attribute; use a reasonable default.
        fps = 10
      else:
        raise RuntimeError(f'Unable to parse video framerate in line {line}')
    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
      rotation = int(match.group(1))
    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
      rotation = int(match.group(1))
  if not num_images:
    num_images = -1
  if not width:
    raise RuntimeError(f'Unable to parse video header: {err}')
  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
  if rotation in (90, 270, -90, -270):
    width, height = height, width
  assert height is not None and width is not None
  shape = height, width
  assert fps is not None
  return VideoMetadata(num_images, shape, fps, bps)


class _VideoIO:
  """Base class for `VideoReader` and `VideoWriter`."""

  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
    """Returns ffmpeg pix_fmt given data type and image format."""
    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
    return {
        np.uint8: {
            'rgb': 'rgb24',
            'yuv': 'yuv444p',
            'gray': 'gray',
        },
        np.uint16: {
            'rgb': 'rgb48' + native_endian_suffix,
            'yuv': 'yuv444p16' + native_endian_suffix,
            'gray': 'gray16' + native_endian_suffix,
        },
    }[dtype.type][image_format]


class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb'). If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values. If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values. If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images. The default is `np.uint8`. Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    ffmpeg_path = _get_ffmpeg_path()
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          ffmpeg_path,
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = subprocess.Popen(
          command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader. (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None


class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video. Its suffix (e.g. '.mp4') determines the video container
      format. The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames. The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb'). If 'rgb', each image
      has shape=(height, width, 3) or (height, width). If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values. If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`). The default is `np.uint8`. Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc. The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp <= 0):
      raise ValueError(
          f'Quantization parameter {qp} is not a positive integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    ffmpeg_path = _get_ffmpeg_path()
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              ffmpeg_path,
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = subprocess.Popen(
          command, stdin=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization. If
        `input_format` is 'gray', the image must be 2D. For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels. For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode()
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video. (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode()
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None


class _VideoArray(npt.NDArray[Any]):
  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""

  metadata: VideoMetadata | None

  def __new__(
      cls: Type['_VideoArray'],
      input_array: _NDArray,
      metadata: VideoMetadata | None = None,
  ) -> '_VideoArray':
    obj: _VideoArray = np.asarray(input_array).view(cls)
    obj.metadata = metadata
    return obj

  def __array_finalize__(self, obj: Any) -> None:
    if obj is None:
      return
    self.metadata = getattr(obj, 'metadata', None)


def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'. The returned array has an
    attribute `metadata` containing `VideoMetadata` information. This enables
    `show_video` to retrieve the framerate in `metadata.fps`.
    Note that the metadata attribute is lost in most subsequent `numpy`
    operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)


def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape: tuple[int, int] = first_image.shape[:2]  # type: ignore[assignment]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)


def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'mp4' except when `codec` is 'gif'.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
1777 """ 1778 suffix = _filename_suffix_from_codec(codec) 1779 with tempfile.TemporaryDirectory() as directory_name: 1780 tmp_path = pathlib.Path(directory_name) / f'file{suffix}' 1781 write_video(tmp_path, images, codec=codec, **kwargs) 1782 return tmp_path.read_bytes() 1783 1784 1785def decompress_video(data: bytes, **kwargs: Any) -> _NDArray: 1786 """Returns video images from an MP4-compressed data buffer.""" 1787 with tempfile.TemporaryDirectory() as directory_name: 1788 tmp_path = pathlib.Path(directory_name) / 'file.mp4' 1789 tmp_path.write_bytes(data) 1790 return read_video(tmp_path, **kwargs) 1791 1792 1793def html_from_compressed_video( 1794 data: bytes, 1795 width: int, 1796 height: int, 1797 *, 1798 title: str | None = None, 1799 border: bool | str = False, 1800 loop: bool = True, 1801 autoplay: bool = True, 1802) -> str: 1803 """Returns an HTML string with a video tag containing H264-encoded data. 1804 1805 Args: 1806 data: MP4-compressed video bytes. 1807 width: Width of HTML video in pixels. 1808 height: Height of HTML video in pixels. 1809 title: Optional text shown centered above the video. 1810 border: If `bool`, whether to place a black boundary around the image, or if 1811 `str`, the boundary CSS style. 1812 loop: If True, the playback repeats forever. 1813 autoplay: If True, video playback starts without having to click. 1814 """ 1815 b64 = base64.b64encode(data).decode('utf-8') 1816 if isinstance(border, str): 1817 border = f'{border}; ' 1818 elif border: 1819 border = 'border:1px solid black; ' 1820 else: 1821 border = '' 1822 options = ( 1823 f'controls width="{width}" height="{height}"' 1824 f' style="{border}object-fit:cover;"' 1825 f'{" loop" if loop else ""}' 1826 f'{" autoplay muted" if autoplay else ""}' 1827 ) 1828 s = f"""<video {options}> 1829 <source src="data:video/mp4;base64,{b64}" type="video/mp4"/> 1830 This browser does not support the video tag. 
1831 </video>""" 1832 if title is not None: 1833 s = f"""<div style="display:flex; align-items:left;"> 1834 <div style="display:flex; flex-direction:column; align-items:center;"> 1835 <div>{title}</div><div>{s}</div></div></div>""" 1836 return s 1837 1838 1839def show_video( 1840 images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any 1841) -> str | None: 1842 """Displays a video in the IPython notebook and optionally saves it to a file. 1843 1844 See `show_videos`. 1845 1846 >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4') 1847 >>> show_video(video, title='River video') 1848 1849 >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True) 1850 1851 >>> show_video(read_video('/tmp/river.mp4')) 1852 1853 Args: 1854 images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D 1855 arrays). 1856 title: Optional text shown centered above the video. 1857 **kwargs: See `show_videos`. 1858 1859 Returns: 1860 html string if `return_html` is `True`. 1861 """ 1862 return show_videos([images], [title], **kwargs) 1863 1864 1865def show_videos( 1866 videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]], 1867 titles: Iterable[str | None] | None = None, 1868 *, 1869 width: int | None = None, 1870 height: int | None = None, 1871 downsample: bool = True, 1872 columns: int | None = None, 1873 fps: float | None = None, 1874 bps: int | None = None, 1875 qp: int | None = None, 1876 codec: str = 'h264', 1877 ylabel: str = '', 1878 html_class: str = 'show_videos', 1879 return_html: bool = False, 1880 **kwargs: Any, 1881) -> str | None: 1882 """Displays a row of videos in the IPython notebook. 1883 1884 Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings. 1885 If `codec` is set to 'gif', we instead use `<img>` tags containing embedded 1886 GIF-encoded bytestrings. 
Note that the resulting GIF animations skip frames 1887 when the `fps` period is not a multiple of 10 ms units (GIF frame delay 1888 units). Encoding at `fps` = 20.0, 25.0, or 50.0 works fine. 1889 1890 If a directory has been specified using `set_show_save_dir`, also saves each 1891 titled video to a file in that directory based on its title. 1892 1893 Args: 1894 videos: Iterable of videos, or dictionary of `{title: video}`. Each video 1895 must be an iterable of images. If a video object has a `metadata` 1896 (`VideoMetadata`) attribute, its `fps` field provides a default framerate. 1897 titles: Optional strings shown above the corresponding videos. 1898 width: Optional, overrides displayed width (in pixels). 1899 height: Optional, overrides displayed height (in pixels). 1900 downsample: If True, each video whose width or height is greater than the 1901 specified `width` or `height` is resampled to the display resolution. This 1902 improves antialiasing and reduces the size of the notebook. 1903 columns: Optional, maximum number of videos per row. 1904 fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF). 1905 bps: Bits-per-second bitrate (default None). 1906 qp: Quantization parameter for video compression quality (default None). 1907 codec: Compression algorithm; must be either 'h264' or 'gif'. 1908 ylabel: Text (rotated by 90 degrees) shown on the left of each row. 1909 html_class: CSS class name used in definition of HTML element. 1910 return_html: If `True` return the raw HTML `str` instead of displaying. 1911 **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for 1912 `html_from_compressed_video`. 1913 1914 Returns: 1915 html string if `return_html` is `True`. 1916 """ 1917 if isinstance(videos, Mapping): 1918 if titles is not None: 1919 raise ValueError( 1920 'Cannot have both a video dictionary and a titles parameter.' 
1921 ) 1922 list_titles = list(videos.keys()) 1923 list_videos = list(videos.values()) 1924 else: 1925 list_videos = list(cast('Iterable[_NDArray]', videos)) 1926 list_titles = [None] * len(list_videos) if titles is None else list(titles) 1927 if len(list_videos) != len(list_titles): 1928 raise ValueError( 1929 'Number of videos does not match number of titles' 1930 f' ({len(list_videos)} vs {len(list_titles)}).' 1931 ) 1932 if codec not in {'h264', 'gif'}: 1933 raise ValueError(f'Codec {codec} is neither h264 or gif.') 1934 1935 html_strings = [] 1936 for video, title in zip(list_videos, list_titles): 1937 metadata: VideoMetadata | None = getattr(video, 'metadata', None) 1938 first_image, video = _peek_first(video) 1939 w, h = _get_width_height(width, height, first_image.shape[:2]) 1940 if downsample and (w < first_image.shape[1] or h < first_image.shape[0]): 1941 # Not resize_video() because each image may have different depth and type. 1942 video = [resize_image(image, (h, w)) for image in video] 1943 first_image = video[0] 1944 data = compress_video( 1945 video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec 1946 ) 1947 if title is not None and _config.show_save_dir: 1948 suffix = _filename_suffix_from_codec(codec) 1949 path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}' 1950 with _open(path, mode='wb') as f: 1951 f.write(data) 1952 if codec == 'gif': 1953 pixelated = h > first_image.shape[0] or w > first_image.shape[1] 1954 html_string = html_from_compressed_image( 1955 data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs 1956 ) 1957 else: 1958 html_string = html_from_compressed_video( 1959 data, w, h, title=title, **kwargs 1960 ) 1961 html_strings.append(html_string) 1962 1963 # Create single-row tables each with no more than 'columns' elements. 
1964 table_strings = [] 1965 for row_html_strings in _chunked(html_strings, columns): 1966 td = '<td style="padding:1px;">' 1967 s = ''.join(f'{td}{e}</td>' for e in row_html_strings) 1968 if ylabel: 1969 style = 'writing-mode:vertical-lr; transform:rotate(180deg);' 1970 s = f'{td}<span style="{style}">{ylabel}</span></td>' + s 1971 table_strings.append( 1972 f'<table class="{html_class}"' 1973 f' style="border-spacing:0px;"><tr>{s}</tr></table>' 1974 ) 1975 s = ''.join(table_strings) 1976 if return_html: 1977 return s 1978 _display_html(s) 1979 return None 1980 1981 1982# Local Variables: 1983# fill-column: 80 1984# End:
def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)
Displays an image in the notebook and optionally saves it to a file.
See show_images.
>>> show_image(np.random.rand(100, 100))
>>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
>>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
>>> show_image(read_image('/tmp/image.png'))
>>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
>>> show_image(read_image(url))
Arguments:
- image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
- title: Optional text shown centered above the image.
- **kwargs: See show_images.
Returns:
html string if return_html is True.
def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`. Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape: tuple[int, int] = image.shape[:2]  # type: ignore[assignment]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None
Displays a row of images in the IPython/Jupyter notebook.
If a directory has been specified using set_show_save_dir, also saves each
titled image to a file in that directory based on its title.
>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> show_images([image1, image2])
>>> show_images({'random image': image1, 'color ramp': image2}, height=128)
>>> show_images([image1, image2] * 5, columns=4, border=True)
Arguments:
- images: Iterable of images, or dictionary of {title: image}. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels.
- titles: Optional strings shown above the corresponding images.
- width: Optional, overrides displayed width (in pixels).
- height: Optional, overrides displayed height (in pixels).
- downsample: If True, each image whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
- columns: Optional, maximum number of images per row.
- vmin: For single-channel image, explicit min value for display.
- vmax: For single-channel image, explicit max value for display.
- cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- ylabel: Text (rotated by 90 degrees) shown on the left of each row.
- html_class: CSS class name used in definition of HTML element.
- pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if False, sets 'image-rendering: auto'; if None, uses pixelated rendering only on images for which width or height introduces magnification.
- return_html: If True return the raw HTML str instead of displaying.
Returns:
html string if return_html is True.
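The columns argument splits the images into rows of at most that many entries. A minimal sketch of that chunking, assuming a hypothetical helper `chunked` that stands in for the library's private `_chunked` (whose actual implementation is not shown here and may differ):

```python
from typing import Iterable, Iterator, List, Optional, TypeVar

T = TypeVar('T')


def chunked(items: Iterable[T], n: Optional[int] = None) -> Iterator[List[T]]:
  """Yields successive lists of at most `n` items; a single list if `n` is None.

  Hypothetical stand-in for mediapy's private `_chunked` helper.
  """
  items = list(items)
  if n is None:
    yield items
    return
  if n < 1:
    raise ValueError(f'Chunk size {n} must be positive.')
  for start in range(0, len(items), n):
    yield items[start : start + n]


# Ten placeholder "images" laid out with columns=4, mirroring
# show_images([image1, image2] * 5, columns=4).
rows = list(chunked(['image1', 'image2'] * 5, 4))
row_lengths = [len(row) for row in rows]
print(row_lengths)
```

Each resulting row would become one single-row HTML table, so ten images with columns=4 occupy three rows.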
def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images. Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels. There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)
Compare two images using an interactive slider.
Displays an HTML slider component to interactively swipe between two images.
The slider functionality requires that the web browser have Internet access.
See additional info in https://github.com/sneas/img-comparison-slider.
>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
- images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
- vmin: For single-channel image, explicit min value for display.
- vmax: For single-channel image, explicit max value for display.
- cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
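The function embeds both PNG-compressed images into the slider HTML as base64 strings. The sketch below illustrates only that embedding step, using the standard library. `TEMPLATE` and `embed_pair` are hypothetical names; the library's real `_IMAGE_COMPARISON_HTML` template, which loads the img-comparison-slider component, is not reproduced here.

```python
import base64

# Hypothetical, simplified template; the library's actual template differs.
TEMPLATE = (
    '<img slot="first" src="data:image/png;base64,{b64_1}"/>'
    '<img slot="second" src="data:image/png;base64,{b64_2}"/>'
)


def embed_pair(png_datas):
  """Returns HTML with exactly two byte strings embedded as PNG data URIs."""
  png_datas = list(png_datas)
  if len(png_datas) != 2:
    raise ValueError('The number of images must be 2.')
  b64_1, b64_2 = [base64.b64encode(data).decode('utf-8') for data in png_datas]
  # str.replace (rather than str.format) leaves any other braces in a
  # template, e.g. in CSS or JavaScript, untouched.
  return TEMPLATE.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)


# Placeholder bytes standing in for real PNG data.
html = embed_pair([b'first-image-bytes', b'second-image-bytes'])
print(html)
```

Because the image data travels inside the HTML itself, only the slider component needs to be fetched by the browser, which matches the note above about Internet access.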
def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)
Displays a video in the IPython notebook and optionally saves it to a file.
See show_videos.
>>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
>>> show_video(video, title='River video')
>>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
>>> show_video(read_video('/tmp/river.mp4'))
Arguments:
- images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D arrays).
- title: Optional text shown centered above the video.
- **kwargs: See show_videos.
Returns:
html string if return_html is True.
def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings. Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units). Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`. Each video
      must be an iterable of images. If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 or gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None
Displays a row of videos in the IPython notebook.
Creates HTML with <video> tags containing embedded H264-encoded bytestrings.
If codec is set to 'gif', we instead use <img> tags containing embedded
GIF-encoded bytestrings. Note that the resulting GIF animations skip frames
when the fps period is not a multiple of 10 ms units (GIF frame delay
units). Encoding at fps = 20.0, 25.0, or 50.0 works fine.
If a directory has been specified using set_show_save_dir, also saves each
titled video to a file in that directory based on its title.
Arguments:
- videos: Iterable of videos, or dictionary of {title: video}. Each video must be an iterable of images. If a video object has a metadata (VideoMetadata) attribute, its fps field provides a default framerate.
- titles: Optional strings shown above the corresponding videos.
- width: Optional, overrides displayed width (in pixels).
- height: Optional, overrides displayed height (in pixels).
- downsample: If True, each video whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
- columns: Optional, maximum number of videos per row.
- fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
- bps: Bits-per-second bitrate (default None).
- qp: Quantization parameter for video compression quality (default None).
- codec: Compression algorithm; must be either 'h264' or 'gif'.
- ylabel: Text (rotated by 90 degrees) shown on the left of each row.
- html_class: CSS class name used in definition of HTML element.
- return_html: If True return the raw HTML str instead of displaying.
- **kwargs: Additional parameters (border, loop, autoplay) for html_from_compressed_video.
Returns:
html string if return_html is True.
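The GIF note above follows from frame delays being expressed in 10 ms units: a frame period is represented exactly only when 100 / fps is a whole number. A small sketch of that check; the helper name `gif_delay_units` is hypothetical and not part of the library:

```python
from fractions import Fraction


def gif_delay_units(fps):
  """Returns the frame period in 10 ms GIF delay units as an exact fraction."""
  return Fraction(100) / Fraction(str(fps))


for fps in (20.0, 25.0, 50.0, 30.0, 60.0):
  delay = gif_delay_units(fps)
  exact = delay.denominator == 1  # True when the period is a whole unit count.
  print(f'fps={fps}: delay={float(delay):.3f} units, exact={exact}')
```

Framerates such as 30.0 or 60.0 give a fractional delay, which is where the skipped frames mentioned above would be expected to appear.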
def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array. If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)
Returns an image read from a file path or URL.
Decoding is performed using PIL, which supports uint8 images with 1, 3,
or 4 channels and uint16 images with a single channel.
Arguments:
- path_or_url: Path of input file.
- apply_exif_transpose: If True, rotate image according to EXIF orientation.
- dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object. If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping to the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)
Writes an image to a file.
Encoding is performed using PIL, which supports uint8 images with 1, 3,
or 4 channels and uint16 images with a single channel.
File format is explicitly provided by fmt and not inferred by path.
Arguments:
- path: Path of output file.
- image: Array-like object. If its type is float, it is converted to np.uint8 using to_uint8 (thus clamping the input to the range [0.0, 1.0]). Otherwise it must be np.uint8 or np.uint16.
- fmt: Desired compression encoding, e.g. 'png'.
- **kwargs: Additional parameters for PIL.Image.save().
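For float input, values are clamped to [0.0, 1.0] before being scaled to the 8-bit range. A scalar sketch of that mapping, assuming round-to-nearest; the exact rounding used by to_uint8 is an assumption here, and `float_to_uint8` is a hypothetical helper rather than a library function:

```python
def float_to_uint8(value: float) -> int:
  """Maps a float to an integer in [0, 255], clamping to [0.0, 1.0] first."""
  clamped = min(max(value, 0.0), 1.0)
  # Adding 0.5 before truncation rounds to the nearest integer (assumed).
  return int(clamped * 255.0 + 0.5)


samples = [-0.5, 0.0, 0.5, 1.0, 1.7]
converted = [float_to_uint8(v) for v in samples]
print(converted)
```

Out-of-range values therefore saturate rather than wrap around, so a float image should be normalized to [0.0, 1.0] before writing if its full range matters.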
def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'. The returned array has an
    attribute `metadata` containing `VideoMetadata` information. This enables
    `show_video` to retrieve the framerate in `metadata.fps`. Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
Returns an array containing all images read from a compressed video file.
>>> video = read_video('/tmp/river.mp4')
>>> print(f'The framerate is {video.metadata.fps} frames/s.')
>>> show_video(video)
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> show_video(read_video(url))
Arguments:
- path_or_url: Input video file.
- **kwargs: Additional parameters for VideoReader.
Returns:
A 4D numpy array with dimensions (frame, height, width, channel), or a 3D array if output_format is specified as 'gray'. The returned array has an attribute metadata containing VideoMetadata information. This enables show_video to retrieve the framerate in metadata.fps. Note that the metadata attribute is lost in most subsequent numpy operations.
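Because the returned array carries metadata, later display and write calls can fall back to the source framerate when no fps is given. The sketch below illustrates one plausible resolution order, assuming an explicit fps wins over metadata.fps, which in turn wins over the documented defaults (60.0, or 25.0 for GIF). `FakeMetadata`, `FakeVideo`, and `resolve_fps` are hypothetical stand-ins, not library names.

```python
import dataclasses
from typing import Any, Optional


@dataclasses.dataclass
class FakeMetadata:
  """Hypothetical stand-in for VideoMetadata; only `fps` is modeled."""

  fps: float


class FakeVideo(list):
  """Hypothetical list of frames that carries an optional `metadata`."""

  metadata: Optional[FakeMetadata] = None


def resolve_fps(
    video: Any, fps: Optional[float] = None, codec: str = 'h264'
) -> float:
  """Returns explicit fps, else metadata fps, else the codec default."""
  if fps is not None:
    return fps
  metadata = getattr(video, 'metadata', None)
  if metadata is not None:
    return metadata.fps
  return 25.0 if codec == 'gif' else 60.0


video = FakeVideo(['frame0', 'frame1'])
video.metadata = FakeMetadata(fps=30.0)
print(resolve_fps(video))            # Falls back to metadata.fps.
print(resolve_fps(video, fps=10.0))  # An explicit fps overrides metadata.
print(resolve_fps(list(video)))      # A plain copy has no metadata attribute.
```

The last call mirrors the caveat above: once an operation returns an object without the metadata attribute, the framerate must be passed explicitly or the default applies.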
def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape: tuple[int, int] = first_image.shape[:2]  # type: ignore[assignment]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)
Writes images to a compressed video file.
>>> video = moving_circle((480, 640), num_images=60)
>>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
>>> show_video(read_video('/tmp/v.mp4'))
Arguments:
- path: Output video file.
- images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D arrays.
- **kwargs: Additional parameters for VideoWriter.
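The write_video source derives the encoding dtype from the first frame: boolean frames are written as uint8 and floating-point frames as uint16, while other dtypes are passed through unchanged. A minimal sketch of that rule follows; the helper name encoding_dtype is hypothetical and not part of mediapy.

```python
import numpy as np


def encoding_dtype(first_image_dtype):
  """Mirrors the dtype choice made from the first frame (hypothetical helper)."""
  dtype = np.dtype(first_image_dtype)
  if dtype == bool:
    return np.dtype(np.uint8)
  if np.issubdtype(dtype, np.floating):
    return np.dtype(np.uint16)
  return dtype  # Passed through as-is.


for name in ('bool', 'float32', 'uint8', 'uint16'):
  print(name, '->', encoding_dtype(name))
```

Since VideoWriter accepts only np.uint8 and np.uint16, a pass-through dtype outside those two would be rejected later with a ValueError.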
class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb'). If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values. If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values. If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images. The default is `np.uint8`. Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    ffmpeg_path = _get_ffmpeg_path()
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          ffmpeg_path,
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = subprocess.Popen(
          command, stdout=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader. (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None
Context to read a compressed video as an iterable over its images.
>>> with VideoReader('/tmp/river.mp4') as reader:
...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
...   for image in reader:
...     print(image.shape)
>>> with VideoReader('/tmp/river.mp4') as reader:
...   video = np.array(tuple(reader))
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> with VideoReader(url) as reader:
...   show_video(reader)
Attributes:
- path_or_url: Location of input video.
- output_format: Format of output images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) with R, G, B values. If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
- dtype: Data type for output images. The default is np.uint8. Use of np.uint16 allows reading 10-bit or 12-bit data without precision loss.
- metadata: Object storing the information retrieved from the video header. Its attributes are copied as attributes in this class.
- num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact.
- shape: The dimensions (height, width) of each video frame.
- fps: The framerate in frames per second.
- bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
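The combination of shape, output_format, and dtype fixes how many raw bytes the reader expects per frame: the pixel count times the channel count times the bytes per channel. A minimal sketch of that arithmetic, using a hypothetical helper name and taking the channel counts from the source (3 for 'rgb' and 'yuv', 1 for 'gray'):

```python
import math

# Channels per pixel for each output format.
NUM_CHANNELS = {'rgb': 3, 'yuv': 3, 'gray': 1}


def num_bytes_per_image(shape, output_format='rgb', bytes_per_channel=1):
  """Returns the raw frame size in bytes (hypothetical helper)."""
  return math.prod(shape) * NUM_CHANNELS[output_format] * bytes_per_channel


# A 480x640 RGB frame with 8-bit channels.
print(num_bytes_per_image((480, 640), 'rgb', 1))  # 921600
# The same frame as 16-bit grayscale.
print(num_bytes_per_image((480, 640), 'gray', 2))  # 614400
```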
VideoReader(path_or_url: _Path, *, output_format: str = 'rgb', dtype: _DTypeLike = np.uint8)
def read(self) -> _NDArray | None:
Reads a video image frame (or None if at end of file).
Returns:
A numpy array in the format specified by output_format, i.e., a 3D array with 3 color channels, except for format 'gray' which is 2D.
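The per-format array layout can be reproduced on a raw byte buffer. The sketch below follows the reshaping logic in the read source (planar YUV is moved to a per-pixel layout, 'gray' stays 2D); the function name frame_from_bytes and the synthetic buffer are illustrative assumptions.

```python
import numpy as np


def frame_from_bytes(data, shape, output_format='rgb', dtype=np.uint8):
  """Interprets one raw frame buffer as an image array (hypothetical helper)."""
  image = np.frombuffer(data, dtype=dtype)
  if output_format == 'rgb':
    return image.reshape(*shape, 3)
  if output_format == 'yuv':
    # The buffer holds three full planes (Y, then U, then V); move the plane
    # axis last so that each pixel carries its own (Y, U, V) triple.
    return np.moveaxis(image.reshape(3, *shape), 0, 2)
  if output_format == 'gray':
    return image.reshape(*shape)
  raise ValueError(f'Output format {output_format} is not rgb, yuv, or gray.')


data = bytes(range(18))  # Synthetic 2x3 frame with three planes of 6 bytes.
yuv = frame_from_bytes(data, (2, 3), 'yuv')
print(yuv.shape)  # (2, 3, 3)
print(yuv[0, 0].tolist())  # [0, 6, 12]
```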
def close(self) -> None:
Terminates video reader. (Called automatically at end of context.)
class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video. Its suffix (e.g. '.mp4') determines the video container
      format. The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames. The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb'). If 'rgb', each image
      has shape=(height, width, 3) or (height, width). If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values. If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`). The default is `np.uint8`. Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc. The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp <= 0):
      raise ValueError(
          f'Quantization parameter {qp} is not a positive integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    ffmpeg_path = _get_ffmpeg_path()
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              ffmpeg_path,
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = subprocess.Popen(
          command, stdin=subprocess.PIPE, stderr=subprocess.PIPE
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization. If
        `input_format` is 'gray', the image must be 2D. For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels. For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode()
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video. (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode()
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None
Context to write a compressed video.
>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))
Bitrate control may be specified using at most one of: bps, qp, or crf.
If none are specified, qp is set to a default value.
See https://slhck.info/video/2017/03/01/rate-control.html
If codec is 'gif', the args bps, qp, crf, and encoded_format are ignored.
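The rate-control rule can be sketched as a small function that builds the corresponding ffmpeg arguments. The logic follows the VideoWriter source (the default qp is 20 for frames of at most 640x480 pixels and 28 otherwise, and it is applied only when no extra ffmpeg_args are given); the function name bitrate_args is hypothetical, and the 'gif' case, which clears these arguments, is left out.

```python
import math


def bitrate_args(shape, bps=None, qp=None, crf=None, ffmpeg_args=()):
  """Builds ffmpeg rate-control arguments (hypothetical helper)."""
  if sum(x is not None for x in (bps, qp, crf)) > 1:
    raise ValueError('Must specify at most one of bps, qp, or crf.')
  if bps is None and qp is None and crf is None and not ffmpeg_args:
    # No explicit rate control: choose a default quantization parameter.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  return (
      (['-vb', f'{bps}'] if bps is not None else [])
      + (['-qp', f'{qp}'] if qp is not None else [])
      + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
  )


print(bitrate_args((480, 640)))  # ['-qp', '20']
print(bitrate_args((1080, 1920)))  # ['-qp', '28']
print(bitrate_args((480, 640), crf=23))  # ['-vb', '0', '-crf', '23']
```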
Attributes:
- path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
- shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
- codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
- metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
- fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
- bps: Requested average bits-per-second bitrate (default None).
- qp: Quantization parameter for video compression quality (default None).
- crf: Constant rate factor for video compression quality (default None).
- ffmpeg_args: Additional arguments for ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
- input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
- dtype: Expected data type for input images (any float input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
- encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
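The interaction between shape and encoded_format can be sketched as follows. The rule is taken from the VideoWriter source, including its prefix check on 'yuv42' and 'yuvj42' for subsampled chroma; the function name resolve_encoded_format is a hypothetical stand-in.

```python
def resolve_encoded_format(shape, encoded_format=None):
  """Picks or validates the encoded pixel format (hypothetical helper)."""
  all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
  if encoded_format is None:
    # Subsampled chroma needs even dimensions; otherwise keep full resolution.
    encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
  if not all_dimensions_are_even and encoded_format.startswith(
      ('yuv42', 'yuvj42')
  ):
    raise ValueError(
        f'With encoded_format {encoded_format}, video dimensions must be'
        f' even, but shape is {shape}.'
    )
  return encoded_format


print(resolve_encoded_format((480, 640)))  # yuv420p
print(resolve_encoded_format((481, 640)))  # yuv444p
```

Requesting 'yuv420p' explicitly for an odd-sized video raises a ValueError in this sketch, which matches the check in the source.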
VideoWriter(path: _Path, shape: tuple[int, int], *, codec: str = 'h264', metadata: VideoMetadata | None = None, fps: float | None = None, bps: int | None = None, qp: int | None = None, crf: float | None = None, ffmpeg_args: str | Sequence[str] = '', input_format: str = 'rgb', dtype: _DTypeLike = np.uint8, encoded_format: str | None = None)
def add_image(self, image: _NDArray) -> None:
Writes a video frame.
Arguments:
- image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
Raises:
- RuntimeError: If there is an error writing to the output file.
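The shape rules for a frame can be sketched on plain arrays. The helper below follows the checks in the add_image source but is a hypothetical stand-in named prepare_frame: it skips the float/bool conversion and the dtype check, and returns the raw bytes instead of piping them to ffmpeg.

```python
import numpy as np


def prepare_frame(image, shape, input_format='rgb'):
  """Validates one frame and returns its raw bytes (hypothetical helper)."""
  if input_format == 'gray':
    if image.ndim != 2:
      raise ValueError(f'Image dimensions {image.shape} are not 2D.')
  else:
    if image.ndim == 2 and input_format == 'rgb':
      # A 2D image is treated as grayscale and replicated into R, G, B.
      image = np.dstack((image, image, image))
    if not (image.ndim == 3 and image.shape[2] == 3):
      raise ValueError(f'Image dimensions {image.shape} are invalid.')
  if image.shape[:2] != shape:
    raise ValueError(f'Image dimensions {image.shape[:2]} != {shape}.')
  if input_format == 'yuv':
    # Per-pixel (Y, U, V) triples become three consecutive planes.
    image = np.moveaxis(image, 2, 0)
  return image.tobytes()


gray = np.zeros((4, 6), np.uint8)
print(len(prepare_frame(gray, (4, 6), 'rgb')))  # 72
print(len(prepare_frame(gray, (4, 6), 'gray')))  # 24
```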
def close(self) -> None:
Finishes writing the video. (Called automatically at end of context.)
class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact. We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
Represents the data stored in a video container header.
Attributes:
- num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. We set the value to -1 if number of frames is not found in the header.
- shape: The dimensions (height, width) of each video frame.
- fps: The framerate in frames per second.
- bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
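Because the metadata is a NamedTuple, it can be unpacked positionally, which is how VideoReader copies the four fields onto itself. A minimal sketch with a look-alike type follows; the class name VideoMetadataSketch and the sample values are illustrative assumptions.

```python
from typing import NamedTuple, Optional, Tuple


class VideoMetadataSketch(NamedTuple):
  """Look-alike of the fields listed above (not the mediapy class itself)."""

  num_images: int
  shape: Tuple[int, int]
  fps: float
  bps: Optional[int]


metadata = VideoMetadataSketch(num_images=-1, shape=(480, 640), fps=30.0, bps=None)

# Positional unpacking follows the field order.
num_images, shape, fps, bps = metadata

# A frame count of -1 means the header did not provide one, so guard on it.
duration = num_images / fps if num_images >= 0 else None
print(shape, fps, duration)
```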
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()
Returns a buffer containing a compressed image.
Arguments:
- image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
- fmt: Desired compression encoding, e.g. 'png'.
- **kwargs: Options for PIL.save(), e.g. optimize=True for greater compression.
def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array. If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)
Returns an image from a compressed data buffer.
Decoding is performed using PIL, which supports uint8 images with 1, 3, or 4 channels and uint16 images with a single channel.
Arguments:
- data: Buffer containing compressed image.
- dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
- apply_exif_transpose: If True, rotate image according to EXIF orientation.
def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'mp4' except when `codec` is 'gif'.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()
Returns a buffer containing a compressed video.
The video container is 'mp4' except when codec is 'gif'.
>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))
>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
Arguments:
- images: Iterable over video frames.
- codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
- **kwargs: Additional parameters for VideoWriter.
Returns:
A bytes buffer containing the compressed video.
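The source listing for compress_video obtains bytes by writing to a file in a temporary directory and reading it back. The pattern can be sketched with the standard library alone; `bytes_from_file_writer` and `fake_writer` are hypothetical names used only for illustration, and the fake writer stands in for a file-only encoder such as write_video.

```python
import pathlib
import tempfile


def bytes_from_file_writer(write_fn, suffix):
  """Runs write_fn(path) on a temporary file and returns the file contents."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_fn(tmp_path)
    # Read before leaving the block; the directory is deleted on exit.
    return tmp_path.read_bytes()


def fake_writer(path):
  """Hypothetical stand-in for an encoder that can only write to a file."""
  path.write_bytes(b'not a real video')


data = bytes_from_file_writer(fake_writer, '.mp4')
print(len(data))
```

The file suffix matters in this pattern because the container format is chosen from it ('mp4' or 'gif', as stated above).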
def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)
Returns video images from an MP4-compressed data buffer.
def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s
Returns an HTML string with an image tag containing encoded data.
Arguments:
- data: Compressed image bytes.
- width: Width of HTML image in pixels.
- height: Height of HTML image in pixels.
- title: Optional text shown centered above image.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
- fmt: Compression encoding.
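The embedding step can be sketched without any third-party package: the compressed bytes are base64-encoded and placed in a data URI inside an img tag. In this sketch, `tiny_png` and `img_tag_sketch` are hypothetical helpers; the first builds a valid 1x1 grayscale PNG with the standard library so that the example has real image bytes to embed, and the second omits the title and border handling.

```python
import base64
import struct
import zlib


def tiny_png(gray_value):
  """Encodes a 1x1 8-bit grayscale PNG using only the standard library."""

  def chunk(kind, payload):
    crc = struct.pack('>I', zlib.crc32(kind + payload))
    return struct.pack('>I', len(payload)) + kind + payload + crc

  header = struct.pack('>IIBBBBB', 1, 1, 8, 0, 0, 0, 0)  # 1x1, 8-bit gray.
  scanline = bytes([0, gray_value])  # Filter type 0, then one pixel.
  return (
      b'\x89PNG\r\n\x1a\n'
      + chunk(b'IHDR', header)
      + chunk(b'IDAT', zlib.compress(scanline))
      + chunk(b'IEND', b'')
  )


def img_tag_sketch(data, width, height, fmt='png'):
  """Hypothetical helper: embeds compressed image bytes in a data URI."""
  b64 = base64.b64encode(data).decode('utf-8')
  return (
      f'<img width="{width}" height="{height}"'
      ' style="image-rendering:pixelated; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


html = img_tag_sketch(tiny_png(128), width=64, height=64)
print(html[:60])
```

The pixelated rendering style keeps individual pixels crisp when a small image, like this 1x1 one, is displayed at a larger size.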
def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s
Returns an HTML string with a video tag containing H264-encoded data.
Arguments:
- data: MP4-compressed video bytes.
- width: Width of HTML video in pixels.
- height: Height of HTML video in pixels.
- title: Optional text shown centered above the video.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- loop: If True, the playback repeats forever.
- autoplay: If True, video playback starts without having to click.
def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )
Resizes image to specified spatial dimensions using a Lanczos filter.
Arguments:
- image: Array-like 2D or 3D object, where dtype is uint or floating-point.
- shape: 2D spatial dimensions (height, width) of output image.
Returns:
A resampled image whose spatial dimensions match shape.
def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])
Resizes video to specified spatial dimensions using a Lanczos filter.
Arguments:
- video: Iterable of images.
- shape: 2D spatial dimensions (height, width) of output video.
Returns:
A resampled video whose spatial dimensions match shape.
def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.7.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # type: ignore # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = rgb_from_scalar(a)
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
    a = a[..., :3]
  return a
Maps scalar values to RGB using value bounds and a color map.
Arguments:
- array: Scalar values, with arbitrary shape.
- vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
- vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
- cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:
A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
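The affine remapping step can be sketched with NumPy alone. `normalize_sketch` is a hypothetical helper that follows the bounds logic described above (non-finite values are ignored when inferring vmin and vmax); the final stacking line is only a simple stand-in for a gray color map, not the matplotlib color map the library uses.

```python
import numpy as np


def normalize_sketch(array, vmin=None, vmax=None):
  """Hypothetical helper: affinely maps [vmin, vmax] to [0.0, 1.0]."""
  a = np.asarray(array, dtype=float)
  finite = np.isfinite(a)
  # Non-finite entries (nan, inf) are ignored when inferring the bounds.
  if vmin is None:
    vmin = np.amin(np.where(finite, a, np.inf))
  if vmax is None:
    vmax = np.amax(np.where(finite, a, -np.inf))
  # The tiny epsilon avoids division by zero for constant input.
  return (a - vmin) / (vmax - vmin + np.finfo(float).eps)


values = np.array([2.0, 4.0, np.nan, 6.0])
normalized = normalize_sketch(values)
gray_rgb = np.stack([normalized] * 3, axis=-1)  # Stand-in gray color map.
print(normalized)
print(gray_rgb.shape)
```

Supplying explicit vmin and vmax keeps the mapping consistent across several arrays, for example across the frames of a video.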
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type. The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0. The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0. The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result
Returns media array converted to specified type.
A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.
Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.
Arguments:
- array: Input array-like object (floating-point, unsigned int, or bool).
- dtype: Desired output type (floating-point or unsigned int).
Returns:
Array a if it is already of the specified dtype, else a converted array.
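The float-to-integer direction of this conversion can be sketched with NumPy. `float_to_uint_sketch` is a hypothetical, simplified helper that covers only uint8 and uint16 targets; it shows the clamping to [0.0, 1.0], the scaling to the full integer range, and the rounding, but not the wider-integer or bool handling described above.

```python
import numpy as np


def float_to_uint_sketch(array, dtype=np.uint8):
  """Hypothetical helper: maps floats in [0.0, 1.0] to the full uint range."""
  a = np.clip(np.asarray(array, dtype=np.float64), 0.0, 1.0)
  dst_max = np.iinfo(dtype).max
  # Adding 0.5 before truncation rounds to the nearest integer.
  return (a * dst_max + 0.5).astype(dtype)


values = np.array([-0.5, 0.0, 0.5, 1.0, 2.0])
as_uint8 = float_to_uint_sketch(values)
as_uint16 = float_to_uint_sketch(values, np.uint16)
print(as_uint8)
print(as_uint16)
```

Out-of-range inputs such as -0.5 and 2.0 are clamped rather than wrapped, which is the behavior the description above calls "scaled and clamped".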
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)
If array has unsigned integers, rescales them to the range [0.0, 1.0].
Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.
Arguments:
- a: Input array.
- dtype: Desired floating-point type if rescaling occurs.
Returns:
A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.
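The two cases in the return description (rescaling for unsigned integers, pass-through for floats) can be sketched with NumPy. `uint_to_float_sketch` is a hypothetical helper that handles only unsigned-integer and floating-point input; it is not the library function.

```python
import numpy as np


def uint_to_float_sketch(array, dtype=np.float32):
  """Hypothetical helper: rescales unsigned integers to [0.0, 1.0]."""
  a = np.asarray(array)
  if np.issubdtype(a.dtype, np.floating):
    return a  # Floating-point input is passed through unchanged.
  scalar_type = np.dtype(dtype).type
  return a.astype(dtype) / scalar_type(np.iinfo(a.dtype).max)


pixels = np.array([0, 51, 255], dtype=np.uint8)
as_float = uint_to_float_sketch(pixels)

already_float = np.array([0.25, 0.75], dtype=np.float64)
unchanged = uint_to_float_sketch(already_float)

print(as_float, as_float.dtype)
print(unchanged is already_float)
```

Note that the pass-through case keeps the original floating-point dtype, so the `dtype` argument only takes effect when rescaling actually occurs.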
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)
Returns array converted to uint8 values; see to_type.
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass
Overrides the height of the current output cell, if using Colab.
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass
Sets the maximum height of the current output cell, if using Colab.
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing. See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)
Returns an image of a red-green color gradient.
This is useful for quick experimentation and testing. See also moving_circle to generate a sample video.
Arguments:
- shape: 2D spatial dimensions (height, width) of generated image.
- dtype: Type (uint or floating) of resulting pixel values.
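The gradient construction can be reproduced with NumPy to see which channel varies along which axis. `color_ramp_sketch` is a hypothetical float-only version of the computation in the source listing; it skips the dtype validation and conversion.

```python
import numpy as np


def color_ramp_sketch(shape=(64, 64)):
  """Hypothetical helper: builds a float red-green gradient image."""
  # Pixel-center coordinates normalized to (0.0, 1.0) along each axis.
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  # Red follows the row (y), green follows the column (x), blue is zero.
  return np.insert(yx, 2, 0.0, axis=-1).astype(np.float32)


ramp = color_ramp_sketch((4, 8))
print(ramp.shape)
print(ramp[0, 0], ramp[-1, -1])
```

Because the coordinates are sampled at pixel centers, the values never reach exactly 0.0 or 1.0 in the red and green channels.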
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing. See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])
Returns a video of a circle moving in front of a color ramp.
This is useful for quick experimentation and testing. See also color_ramp to generate a sample image.
>>> show_video(moving_circle((480, 640), 60), fps=60)
Arguments:
- shape: 2D spatial dimensions (height, width) of generated video.
- num_images: Number of video frames.
- dtype: Type (uint or floating) of resulting pixel values.
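The per-frame geometry can be sketched with NumPy by computing only the circle mask. `circle_mask_sketch` is a hypothetical helper that follows the center and radius formulas in the source listing; it leaves out the color ramp background and the dtype handling.

```python
import numpy as np


def circle_mask_sketch(shape, image_index, num_images):
  """Hypothetical helper: boolean mask of the circle in one frame."""
  yx = np.moveaxis(np.indices(shape), 0, -1)
  # The center stays at 60% of the height and sweeps left to right.
  center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
  radius_squared = (min(shape) * 0.1) ** 2
  return np.sum((yx - center) ** 2, axis=-1) < radius_squared


shape = (100, 100)
masks = np.array([circle_mask_sketch(shape, i, 10) for i in range(10)])
# Mean column of the circle pixels in the first and last frames.
first_x = np.argwhere(masks[0])[:, 1].mean()
last_x = np.argwhere(masks[-1])[:, 1].mean()
print(masks.shape, first_x, last_x)
```

The radius is one tenth of the smaller image dimension, so the circle is partly cut off by the image border in the first and last frames.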
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir
Save all titled output from show_*() calls into files.
If the specified directory is not None, all titled images and videos displayed by show_image, show_images, show_video, and show_videos are also saved as files within the directory.
It can be used either to set the state or as a context manager:
>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1') # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2') # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)
>>> with set_show_save_dir('/tmp'):
... show_image(color_ramp(), title='image1') # Creates /tmp/image1.png.
... show_video(moving_circle(), title='video2') # Creates /tmp/video2.mp4.
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath. The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path
Specifies the name or path for the ffmpeg external program.
The ffmpeg program is required for compressing and decompressing video. (It is used in read_video, write_video, show_video, show_videos, etc.)
Arguments:
- name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None
Returns True if the program ffmpeg is found.
See also set_ffmpeg.
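A similar availability check can be sketched with the standard library. This assumes the lookup behaves like `shutil.which`, which accepts either a bare program name searched on PATH or an explicit file path; the internal helper that the library actually uses is not shown in this section, so `find_program_sketch` and `program_is_available_sketch` are hypothetical names.

```python
import shutil


def find_program_sketch(name_or_path='ffmpeg'):
  """Hypothetical helper: returns the resolved executable path, or None."""
  return shutil.which(name_or_path)


def program_is_available_sketch(name_or_path='ffmpeg'):
  """Hypothetical helper: True if the program can be located."""
  return find_program_sketch(name_or_path) is not None


available = program_is_available_sketch()
missing = program_is_available_sketch('surely-not-an-installed-program-xyz')
print(available, missing)
```

Checking availability before calling any video function allows a notebook to skip video cells gracefully on machines without ffmpeg.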