mediapy
mediapy: Read/write/show images and videos in an IPython/Jupyter notebook.
[GitHub source] [API docs] [PyPI package] [Colab example]
See the example notebook, or better yet, open it in Colab.
Image examples
Display an image (2D or 3D numpy array):
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
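As a quick check of what this expression builds, the following sketch uses plain NumPy with no display call. It confirms that the checkerboard is a 128x128 array of zeros and ones made of 4x4 blocks. The intermediate name `pattern` is only illustrative.

```python
import numpy as np

# 32x32 pattern of alternating 0/1 cells (two rows repeated 16 times).
pattern = np.array([[0, 1] * 16, [1, 0] * 16] * 16)

# The Kronecker product with a 4x4 block of ones enlarges each cell to 4x4 pixels.
checkerboard = np.kron(pattern, np.ones((4, 4)))

print(pattern.shape)       # (32, 32)
print(checkerboard.shape)  # (128, 128)
```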
Read and display an image (either local or from the Web):
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
Read and display an image from a local file:
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
Show titled images side-by-side:
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
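The effect of fixing `vmin` and `vmax` can be sketched without any display code. The helper `remap` below is a hypothetical name, not part of mediapy. It applies the affine mapping from [vmin, vmax] to [0, 1] that the `to_rgb` source further down uses for scalar images. When a bound is omitted it falls back to the array's own minimum or maximum (plain `min`/`max` here, whereas `to_rgb` considers only finite values).

```python
import numpy as np

def remap(a, vmin=None, vmax=None):
    """Affinely maps values from [vmin, vmax] to approximately [0.0, 1.0]."""
    a = np.asarray(a, dtype=float)
    vmin = a.min() if vmin is None else vmin
    vmax = a.max() if vmax is None else vmax
    # A tiny epsilon avoids division by zero for constant arrays.
    return (a - vmin) / (vmax - vmin + np.finfo(float).eps)

darkened = np.array([0.0, 0.7])

# With automatic bounds, 0.7 is stretched to (nearly) full white.
auto = remap(darkened)
# With vmin=0.0 and vmax=1.0, 0.7 stays at 0.7, so the image looks darker.
fixed = remap(darkened, vmin=0.0, vmax=1.0)
```

This suggests why the example passes explicit bounds: without them, the 'darkened' checkerboard would likely be rescaled to look the same as the original.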
Compare two images using an interactive slider:
compare_images([checkerboard, np.random.rand(128, 128, 3)])
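The slider can work inside a notebook because both images are embedded directly in the HTML as base64 `data:` URIs (see `_IMAGE_COMPARISON_HTML` in the source listing). A minimal sketch of that embedding step follows. It uses placeholder bytes rather than a real PNG, and the helper `img_tag` is hypothetical.

```python
import base64

def img_tag(png_bytes: bytes, slot: str) -> str:
    """Returns an <img> tag whose source is an inline base64 data URI."""
    b64 = base64.b64encode(png_bytes).decode('utf-8')
    return f'<img slot="{slot}" src="data:image/png;base64,{b64}" />'

# Placeholder payload; in practice this would be the bytes of a PNG file.
fake_png = b'not-a-real-png'
tag = img_tag(fake_png, 'first')
print(tag)
```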
Video examples
Display a video (an iterable of images, e.g., a 3D or 4D array):
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
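A video here is simply a stack of frames. The sketch below builds a small 4D array by hand and shows that iterating over it yields one 3D image per frame. The array contents are illustrative and are not the output of `moving_circle`.

```python
import numpy as np

num_images, height, width = 10, 100, 100

# Layout assumed here: (frame, row, column, channel), float values in [0.0, 1.0].
video = np.zeros((num_images, height, width, 3), dtype=np.float32)

# Brighten the red channel a little more in each successive frame.
for i in range(num_images):
    video[i, :, :, 0] = i / (num_images - 1)

frames = list(video)  # Iterating over the first axis yields 3D images.
print(len(frames), frames[0].shape)
```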
Show the video frames side-by-side:
show_images(video, columns=6, border=True, height=64)
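Assuming `columns=6` splits the sequence into rows of at most six images, the ten frames should occupy two rows. The grouping can be sketched with an `itertools.islice`-based helper similar in spirit to `_chunked` in the source listing. The function `chunked` below is a standalone re-creation for illustration only.

```python
import itertools

def chunked(iterable, n):
    """Yields tuples of at most `n` consecutive elements."""
    iterator = iter(iterable)
    # An empty tuple signals that the iterator is exhausted.
    while chunk := tuple(itertools.islice(iterator, n)):
        yield chunk

rows = list(chunked(range(10), 6))
print(rows)  # [(0, 1, 2, 3, 4, 5), (6, 7, 8, 9)]
```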
Show the frames with their indices:
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
Read and display a video (either local or from the Web):
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
Create and display a looping two-frame GIF video:
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
Darken a video frame-by-frame:
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
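What the darkening does to pixel values can be sketched with NumPy alone, assuming 8-bit frames on input and output. The helpers `float01_from_uint8` and `uint8_from_float01` are hypothetical stand-ins, not mediapy functions. They follow the conversion rule documented for `to_type`: uint8 0 maps to 0.0, 255 maps to 1.0, and floats are clamped and rounded on the way back.

```python
import numpy as np

def float01_from_uint8(a):
    """Maps uint8 values [0, 255] to floats in [0.0, 1.0]."""
    return np.asarray(a, dtype=np.float32) / 255.0

def uint8_from_float01(a):
    """Clamps floats to [0.0, 1.0] and rounds them to uint8 [0, 255]."""
    a = np.clip(a, 0.0, 1.0)
    return (a * 255.0 + 0.5).astype(np.uint8)

frame = np.array([0, 100, 255], dtype=np.uint8)
darkened = uint8_from_float01(float01_from_uint8(frame) * 0.5)
print(darkened.tolist())  # [0, 50, 128]
```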
# Copyright 2025 The mediapy Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

"""`mediapy`: Read/write/show images and videos in an IPython/Jupyter notebook.

[**[GitHub source]**](https://github.com/google/mediapy) &nbsp;
[**[API docs]**](https://google.github.io/mediapy/) &nbsp;
[**[PyPI package]**](https://pypi.org/project/mediapy/) &nbsp;
[**[Colab
example]**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb)

See the [example
notebook](https://github.com/google/mediapy/blob/main/mediapy_examples.ipynb),
or better yet, [**open it in
Colab**](https://colab.research.google.com/github/google/mediapy/blob/main/mediapy_examples.ipynb).

## Image examples

Display an image (2D or 3D `numpy` array):
```python
checkerboard = np.kron([[0, 1] * 16, [1, 0] * 16] * 16, np.ones((4, 4)))
show_image(checkerboard)
```

Read and display an image (either local or from the Web):
```python
IMAGE = 'https://github.com/hhoppe/data/raw/main/image.png'
show_image(read_image(IMAGE))
```

Read and display an image from a local file:
```python
!wget -q -O /tmp/burano.png {IMAGE}
show_image(read_image('/tmp/burano.png'))
```

Show titled images side-by-side:
```python
images = {
    'original': checkerboard,
    'darkened': checkerboard * 0.7,
    'random': np.random.rand(32, 32, 3),
}
show_images(images, vmin=0.0, vmax=1.0, border=True, height=64)
```

Compare two images using an interactive slider:
```python
compare_images([checkerboard, np.random.rand(128, 128, 3)])
```

## Video examples

Display a video (an iterable of images, e.g., a 3D or 4D array):
```python
video = moving_circle((100, 100), num_images=10)
show_video(video, fps=10)
```

Show the video frames side-by-side:
```python
show_images(video, columns=6, border=True, height=64)
```

Show the frames with their indices:
```python
show_images({f'{i}': image for i, image in enumerate(video)}, width=32)
```

Read and display a video (either local or from the Web):
```python
VIDEO = 'https://github.com/hhoppe/data/raw/main/video.mp4'
show_video(read_video(VIDEO))
```

Create and display a looping two-frame GIF video:
```python
image1 = resize_image(np.random.rand(10, 10, 3), (50, 50))
show_video([image1, image1 * 0.8], fps=2, codec='gif')
```

Darken a video frame-by-frame:
```python
output_path = '/tmp/out.mp4'
with VideoReader(VIDEO) as r:
  darken_image = lambda image: to_float01(image) * 0.5
  with VideoWriter(output_path, shape=r.shape, fps=r.fps, bps=r.bps) as w:
    for image in r:
      w.add_image(darken_image(image))
```
"""

from __future__ import annotations

__docformat__ = 'google'
__version__ = '1.2.5'
__version_info__ = tuple(int(num) for num in __version__.split('.'))

import base64
from collections.abc import Callable, Iterable, Iterator, Mapping, Sequence
import contextlib
import functools
import importlib
import io
import itertools
import math
import numbers
import os  # Package only needed for typing.TYPE_CHECKING.
import pathlib
import re
import shlex
import shutil
import subprocess
import sys
import tempfile
import typing
# The names below are used unqualified later in this module.
from typing import Any, TYPE_CHECKING, TypeVar, Union, cast
import urllib.request
import warnings

import IPython.display
import matplotlib.pyplot
import numpy as np
import numpy.typing as npt
import PIL.Image
import PIL.ImageOps


if not hasattr(PIL.Image, 'Resampling'):  # Allow Pillow<9.0.
  PIL.Image.Resampling = PIL.Image  # type: ignore

# Selected and reordered here for pdoc documentation.
__all__ = [
    'show_image',
    'show_images',
    'compare_images',
    'show_video',
    'show_videos',
    'read_image',
    'write_image',
    'read_video',
    'write_video',
    'VideoReader',
    'VideoWriter',
    'VideoMetadata',
    'compress_image',
    'decompress_image',
    'compress_video',
    'decompress_video',
    'html_from_compressed_image',
    'html_from_compressed_video',
    'resize_image',
    'resize_video',
    'to_rgb',
    'to_type',
    'to_float01',
    'to_uint8',
    'set_output_height',
    'set_max_output_height',
    'color_ramp',
    'moving_circle',
    'set_show_save_dir',
    'set_ffmpeg',
    'video_is_available',
]

if TYPE_CHECKING:
  _ArrayLike = npt.ArrayLike
  _DTypeLike = npt.DTypeLike
  _NDArray = npt.NDArray[Any]
  _DType = np.dtype[Any]
else:
  # Create named types for use in the `pdoc` documentation.
  _ArrayLike = TypeVar('_ArrayLike')
  _DTypeLike = TypeVar('_DTypeLike')
  _NDArray = TypeVar('_NDArray')
  _DType = TypeVar('_DType')  # pylint: disable=invalid-name

_IPYTHON_HTML_SIZE_LIMIT = 10**10  # Unlimited seems to be OK now.
_T = TypeVar('_T')
_Path = Union[str, 'os.PathLike[str]']

_IMAGE_COMPARISON_HTML = """\
<script
  defer
  src="https://unpkg.com/img-comparison-slider@7/dist/index.js"
></script>
<link
  rel="stylesheet"
  href="https://unpkg.com/img-comparison-slider@7/dist/styles.css"
/>

<img-comparison-slider>
  <img slot="first" src="data:image/png;base64,{b64_1}" />
  <img slot="second" src="data:image/png;base64,{b64_2}" />
</img-comparison-slider>
"""

# ** Miscellaneous.


class _Config:
  ffmpeg_name_or_path: _Path = 'ffmpeg'
  show_save_dir: _Path | None = None


_config = _Config()


def _open(path: _Path, *args: Any, **kwargs: Any) -> Any:
  """Opens the file; this is a hook for the built-in `open()`."""
  return open(path, *args, **kwargs)


def _path_is_local(path: _Path) -> bool:
  """Returns True if the path is in the filesystem accessible by `ffmpeg`."""
  del path
  return True


def _search_for_ffmpeg_path() -> str | None:
  """Returns a path to the ffmpeg program, or None if not found."""
  if filename := shutil.which(_config.ffmpeg_name_or_path):
    return str(filename)
  return None


def _print_err(*args: str, **kwargs: Any) -> None:
  """Prints arguments to stderr immediately."""
  kwargs = {**dict(file=sys.stderr, flush=True), **kwargs}
  print(*args, **kwargs)


def _chunked(
    iterable: Iterable[_T], n: int | None = None
) -> Iterator[tuple[_T, ...]]:
  """Returns elements collected as tuples of length at most `n` if not None."""

  def take(n: int | None, iterable: Iterable[_T]) -> tuple[_T, ...]:
    return tuple(itertools.islice(iterable, n))

  return iter(functools.partial(take, n, iter(iterable)), ())


def _peek_first(iterator: Iterable[_T]) -> tuple[_T, Iterable[_T]]:
  """Given an iterator, returns first element and re-initialized iterator.

  >>> first_image, images = _peek_first(moving_circle())

  Args:
    iterator: An input iterator or iterable.

  Returns:
    A tuple (first_element, iterator_reinitialized) containing:
      first_element: The first element of the input.
      iterator_reinitialized: A clone of the original iterator/iterable.
  """
  # Inspired from https://stackoverflow.com/a/12059829/1190077
  peeker, iterator_reinitialized = itertools.tee(iterator)
  first = next(peeker)
  return first, iterator_reinitialized


def _check_2d_shape(shape: tuple[int, int]) -> None:
  """Checks that `shape` is of the form (height, width) with two integers."""
  if len(shape) != 2:
    raise ValueError(f'Shape {shape} is not of the form (height, width).')
  if not all(isinstance(i, numbers.Integral) for i in shape):
    raise ValueError(f'Shape {shape} contains non-integers.')


def _run(args: str | Sequence[str]) -> None:
  """Executes command, printing output from stdout and stderr.

  Args:
    args: Command to execute, which can be either a string or a sequence of word
      strings, as in `subprocess.run()`.  If `args` is a string, the shell is
      invoked to interpret it.

  Raises:
    RuntimeError: If the command's exit code is nonzero.
  """
  proc = subprocess.run(
      args,
      shell=isinstance(args, str),
      stdout=subprocess.PIPE,
      stderr=subprocess.STDOUT,
      check=False,
      universal_newlines=True,
  )
  print(proc.stdout, end='', flush=True)
  if proc.returncode:
    raise RuntimeError(
        f"Command '{proc.args}' failed with code {proc.returncode}."
    )


def _display_html(text: str, /) -> None:
  """In a Jupyter notebook, display the HTML `text`."""
  IPython.display.display(IPython.display.HTML(text))  # type: ignore


def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath.  The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path


def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass


# ** Type conversions.


def _as_valid_media_type(dtype: _DTypeLike) -> _DType:
  """Returns validated media data type."""
  dtype = np.dtype(dtype)
  if not issubclass(dtype.type, (np.unsignedinteger, np.floating)):
    raise ValueError(
        f'Type {dtype} is not a valid media data type (uint or float).'
    )
  return dtype


def _as_valid_media_array(x: _ArrayLike) -> _NDArray:
  """Converts to ndarray (if not already), and checks validity of data type."""
  a = np.asarray(x)
  if a.dtype == bool:
    a = a.astype(np.uint8) * np.iinfo(np.uint8).max
  _as_valid_media_type(a.dtype)
  return a


def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type.  The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0.  The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0.  The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
        dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result


def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0.  See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)


def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)


# ** Functions to generate example image and video data.


def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing.  See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)


def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing.  See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])


# ** Color-space conversions.

# Same matrix values as in two sources:
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L377
# https://github.com/tensorflow/tensorflow/blob/r1.14/tensorflow/python/ops/image_ops_impl.py#L2754
_YUV_FROM_RGB_MATRIX = np.array(
    [
        [0.299, -0.14714119, 0.61497538],
        [0.587, -0.28886916, -0.51496512],
        [0.114, 0.43601035, -0.10001026],
    ],
    dtype=np.float32,
)
_RGB_FROM_YUV_MATRIX = np.linalg.inv(_YUV_FROM_RGB_MATRIX)
_YUV_CHROMA_OFFSET = np.array([0.0, 0.5, 0.5], dtype=np.float32)


def yuv_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YUV [0,1] color space.

  Note that the "YUV" color space used by video compressors is actually YCbCr!

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return rgb @ _YUV_FROM_RGB_MATRIX + _YUV_CHROMA_OFFSET


def rgb_from_yuv(yuv: _ArrayLike) -> _NDArray:
  """Returns the YUV image/video mapped to RGB [0,1] color space."""
  yuv = to_float01(yuv)
  if yuv.shape[-1] != 3:
    raise ValueError(f'The last dimension in {yuv.shape} is not 3.')
  return (yuv - _YUV_CHROMA_OFFSET) @ _RGB_FROM_YUV_MATRIX


# Same matrix values as in
# https://github.com/scikit-image/scikit-image/blob/master/skimage/color/colorconv.py#L1654
# and https://en.wikipedia.org/wiki/YUV#Studio_swing_for_BT.601
_YCBCR_FROM_RGB_MATRIX = np.array(
    [
        [65.481, 128.553, 24.966],
        [-37.797, -74.203, 112.0],
        [112.0, -93.786, -18.214],
    ],
    dtype=np.float32,
).transpose()
_RGB_FROM_YCBCR_MATRIX = np.linalg.inv(_YCBCR_FROM_RGB_MATRIX)
_YCBCR_OFFSET = np.array([16.0, 128.0, 128.0], dtype=np.float32)
# Note that _YCBCR_FROM_RGB_MATRIX =~ _YUV_FROM_RGB_MATRIX * [219, 256, 182];
# https://en.wikipedia.org/wiki/YUV: "Y' values are conventionally shifted and
# scaled to the range [16, 235] (referred to as studio swing or 'TV levels')";
# "studio range of 16-240 for U and V".  (Where does value 182 come from?)


def ycbcr_from_rgb(rgb: _ArrayLike) -> _NDArray:
  """Returns the RGB image/video mapped to YCbCr [0,1] color space.

  The YCbCr color space is the one called "YUV" by video compressors.

  Args:
    rgb: Input image in sRGB space.
  """
  rgb = to_float01(rgb)
  if rgb.shape[-1] != 3:
    raise ValueError(f'The last dimension in {rgb.shape} is not 3.')
  return (rgb @ _YCBCR_FROM_RGB_MATRIX + _YCBCR_OFFSET) / 255.0


def rgb_from_ycbcr(ycbcr: _ArrayLike) -> _NDArray:
  """Returns the YCbCr image/video mapped to RGB [0,1] color space."""
  ycbcr = to_float01(ycbcr)
  if ycbcr.shape[-1] != 3:
    raise ValueError(f'The last dimension in {ycbcr.shape} is not 3.')
  return (ycbcr * 255.0 - _YCBCR_OFFSET) @ _RGB_FROM_YCBCR_MATRIX


# ** Image processing.


def _pil_image(image: _ArrayLike, mode: str | None = None) -> PIL.Image.Image:
  """Returns a PIL image given a numpy matrix (either uint8 or float [0,1])."""
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  pil_image: PIL.Image.Image = PIL.Image.fromarray(image, mode=mode)
  return pil_image


def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )


# ** Video processing.


def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])


# ** General I/O.


def _is_url(path_or_url: _Path) -> bool:
  return isinstance(path_or_url, str) and path_or_url.startswith(
      ('http://', 'https://', 'file://')
  )


def read_contents(path_or_url: _Path) -> bytes:
  """Returns the contents of the file specified by either a path or URL."""
  data: bytes
  if _is_url(path_or_url):
    assert isinstance(path_or_url, str)
    headers = {'User-Agent': 'Chrome'}
    request = urllib.request.Request(path_or_url, headers=headers)
    with urllib.request.urlopen(request) as response:
      data = response.read()
  else:
    with _open(path_or_url, 'rb') as f:
      data = f.read()
  return data


@contextlib.contextmanager
def _read_via_local_file(path_or_url: _Path) -> Iterator[str]:
  """Context to copy a remote file locally to read from it.

  Args:
    path_or_url: File, which may be remote.

  Yields:
    The name of a local file which may be a copy of a remote file.
  """
  if _is_url(path_or_url) or not _path_is_local(path_or_url):
    suffix = pathlib.Path(path_or_url).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      tmp_path.write_bytes(read_contents(path_or_url))
      yield str(tmp_path)
  else:
    yield str(path_or_url)


@contextlib.contextmanager
def _write_via_local_file(path: _Path) -> Iterator[str]:
  """Context to write a temporary local file and subsequently copy it remotely.

  Args:
    path: File, which may be remote.

  Yields:
    The name of a local file which may be subsequently copied remotely.
  """
  if _path_is_local(path):
    yield str(path)
  else:
    suffix = pathlib.Path(path).suffix
    with tempfile.TemporaryDirectory() as directory_name:
      tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
      yield str(tmp_path)
      with _open(path, mode='wb') as f:
        f.write(tmp_path.read_bytes())


class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir


# ** Image I/O.


def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)


def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object.  If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)


def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.17.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = cast(_NDArray, rgb_from_scalar(a))
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
    a = a[..., :3]
  return a


def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()


def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array.  If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)


def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s


def _get_width_height(
    width: int | None, height: int | None, shape: tuple[int, int]
) -> tuple[int, int]:
  """Returns (width, height) given optional parameters and image shape."""
  assert len(shape) == 2, shape
  if width and height:
    return width, height
  if width and not height:
    return width, int(width * (shape[0] / shape[1]) + 0.5)
  if height and not width:
    return int(height * (shape[1] / shape[0]) + 0.5), height
  return shape[::-1]


def _ensure_mapped_to_rgb(
    image: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Ensure image is mapped to RGB."""
  image = _as_valid_media_array(image)
  if not (image.ndim == 2 or (image.ndim == 3 and image.shape[2] in (1, 3, 4))):
    raise ValueError(
        f'Image with shape {image.shape} is neither a 2D array'
        ' nor a 3D array with 1, 3, or 4 channels.'
    )
  if image.ndim == 3 and image.shape[2] == 1:
    image = image[:, :, 0]
  if image.ndim == 2:
    image = to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
  return image


def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)


def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`.  Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape = image.shape[0], image.shape[1]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    warnings.warn('mediapy: subsampling images to reduce HTML size')
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None


def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images.  Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels.  There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)


# ** Video I/O.


def _filename_suffix_from_codec(codec: str) -> str:
  if codec == 'gif':
    return '.gif'
  if codec == 'vp9':
    return '.webm'

  return '.mp4'


def _get_ffmpeg_path() -> str:
  path = _search_for_ffmpeg_path()
  if not path:
    raise RuntimeError(
        f"Program '{_config.ffmpeg_name_or_path}' is not found;"
        " perhaps install ffmpeg using 'apt install ffmpeg'."
    )
  return path


@typing.overload
def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: None = None,  # No encoding -> bytes
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[bytes]:
  ...


@typing.overload
def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: str = ...,  # Encoding -> str
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[str]:
  ...


def _run_ffmpeg(
    ffmpeg_args: Sequence[str],
    stdin: int | None = None,
    stdout: int | None = None,
    stderr: int | None = None,
    encoding: str | None = None,
    allowed_input_files: Sequence[str] | None = None,
    allowed_output_files: Sequence[str] | None = None,
) -> subprocess.Popen[bytes] | subprocess.Popen[str]:
  """Runs ffmpeg with the given args.

  Args:
    ffmpeg_args: The args to pass to ffmpeg.
    stdin: Same as in `subprocess.Popen`.
    stdout: Same as in `subprocess.Popen`.
    stderr: Same as in `subprocess.Popen`.
    encoding: Same as in `subprocess.Popen`.
    allowed_input_files: The input files to allow for ffmpeg.
    allowed_output_files: The output files to allow for ffmpeg.

  Returns:
    The subprocess.Popen object with running ffmpeg process.
  """
  argv = []
  env: Any = {}
  ffmpeg_path = _get_ffmpeg_path()

  # Allowed input and output files are not supported in open source.
  del allowed_input_files
  del allowed_output_files

  argv.append(ffmpeg_path)
  argv.extend(ffmpeg_args)

  return subprocess.Popen(
      argv,
      stdin=stdin,
      stdout=stdout,
      stderr=stderr,
      encoding=encoding,
      env=env,
  )


def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None


class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.  We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None


def _get_video_metadata(path: _Path) -> VideoMetadata:
  """Returns attributes of video stored in the specified local file."""
  if not pathlib.Path(path).is_file():
    raise RuntimeError(f"Video file '{path}' is not found.")

  command = [
      '-nostdin',
      '-i',
      str(path),
      '-acodec',
      'copy',
      # Necessary to get "frame= *(\d+)" using newer ffmpeg versions.
      # Previously, was `'-vcodec', 'copy'`
      '-vf',
      'select=1',
      '-vsync',
      '0',
      '-f',
      'null',
      '-',
  ]
  with _run_ffmpeg(
      command,
      allowed_input_files=[str(path)],
      stderr=subprocess.PIPE,
      encoding='utf-8',
  ) as proc:
    _, err = proc.communicate()
  bps = fps = num_images = width = height = rotation = None
  before_output_info = True
  for line in err.split('\n'):
    if line.startswith('Output '):
      before_output_info = False
    if match := re.search(r', bitrate: *([\d.]+) kb/s', line):
      bps = int(match.group(1)) * 1000
    if matches := re.findall(r'frame= *(\d+) ', line):
      num_images = int(matches[-1])
    if 'Stream #0:' in line and ': Video:' in line and before_output_info:
      if not (match := re.search(r', (\d+)x(\d+)', line)):
        raise RuntimeError(f'Unable to parse video dimensions in line {line}')
      width, height = int(match.group(1)), int(match.group(2))
      if match := re.search(r', ([\d.]+) fps', line):
        fps = float(match.group(1))
      elif str(path).endswith('.gif'):
        # Some GIF files lack a framerate attribute; use a reasonable default.
        fps = 10
      else:
        raise RuntimeError(f'Unable to parse video framerate in line {line}')
    if match := re.fullmatch(r'\s*rotate\s*:\s*(\d+)', line):
      rotation = int(match.group(1))
    if match := re.fullmatch(r'.*rotation of (-?\d+).*\sdegrees', line):
      rotation = int(match.group(1))
  if not num_images:
    num_images = -1
  if not width:
    raise RuntimeError(f'Unable to parse video header: {err}')
  # By default, ffmpeg enables "-autorotate"; we just fix the dimensions.
  if rotation in (90, 270, -90, -270):
    width, height = height, width
  assert height is not None and width is not None
  shape = height, width
  assert fps is not None
  return VideoMetadata(num_images, shape, fps, bps)


class _VideoIO:
  """Base class for `VideoReader` and `VideoWriter`."""

  def _get_pix_fmt(self, dtype: _DType, image_format: str) -> str:
    """Returns ffmpeg pix_fmt given data type and image format."""
    native_endian_suffix = {'little': 'le', 'big': 'be'}[sys.byteorder]
    return {
        np.uint8: {
            'rgb': 'rgb24',
            'yuv': 'yuv444p',
            'gray': 'gray',
        },
        np.uint16: {
            'rgb': 'rgb48' + native_endian_suffix,
            'yuv': 'yuv444p16' + native_endian_suffix,
            'gray': 'gray16' + native_endian_suffix,
        },
    }[dtype.type][image_format]


class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb').  If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values.  If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values.  If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images.  The default is `np.uint8`.  Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream.  This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from.  The default is 0.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader.  (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None


class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video.  Its suffix (e.g. '.mp4') determines the video container
      format.  The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames.  The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb').  If 'rgb', each image
      has shape=(height, width, 3) or (height, width).  If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values.  If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`).  The default is `np.uint8`.  Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc.  The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = _run_ffmpeg(
          command,
          stdin=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_output_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization.  If
        `input_format` is 'gray', the image must be 2D.  For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels.  For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode('utf-8')
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video.  (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode('utf-8')
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None


class _VideoArray(npt.NDArray[Any]):
  """Wrapper to add a VideoMetadata `metadata` attribute to a numpy array."""

  metadata: VideoMetadata | None

  def __new__(
      cls: Type['_VideoArray'],
      input_array: _NDArray,
      metadata: VideoMetadata | None = None,
  ) -> '_VideoArray':
    obj: _VideoArray = np.asarray(input_array).view(cls)
    obj.metadata = metadata
    return obj

  def __array_finalize__(self, obj: Any) -> None:
    if obj is None:
      return
    self.metadata = getattr(obj, 'metadata', None)


def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'.  The returned array has an
    attribute `metadata` containing `VideoMetadata` information.  This enables
    `show_video` to retrieve the framerate in `metadata.fps`.
Note that the 1822 metadata attribute is lost in most subsequent `numpy` operations. 1823 """ 1824 with VideoReader(path_or_url, **kwargs) as reader: 1825 return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata) 1826 1827 1828def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None: 1829 """Writes images to a compressed video file. 1830 1831 >>> video = moving_circle((480, 640), num_images=60) 1832 >>> write_video('/tmp/v.mp4', video, fps=60, qp=18) 1833 >>> show_video(read_video('/tmp/v.mp4')) 1834 1835 Args: 1836 path: Output video file. 1837 images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D 1838 arrays. 1839 **kwargs: Additional parameters for `VideoWriter`. 1840 """ 1841 first_image, images = _peek_first(images) 1842 shape = first_image.shape[0], first_image.shape[1] 1843 dtype = first_image.dtype 1844 if dtype == bool: 1845 dtype = np.dtype(np.uint8) 1846 elif np.issubdtype(dtype, np.floating): 1847 dtype = np.dtype(np.uint16) 1848 kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs} 1849 with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer: 1850 for image in images: 1851 writer.add_image(image) 1852 1853 1854def compress_video( 1855 images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any 1856) -> bytes: 1857 """Returns a buffer containing a compressed video. 1858 1859 The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec, 1860 and mp4 otherwise. 1861 1862 >>> video = read_video('/tmp/river.mp4') 1863 >>> data = compress_video(video, bps=10_000_000) 1864 >>> print(len(data)) 1865 1866 >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10) 1867 1868 Args: 1869 images: Iterable over video frames. 1870 codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264', 1871 'hevc', 'vp9', or 'gif'). 1872 **kwargs: Additional parameters for `VideoWriter`. 

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()


def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)


def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s


def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)


def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings. Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units). Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`. Each video
      must be an iterable of images. If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 or gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None


# Local Variables:
# fill-column: 80
# End:
def show_image(
    image: _ArrayLike, *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays an image in the notebook and optionally saves it to a file.

  See `show_images`.

  >>> show_image(np.random.rand(100, 100))
  >>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
  >>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
  >>> show_image(read_image('/tmp/image.png'))
  >>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
  >>> show_image(read_image(url))

  Args:
    image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
    title: Optional text shown centered above the image.
    **kwargs: See `show_images`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_images([np.asarray(image)], [title], **kwargs)
Displays an image in the notebook and optionally saves it to a file.
See show_images.
>>> show_image(np.random.rand(100, 100))
>>> show_image(np.random.randint(0, 256, size=(80, 80, 3), dtype='uint8'))
>>> show_image(np.random.rand(10, 10) - 0.5, cmap='bwr', height=100)
>>> show_image(read_image('/tmp/image.png'))
>>> url = 'https://github.com/hhoppe/data/raw/main/image.png'
>>> show_image(read_image(url))
Arguments:
- image: 2D array-like, or 3D array-like with 1, 3, or 4 channels.
- title: Optional text shown centered above the image.
- **kwargs: See show_images.
Returns:
html string if return_html is True.
def show_images(
    images: Iterable[_ArrayLike] | Mapping[str, _ArrayLike],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
    border: bool | str = False,
    ylabel: str = '',
    html_class: str = 'show_images',
    pixelated: bool | None = None,
    return_html: bool = False,
) -> str | None:
  """Displays a row of images in the IPython/Jupyter notebook.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled image to a file in that directory based on its title.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> show_images([image1, image2])
  >>> show_images({'random image': image1, 'color ramp': image2}, height=128)
  >>> show_images([image1, image2] * 5, columns=4, border=True)

  Args:
    images: Iterable of images, or dictionary of `{title: image}`. Each image
      must be either a 2D array or a 3D array with 1, 3, or 4 channels.
    titles: Optional strings shown above the corresponding images.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each image whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of images per row.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if
      False, sets 'image-rendering: auto'; if None, uses pixelated rendering
      only on images for which `width` or `height` introduces magnification.
    return_html: If `True` return the raw HTML `str` instead of displaying.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(images, Mapping):
    if titles is not None:
      raise ValueError('Cannot have images dictionary and titles parameter.')
    list_titles, list_images = list(images.keys()), list(images.values())
  else:
    list_images = list(images)
    list_titles = [None] * len(list_images) if titles is None else list(titles)
    if len(list_images) != len(list_titles):
      raise ValueError(
          'Number of images does not match number of titles'
          f' ({len(list_images)} vs {len(list_titles)}).'
      )

  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in list_images
  ]

  def maybe_downsample(image: _NDArray) -> _NDArray:
    shape = image.shape[0], image.shape[1]
    w, h = _get_width_height(width, height, shape)
    if w < shape[1] or h < shape[0]:
      image = resize_image(image, (h, w))
    return image

  if downsample:
    list_images = [maybe_downsample(image) for image in list_images]
  png_datas = [compress_image(to_uint8(image)) for image in list_images]

  for title, png_data in zip(list_titles, png_datas):
    if title is not None and _config.show_save_dir:
      path = pathlib.Path(_config.show_save_dir) / f'{title}.png'
      with _open(path, mode='wb') as f:
        f.write(png_data)

  def html_from_compressed_images() -> str:
    html_strings = []
    for image, title, png_data in zip(list_images, list_titles, png_datas):
      w, h = _get_width_height(width, height, image.shape[:2])
      magnified = h > image.shape[0] or w > image.shape[1]
      pixelated2 = pixelated if pixelated is not None else magnified
      html_strings.append(
          html_from_compressed_image(
              png_data, w, h, title=title, border=border, pixelated=pixelated2
          )
      )
    # Create single-row tables each with no more than 'columns' elements.
    table_strings = []
    for row_html_strings in _chunked(html_strings, columns):
      td = '<td style="padding:1px;">'
      s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
      if ylabel:
        style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
        s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
      table_strings.append(
          f'<table class="{html_class}"'
          f' style="border-spacing:0px;"><tr>{s}</tr></table>'
      )
    return ''.join(table_strings)

  s = html_from_compressed_images()
  while len(s) > _IPYTHON_HTML_SIZE_LIMIT * 0.5:
    warnings.warn('mediapy: subsampling images to reduce HTML size')
    list_images = [image[::2, ::2] for image in list_images]
    png_datas = [compress_image(to_uint8(image)) for image in list_images]
    s = html_from_compressed_images()
  if return_html:
    return s
  _display_html(s)
  return None
Displays a row of images in the IPython/Jupyter notebook.
If a directory has been specified using set_show_save_dir, also saves each
titled image to a file in that directory based on its title.
>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> show_images([image1, image2])
>>> show_images({'random image': image1, 'color ramp': image2}, height=128)
>>> show_images([image1, image2] * 5, columns=4, border=True)
Arguments:
- images: Iterable of images, or dictionary of {title: image}. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels.
- titles: Optional strings shown above the corresponding images.
- width: Optional, overrides displayed width (in pixels).
- height: Optional, overrides displayed height (in pixels).
- downsample: If True, each image whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
- columns: Optional, maximum number of images per row.
- vmin: For single-channel image, explicit min value for display.
- vmax: For single-channel image, explicit max value for display.
- cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- ylabel: Text (rotated by 90 degrees) shown on the left of each row.
- html_class: CSS class name used in definition of HTML element.
- pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'; if False, sets 'image-rendering: auto'; if None, uses pixelated rendering only on images for which width or height introduces magnification.
- return_html: If True, return the raw HTML str instead of displaying.
Returns:
html string if return_html is True.
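The interplay of width, height, downsample, and pixelated can be sketched in plain Python. The names `display_size` and `display_plan` are hypothetical, and the aspect-preserving rounding in `display_size` is an assumption about the private `_get_width_height` helper, whose body is not shown on this page. Only the two size comparisons follow the `show_images` source.

```python
from __future__ import annotations


def display_size(
    width: int | None, height: int | None, shape: tuple[int, int]
) -> tuple[int, int]:
  """Returns the displayed (width, height) for an image of shape (h, w).

  Assumption: a missing dimension is derived by preserving the aspect ratio.
  """
  image_h, image_w = shape
  if width and height:
    return width, height
  if width:
    return width, int(width * image_h / image_w + 0.5)
  if height:
    return int(height * image_w / image_h + 0.5), height
  return image_w, image_h


def display_plan(
    shape: tuple[int, int],
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    pixelated: bool | None = None,
) -> dict[str, object]:
  """Summarizes how an image of the given (h, w) shape would be displayed."""
  w, h = display_size(width, height, shape)
  # Resample only when the display is smaller than the image in some dimension.
  resample = downsample and (w < shape[1] or h < shape[0])
  # After resampling, the stored image matches the display size.
  stored = (h, w) if resample else shape
  # Magnification is judged against the (possibly resampled) stored image.
  magnified = h > stored[0] or w > stored[1]
  return {
      'size': (w, h),
      'resample': resample,
      'pixelated': magnified if pixelated is None else pixelated,
  }


print(display_plan((32, 32), height=64))    # Small image shown enlarged.
print(display_plan((480, 640), width=320))  # Large image shown reduced.
```

Under these assumptions, the 32x32 image is enlarged without resampling and defaults to pixelated rendering, while the 480x640 image is resampled to 320x240 and uses the browser's default rendering.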
def compare_images(
    images: Iterable[_ArrayLike],
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> None:
  """Compare two images using an interactive slider.

  Displays an HTML slider component to interactively swipe between two images.
  The slider functionality requires that the web browser have Internet access.
  See additional info in `https://github.com/sneas/img-comparison-slider`.

  >>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
  >>> compare_images([image1, image2])

  Args:
    images: Iterable of images. Each image must be either a 2D array or a 3D
      array with 1, 3, or 4 channels. There must be exactly two images.
    vmin: For single-channel image, explicit min value for display.
    vmax: For single-channel image, explicit max value for display.
    cmap: For single-channel image, `pyplot` color map or callable to map 1D to
      3D color.
  """
  list_images = [
      _ensure_mapped_to_rgb(image, vmin=vmin, vmax=vmax, cmap=cmap)
      for image in images
  ]
  if len(list_images) != 2:
    raise ValueError('The number of images must be 2.')
  png_datas = [compress_image(to_uint8(image)) for image in list_images]
  b64_1, b64_2 = [
      base64.b64encode(png_data).decode('utf-8') for png_data in png_datas
  ]
  s = _IMAGE_COMPARISON_HTML.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)
  _display_html(s)
Compare two images using an interactive slider.
Displays an HTML slider component to interactively swipe between two images.
The slider functionality requires that the web browser have Internet access.
See additional info in https://github.com/sneas/img-comparison-slider.
>>> image1, image2 = np.random.rand(64, 64, 3), color_ramp((64, 64))
>>> compare_images([image1, image2])
Arguments:
- images: Iterable of images. Each image must be either a 2D array or a 3D array with 1, 3, or 4 channels. There must be exactly two images.
- vmin: For single-channel image, explicit min value for display.
- vmax: For single-channel image, explicit max value for display.
- cmap: For single-channel image, pyplot color map or callable to map 1D to 3D color.
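The embedding step in compare_images (base64-encode two PNG buffers, then substitute them into an HTML template) can be illustrated with the standard library alone. This is a minimal sketch: `TEMPLATE` and `comparison_html` are hypothetical stand-ins, the real `_IMAGE_COMPARISON_HTML` template is not reproduced here, and the byte strings are placeholders rather than valid PNG data.

```python
import base64

# Hypothetical stand-in for the library's private HTML template.
TEMPLATE = (
    '<img-comparison-slider>'
    '<img slot="first" src="data:image/png;base64,{b64_1}"/>'
    '<img slot="second" src="data:image/png;base64,{b64_2}"/>'
    '</img-comparison-slider>'
)


def comparison_html(png_datas: list) -> str:
  """Embeds exactly two byte strings into the slider template."""
  if len(png_datas) != 2:
    raise ValueError('The number of images must be 2.')
  b64_1, b64_2 = [base64.b64encode(data).decode('utf-8') for data in png_datas]
  # Plain str.replace leaves any other braces in the template untouched;
  # that this is the motivation in the library is only a guess.
  return TEMPLATE.replace('{b64_1}', b64_1).replace('{b64_2}', b64_2)


html = comparison_html([b'first image bytes', b'second image bytes'])
print(html)
```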
def show_video(
    images: Iterable[_NDArray], *, title: str | None = None, **kwargs: Any
) -> str | None:
  """Displays a video in the IPython notebook and optionally saves it to a file.

  See `show_videos`.

  >>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
  >>> show_video(video, title='River video')

  >>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)

  >>> show_video(read_video('/tmp/river.mp4'))

  Args:
    images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D
      arrays).
    title: Optional text shown centered above the video.
    **kwargs: See `show_videos`.

  Returns:
    html string if `return_html` is `True`.
  """
  return show_videos([images], [title], **kwargs)
Displays a video in the IPython notebook and optionally saves it to a file.
See show_videos.
>>> video = read_video('https://github.com/hhoppe/data/raw/main/video.mp4')
>>> show_video(video, title='River video')
>>> show_video(moving_circle((80, 80), num_images=10), fps=5, border=True)
>>> show_video(read_video('/tmp/river.mp4'))
Arguments:
- images: Iterable of video frames (e.g., a 4D array or a list of 2D or 3D arrays).
- title: Optional text shown centered above the video.
- **kwargs: See show_videos.
Returns:
html string if return_html is True.
def show_videos(
    videos: Iterable[Iterable[_NDArray]] | Mapping[str, Iterable[_NDArray]],
    titles: Iterable[str | None] | None = None,
    *,
    width: int | None = None,
    height: int | None = None,
    downsample: bool = True,
    columns: int | None = None,
    fps: float | None = None,
    bps: int | None = None,
    qp: int | None = None,
    codec: str = 'h264',
    ylabel: str = '',
    html_class: str = 'show_videos',
    return_html: bool = False,
    **kwargs: Any,
) -> str | None:
  """Displays a row of videos in the IPython notebook.

  Creates HTML with `<video>` tags containing embedded H264-encoded bytestrings.
  If `codec` is set to 'gif', we instead use `<img>` tags containing embedded
  GIF-encoded bytestrings. Note that the resulting GIF animations skip frames
  when the `fps` period is not a multiple of 10 ms units (GIF frame delay
  units). Encoding at `fps` = 20.0, 25.0, or 50.0 works fine.

  If a directory has been specified using `set_show_save_dir`, also saves each
  titled video to a file in that directory based on its title.

  Args:
    videos: Iterable of videos, or dictionary of `{title: video}`. Each video
      must be an iterable of images. If a video object has a `metadata`
      (`VideoMetadata`) attribute, its `fps` field provides a default framerate.
    titles: Optional strings shown above the corresponding videos.
    width: Optional, overrides displayed width (in pixels).
    height: Optional, overrides displayed height (in pixels).
    downsample: If True, each video whose width or height is greater than the
      specified `width` or `height` is resampled to the display resolution. This
      improves antialiasing and reduces the size of the notebook.
    columns: Optional, maximum number of videos per row.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
    bps: Bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    codec: Compression algorithm; must be either 'h264' or 'gif'.
    ylabel: Text (rotated by 90 degrees) shown on the left of each row.
    html_class: CSS class name used in definition of HTML element.
    return_html: If `True` return the raw HTML `str` instead of displaying.
    **kwargs: Additional parameters (`border`, `loop`, `autoplay`) for
      `html_from_compressed_video`.

  Returns:
    html string if `return_html` is `True`.
  """
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError(
          'Cannot have both a video dictionary and a titles parameter.'
      )
    list_titles = list(videos.keys())
    list_videos = list(videos.values())
  else:
    list_videos = list(cast('Iterable[_NDArray]', videos))
    list_titles = [None] * len(list_videos) if titles is None else list(titles)
    if len(list_videos) != len(list_titles):
      raise ValueError(
          'Number of videos does not match number of titles'
          f' ({len(list_videos)} vs {len(list_titles)}).'
      )
  if codec not in {'h264', 'gif'}:
    raise ValueError(f'Codec {codec} is neither h264 or gif.')

  html_strings = []
  for video, title in zip(list_videos, list_titles):
    metadata: VideoMetadata | None = getattr(video, 'metadata', None)
    first_image, video = _peek_first(video)
    w, h = _get_width_height(width, height, first_image.shape[:2])
    if downsample and (w < first_image.shape[1] or h < first_image.shape[0]):
      # Not resize_video() because each image may have different depth and type.
      video = [resize_image(image, (h, w)) for image in video]
      first_image = video[0]
    data = compress_video(
        video, metadata=metadata, fps=fps, bps=bps, qp=qp, codec=codec
    )
    if title is not None and _config.show_save_dir:
      suffix = _filename_suffix_from_codec(codec)
      path = pathlib.Path(_config.show_save_dir) / f'{title}{suffix}'
      with _open(path, mode='wb') as f:
        f.write(data)
    if codec == 'gif':
      pixelated = h > first_image.shape[0] or w > first_image.shape[1]
      html_string = html_from_compressed_image(
          data, w, h, title=title, fmt='gif', pixelated=pixelated, **kwargs
      )
    else:
      html_string = html_from_compressed_video(
          data, w, h, title=title, **kwargs
      )
    html_strings.append(html_string)

  # Create single-row tables each with no more than 'columns' elements.
  table_strings = []
  for row_html_strings in _chunked(html_strings, columns):
    td = '<td style="padding:1px;">'
    s = ''.join(f'{td}{e}</td>' for e in row_html_strings)
    if ylabel:
      style = 'writing-mode:vertical-lr; transform:rotate(180deg);'
      s = f'{td}<span style="{style}">{ylabel}</span></td>' + s
    table_strings.append(
        f'<table class="{html_class}"'
        f' style="border-spacing:0px;"><tr>{s}</tr></table>'
    )
  s = ''.join(table_strings)
  if return_html:
    return s
  _display_html(s)
  return None
Displays a row of videos in the IPython notebook.
Creates HTML with <video> tags containing embedded H264-encoded bytestrings.
If codec is set to 'gif', we instead use <img> tags containing embedded
GIF-encoded bytestrings. Note that the resulting GIF animations skip frames
when the fps period is not a multiple of 10 ms units (GIF frame delay
units). Encoding at fps = 20.0, 25.0, or 50.0 works fine.
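The GIF timing constraint can be checked with a few lines of arithmetic. The sketch below uses a hypothetical helper, `gif_delay_is_exact`, that is not part of mediapy; it only tests whether one frame period (1/fps seconds) is a whole number of 10 ms units.

```python
from fractions import Fraction


def gif_delay_is_exact(fps) -> bool:
  """Returns True if one frame period at `fps` is a whole number of 10 ms."""
  # GIF frame delays are stored in hundredths of a second, so the frame
  # period 1/fps must equal an integer number of those units.
  units = Fraction(100) / Fraction(str(fps))
  return units.denominator == 1


for fps in (20.0, 25.0, 50.0, 30.0, 60.0):
  print(fps, gif_delay_is_exact(fps))
```

Rates of 20, 25, and 50 frames/s give periods of 50, 40, and 20 ms and pass the check; 30 and 60 frames/s give 33.3 and 16.7 ms and do not.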
If a directory has been specified using set_show_save_dir, also saves each
titled video to a file in that directory based on its title.
Arguments:
- videos: Iterable of videos, or dictionary of {title: video}. Each video must be an iterable of images. If a video object has a metadata (VideoMetadata) attribute, its fps field provides a default framerate.
- titles: Optional strings shown above the corresponding videos.
- width: Optional, overrides displayed width (in pixels).
- height: Optional, overrides displayed height (in pixels).
- downsample: If True, each video whose width or height is greater than the specified width or height is resampled to the display resolution. This improves antialiasing and reduces the size of the notebook.
- columns: Optional, maximum number of videos per row.
- fps: Frames-per-second framerate (default is 60.0 except 25.0 for GIF).
- bps: Bits-per-second bitrate (default None).
- qp: Quantization parameter for video compression quality (default None).
- codec: Compression algorithm; must be either 'h264' or 'gif'.
- ylabel: Text (rotated by 90 degrees) shown on the left of each row.
- html_class: CSS class name used in definition of HTML element.
- return_html: If True, return the raw HTML str instead of displaying.
- **kwargs: Additional parameters (border, loop, autoplay) for html_from_compressed_video.
Returns:
html string if return_html is True.
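Two bookkeeping steps in show_videos, pairing titles with videos and splitting the results into rows of at most `columns` entries, can be sketched without any media dependencies. `pair_titles` and `chunked` are hypothetical names; in particular, the behavior of the private `_chunked` helper for `columns=None` is assumed here to be a single row.

```python
from __future__ import annotations

from collections.abc import Iterable, Mapping
from typing import Any


def pair_titles(
    videos: Iterable[Any] | Mapping[str, Any],
    titles: Iterable[str | None] | None = None,
) -> list[tuple[str | None, Any]]:
  """Returns (title, video) pairs from a dictionary or from parallel lists."""
  if isinstance(videos, Mapping):
    if titles is not None:
      raise ValueError('Cannot have both a dictionary and a titles parameter.')
    return list(videos.items())
  list_videos = list(videos)
  list_titles = [None] * len(list_videos) if titles is None else list(titles)
  if len(list_videos) != len(list_titles):
    raise ValueError('Number of videos does not match number of titles.')
  return list(zip(list_titles, list_videos))


def chunked(items: list[Any], n: int | None) -> list[list[Any]]:
  """Splits items into rows of at most n elements (assumed: one row if None)."""
  if n is None:
    return [list(items)]
  return [list(items[i:i + n]) for i in range(0, len(items), n)]


# Strings stand in for real videos to keep the sketch dependency-free.
pairs = pair_titles({'a': 'video_a', 'b': 'video_b', 'c': 'video_c'})
rows = chunked(pairs, 2)
print(rows)
```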
def read_image(
    path_or_url: _Path,
    *,
    apply_exif_transpose: bool = True,
    dtype: _DTypeLike = None,
) -> _NDArray:
  """Returns an image read from a file path or URL.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    path_or_url: Path of input file.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
    dtype: Data type of the returned array. If None, `np.uint8` or `np.uint16`
      is inferred automatically.
  """
  data = read_contents(path_or_url)
  return decompress_image(data, dtype, apply_exif_transpose)
Returns an image read from a file path or URL.
Decoding is performed using PIL, which supports uint8 images with 1, 3,
or 4 channels and uint16 images with a single channel.
Arguments:
- path_or_url: Path of input file.
- apply_exif_transpose: If True, rotate image according to EXIF orientation.
- dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
def write_image(
    path: _Path, image: _ArrayLike, fmt: str = 'png', **kwargs: Any
) -> None:
  """Writes an image to a file.

  Encoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  File format is explicitly provided by `fmt` and not inferred by `path`.

  Args:
    path: Path of output file.
    image: Array-like object. If its type is float, it is converted to np.uint8
      using `to_uint8` (thus clamping to the input to the range [0.0, 1.0]).
      Otherwise it must be np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Additional parameters for `PIL.Image.save()`.
  """
  image = _as_valid_media_array(image)
  if np.issubdtype(image.dtype, np.floating):
    image = to_uint8(image)
  with _open(path, 'wb') as f:
    _pil_image(image).save(f, format=fmt, **kwargs)
Writes an image to a file.
Encoding is performed using PIL, which supports uint8 images with 1, 3,
or 4 channels and uint16 images with a single channel.
File format is explicitly provided by fmt and not inferred by path.
Arguments:
- path: Path of output file.
- image: Array-like object. If its type is float, it is converted to np.uint8 using to_uint8 (thus clamping the input to the range [0.0, 1.0]). Otherwise it must be np.uint8 or np.uint16.
- fmt: Desired compression encoding, e.g. 'png'.
- **kwargs: Additional parameters for PIL.Image.save().
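The effect of the float conversion described for the image argument can be illustrated on scalar values. This is a sketch only: `float_to_uint8` is a hypothetical helper, and rounding to nearest by adding 0.5 is an assumption about `to_uint8`, whose body is not shown on this page.

```python
def float_to_uint8(value: float) -> int:
  """Maps a float to an integer in [0, 255], clamping to [0.0, 1.0] first."""
  clamped = min(max(value, 0.0), 1.0)
  # Assumption: round to nearest by adding 0.5 before truncation.
  return int(clamped * 255 + 0.5)


for value in (-0.2, 0.0, 0.5, 1.0, 1.5):
  print(value, float_to_uint8(value))
```

Values below 0.0 or above 1.0 saturate at 0 and 255, so a float image with a wider range should be rescaled before it is written.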
def read_video(path_or_url: _Path, **kwargs: Any) -> _VideoArray:
  """Returns an array containing all images read from a compressed video file.

  >>> video = read_video('/tmp/river.mp4')
  >>> print(f'The framerate is {video.metadata.fps} frames/s.')
  >>> show_video(video)

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> show_video(read_video(url))

  Args:
    path_or_url: Input video file.
    **kwargs: Additional parameters for `VideoReader`.

  Returns:
    A 4D `numpy` array with dimensions (frame, height, width, channel), or a 3D
    array if `output_format` is specified as 'gray'. The returned array has an
    attribute `metadata` containing `VideoMetadata` information. This enables
    `show_video` to retrieve the framerate in `metadata.fps`. Note that the
    metadata attribute is lost in most subsequent `numpy` operations.
  """
  with VideoReader(path_or_url, **kwargs) as reader:
    return _VideoArray(np.array(tuple(reader)), metadata=reader.metadata)
Returns an array containing all images read from a compressed video file.
>>> video = read_video('/tmp/river.mp4')
>>> print(f'The framerate is {video.metadata.fps} frames/s.')
>>> show_video(video)
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> show_video(read_video(url))
Arguments:
- path_or_url: Input video file.
- **kwargs: Additional parameters for VideoReader.
Returns:
A 4D numpy array with dimensions (frame, height, width, channel), or a 3D array if output_format is specified as 'gray'. The returned array has an attribute metadata containing VideoMetadata information. This enables show_video to retrieve the framerate in metadata.fps. Note that the metadata attribute is lost in most subsequent numpy operations.
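The caveat about losing the metadata attribute can be illustrated with a generic ndarray subclass. TaggedArray below is a hypothetical class written for this sketch, not the type returned by read_video, and the dictionary used as metadata is likewise an assumption.

```python
import numpy as np


class TaggedArray(np.ndarray):
  """Hypothetical ndarray subclass carrying a `metadata` attribute."""

  def __new__(cls, input_array, metadata=None):
    obj = np.asarray(input_array).view(cls)
    obj.metadata = metadata
    return obj

  def __array_finalize__(self, unused_obj):
    # Called for every new view or result; the metadata is not propagated.
    self.metadata = None


frames = np.zeros((4, 2, 2, 3), dtype=np.uint8)
video = TaggedArray(frames, metadata={'fps': 30.0})
halved = video // 2          # Arithmetic creates a new array without metadata.
plain = np.asarray(video)    # A base-class array has no such attribute at all.
fps = video.metadata['fps']  # Capture what is needed before further processing.
```

In practice this means reading the framerate right after loading and passing it explicitly (for example as an fps argument) once the frames have been modified.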
def write_video(path: _Path, images: Iterable[_NDArray], **kwargs: Any) -> None:
  """Writes images to a compressed video file.

  >>> video = moving_circle((480, 640), num_images=60)
  >>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
  >>> show_video(read_video('/tmp/v.mp4'))

  Args:
    path: Output video file.
    images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D
      arrays.
    **kwargs: Additional parameters for `VideoWriter`.
  """
  first_image, images = _peek_first(images)
  shape = first_image.shape[0], first_image.shape[1]
  dtype = first_image.dtype
  if dtype == bool:
    dtype = np.dtype(np.uint8)
  elif np.issubdtype(dtype, np.floating):
    dtype = np.dtype(np.uint16)
  kwargs = {'metadata': getattr(images, 'metadata', None), **kwargs}
  with VideoWriter(path, shape=shape, dtype=dtype, **kwargs) as writer:
    for image in images:
      writer.add_image(image)
Writes images to a compressed video file.
>>> video = moving_circle((480, 640), num_images=60)
>>> write_video('/tmp/v.mp4', video, fps=60, qp=18)
>>> show_video(read_video('/tmp/v.mp4'))
Arguments:
- path: Output video file.
- images: Iterable over video frames, e.g. a 4D array or a list of 2D or 3D arrays.
- **kwargs: Additional parameters for VideoWriter.
class VideoReader(_VideoIO):
  """Context to read a compressed video as an iterable over its images.

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
  ...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
  ...   for image in reader:
  ...     print(image.shape)

  >>> with VideoReader('/tmp/river.mp4') as reader:
  ...   video = np.array(tuple(reader))

  >>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
  >>> with VideoReader(url) as reader:
  ...   show_video(reader)

  Attributes:
    path_or_url: Location of input video.
    output_format: Format of output images (default 'rgb'). If 'rgb', each
      image has shape=(height, width, 3) with R, G, B values. If 'yuv', each
      image has shape=(height, width, 3) with Y, U, V values. If 'gray', each
      image has shape=(height, width).
    dtype: Data type for output images. The default is `np.uint8`. Use of
      `np.uint16` allows reading 10-bit or 12-bit data without precision loss.
    metadata: Object storing the information retrieved from the video header.
      Its attributes are copied as attributes in this class.
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
    stream_index: The stream index to read from. The default is 0.
  """

  path_or_url: _Path
  output_format: str
  dtype: _DType
  metadata: VideoMetadata
  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
  stream_index: int
  _num_bytes_per_image: int

  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoReader':
    try:
      self._read_via_local_file = _read_via_local_file(self.path_or_url)
      # pylint: disable-next=no-member
      tmp_name = self._read_via_local_file.__enter__()

      self.metadata = _get_video_metadata(tmp_name)
      self.num_images, self.shape, self.fps, self.bps = self.metadata
      pix_fmt = self._get_pix_fmt(self.dtype, self.output_format)
      num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[self.output_format]
      bytes_per_channel = self.dtype.itemsize
      self._num_bytes_per_image = (
          math.prod(self.shape) * num_channels * bytes_per_channel
      )

      command = [
          '-v',
          'panic',
          '-nostdin',
          '-i',
          tmp_name,
          '-vcodec',
          'rawvideo',
          '-f',
          'image2pipe',
          '-map',
          f'0:v:{self.stream_index}',
          '-pix_fmt',
          pix_fmt,
          '-vsync',
          'vfr',
          '-',
      ]
      self._popen = _run_ffmpeg(
          command,
          stdout=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_input_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image

  def __iter__(self) -> Iterator[_NDArray]:
    while True:
      image = self.read()
      if image is None:
        return
      yield image

  def close(self) -> None:
    """Terminates video reader. (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None
Context to read a compressed video as an iterable over its images.
>>> with VideoReader('/tmp/river.mp4') as reader:
...   print(f'Video has {reader.num_images} images with shape={reader.shape},'
...         f' at {reader.fps} frames/sec and {reader.bps} bits/sec.')
...   for image in reader:
...     print(image.shape)
>>> with VideoReader('/tmp/river.mp4') as reader:
...   video = np.array(tuple(reader))
>>> url = 'https://github.com/hhoppe/data/raw/main/video.mp4'
>>> with VideoReader(url) as reader:
...   show_video(reader)
Attributes:
- path_or_url: Location of input video.
- output_format: Format of output images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) with R, G, B values. If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
- dtype: Data type for output images. The default is np.uint8. Use of np.uint16 allows reading 10-bit or 12-bit data without precision loss.
- metadata: Object storing the information retrieved from the video header. Its attributes are copied as attributes in this class.
- num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact.
- shape: The dimensions (height, width) of each video frame.
- fps: The framerate in frames per second.
- bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
- stream_index: The stream index to read from. The default is 0.
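The shape, output_format, and dtype attributes together fix how many raw bytes one decoded frame occupies, which is what the reader pulls from the ffmpeg pipe per image. A minimal sketch of that arithmetic follows; the function name bytes_per_frame is hypothetical, and bytes_per_channel stands for the dtype's item size (1 for np.uint8, 2 for np.uint16).

```python
import math


def bytes_per_frame(shape, output_format='rgb', bytes_per_channel=1):
  """Returns the raw size in bytes of one decoded frame."""
  num_channels = {'rgb': 3, 'yuv': 3, 'gray': 1}[output_format]
  return math.prod(shape) * num_channels * bytes_per_channel


rgb_uint8 = bytes_per_frame((480, 640))
gray_uint16 = bytes_per_frame((480, 640), 'gray', bytes_per_channel=2)
print(rgb_uint8, gray_uint16)
```

Multiplying this figure by num_images gives a rough estimate of the memory needed to hold a whole video as one array.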
  def __init__(
      self,
      path_or_url: _Path,
      *,
      stream_index: int = 0,
      output_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
  ):
    if output_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(
          f'Output format {output_format} is not rgb, yuv, or gray.'
      )
    self.path_or_url = path_or_url
    self.output_format = output_format
    self.stream_index = stream_index
    self.dtype = np.dtype(dtype)
    if self.dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self._read_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None
  def read(self) -> _NDArray | None:
    """Reads a video image frame (or None if at end of file).

    Returns:
      A numpy array in the format specified by `output_format`, i.e., a 3D
      array with 3 color channels, except for format 'gray' which is 2D.
    """
    assert self._proc, 'Error: reading from an already closed context.'
    stdout = self._proc.stdout
    assert stdout is not None
    data = stdout.read(self._num_bytes_per_image)
    if not data:  # Due to either end-of-file or subprocess error.
      self.close()  # Raises exception if subprocess had error.
      return None  # To indicate end-of-file.
    assert len(data) == self._num_bytes_per_image
    image = np.frombuffer(data, dtype=self.dtype)
    if self.output_format == 'rgb':
      image = image.reshape(*self.shape, 3)
    elif self.output_format == 'yuv':  # Convert from planar YUV to pixel YUV.
      image = np.moveaxis(image.reshape(3, *self.shape), 0, 2)
    elif self.output_format == 'gray':  # Generate 2D rather than 3D ndimage.
      image = image.reshape(*self.shape)
    else:
      raise AssertionError
    return image
Reads a video image frame (or None if at end of file).
Returns:
A numpy array in the format specified by output_format, i.e., a 3D array with 3 color channels, except for format 'gray' which is 2D.
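For the 'yuv' output format, the raw bytes arrive plane by plane and are rearranged so that each pixel carries its own (Y, U, V) triple. The sketch below reproduces that rearrangement on a tiny synthetic buffer; the 2x3 frame size and the use of np.arange as stand-in data are assumptions for illustration.

```python
import numpy as np

height, width = 2, 3
# Planar layout: all Y samples, then all U samples, then all V samples.
planar = np.arange(3 * height * width, dtype=np.uint8)
# Move the plane axis to the end so the result is (height, width, channel).
image = np.moveaxis(planar.reshape(3, height, width), 0, 2)
y_plane = image[..., 0]
print(image.shape, image[0, 0].tolist())
```

The reverse call, np.moveaxis(image, 2, 0), restores the planar layout.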
  def close(self) -> None:
    """Terminates video reader. (Called automatically at end of context.)"""
    if self._popen:
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._read_via_local_file:
      # pylint: disable-next=no-member
      self._read_via_local_file.__exit__(None, None, None)
      self._read_via_local_file = None
Terminates video reader. (Called automatically at end of context.)
class VideoWriter(_VideoIO):
  """Context to write a compressed video.

  >>> shape = 480, 640
  >>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
  ...   for image in moving_circle(shape, num_images=60):
  ...     writer.add_image(image)
  >>> show_video(read_video('/tmp/v.mp4'))


  Bitrate control may be specified using at most one of: `bps`, `qp`, or `crf`.
  If none are specified, `qp` is set to a default value.
  See https://slhck.info/video/2017/03/01/rate-control.html

  If codec is 'gif', the args `bps`, `qp`, `crf`, and `encoded_format` are
  ignored.

  Attributes:
    path: Output video. Its suffix (e.g. '.mp4') determines the video container
      format. The suffix must be '.gif' if the codec is 'gif'.
    shape: 2D spatial dimensions (height, width) of video image frames. The
      dimensions must be even if 'encoded_format' has subsampled chroma (e.g.,
      'yuv420p' or 'yuv420p10le').
    codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    metadata: Optional VideoMetadata object whose `fps` and `bps` attributes are
      used if not specified as explicit parameters.
    fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
    bps: Requested average bits-per-second bitrate (default None).
    qp: Quantization parameter for video compression quality (default None).
    crf: Constant rate factor for video compression quality (default None).
    ffmpeg_args: Additional arguments for `ffmpeg` command, e.g. '-g 30' to
      introduce I-frames, or '-bf 0' to omit B-frames.
    input_format: Format of input images (default 'rgb'). If 'rgb', each image
      has shape=(height, width, 3) or (height, width). If 'yuv', each image has
      shape=(height, width, 3) with Y, U, V values. If 'gray', each image has
      shape=(height, width).
    dtype: Expected data type for input images (any float input images are
      converted to `dtype`). The default is `np.uint8`. Use of `np.uint16` is
      necessary when encoding >8 bits/channel.
    encoded_format: Pixel format as defined by `ffmpeg -pix_fmts`, e.g.,
      'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma),
      'yuv420p10le' (10-bit per channel), etc. The default (None) selects
      'yuv420p' if all shape dimensions are even, else 'yuv444p'.
  """

  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None

  def __enter__(self) -> 'VideoWriter':
    input_pix_fmt = self._get_pix_fmt(self.dtype, self.input_format)
    try:
      self._write_via_local_file = _write_via_local_file(self.path)
      # pylint: disable-next=no-member
      tmp_name = self._write_via_local_file.__enter__()

      # Writing to stdout using ('-f', 'mp4', '-') would require
      # ('-movflags', 'frag_keyframe+empty_moov') and the result is nonportable.
      height, width = self.shape
      command = (
          [
              '-v',
              'error',
              '-f',
              'rawvideo',
              '-vcodec',
              'rawvideo',
              '-pix_fmt',
              input_pix_fmt,
              '-s',
              f'{width}x{height}',
              '-r',
              f'{self.fps}',
              '-i',
              '-',
              '-an',
              '-vcodec',
              self.codec,
              '-pix_fmt',
              self.encoded_format,
          ]
          + self._bitrate_args
          + self.ffmpeg_args
          + ['-y', tmp_name]
      )
      self._popen = _run_ffmpeg(
          command,
          stdin=subprocess.PIPE,
          stderr=subprocess.PIPE,
          allowed_output_files=[tmp_name],
      )
      self._proc = self._popen.__enter__()
    except Exception:
      self.__exit__(None, None, None)
      raise
    return self

  def __exit__(self, *_: Any) -> None:
    self.close()

  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization. If
        `input_format` is 'gray', the image must be 2D. For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels. For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode('utf-8')
      raise RuntimeError(f"Error writing '{self.path}': {s}")

  def close(self) -> None:
    """Finishes writing the video. (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode('utf-8')
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None
Context to write a compressed video.
>>> shape = 480, 640
>>> with VideoWriter('/tmp/v.mp4', shape, fps=60) as writer:
...   for image in moving_circle(shape, num_images=60):
...     writer.add_image(image)
>>> show_video(read_video('/tmp/v.mp4'))
Bitrate control may be specified using at most one of: bps, qp, or crf.
If none are specified, qp is set to a default value.
See https://slhck.info/video/2017/03/01/rate-control.html
If codec is 'gif', the args bps, qp, crf, and encoded_format are
ignored.
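The rate-control rules above can be summarized as a small pure-Python sketch that follows the constructor logic shown in the class source. The function name bitrate_args and the parameter extra_args (standing in for ffmpeg_args) are hypothetical; note that in the source the default qp is only applied when no extra ffmpeg arguments are given.

```python
import math


def bitrate_args(shape, codec='h264', bps=None, qp=None, crf=None,
                 extra_args=()):
  """Returns ffmpeg rate-control arguments for the given settings."""
  if sum(x is not None for x in (bps, qp, crf)) > 1:
    raise ValueError('Must specify at most one of bps, qp, or crf.')
  if codec == 'gif':
    return []  # Rate-control settings are ignored for GIF output.
  if bps is None and qp is None and crf is None and not extra_args:
    # Default: lower qp (higher quality) for frames up to 640x480 pixels.
    qp = 20 if math.prod(shape) <= 640 * 480 else 28
  return (
      (['-vb', f'{bps}'] if bps is not None else [])
      + (['-qp', f'{qp}'] if qp is not None else [])
      + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
  )


print(bitrate_args((480, 640)))
print(bitrate_args((1080, 1920), crf=23))
```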
Attributes:
- path: Output video. Its suffix (e.g. '.mp4') determines the video container format. The suffix must be '.gif' if the codec is 'gif'.
- shape: 2D spatial dimensions (height, width) of video image frames. The dimensions must be even if 'encoded_format' has subsampled chroma (e.g., 'yuv420p' or 'yuv420p10le').
- codec: Compression algorithm as defined by "ffmpeg -codecs" (e.g., 'h264', 'hevc', 'vp9', or 'gif').
- metadata: Optional VideoMetadata object whose fps and bps attributes are used if not specified as explicit parameters.
- fps: Frames-per-second framerate (default is 60.0 except 25.0 for 'gif').
- bps: Requested average bits-per-second bitrate (default None).
- qp: Quantization parameter for video compression quality (default None).
- crf: Constant rate factor for video compression quality (default None).
- ffmpeg_args: Additional arguments for the ffmpeg command, e.g. '-g 30' to introduce I-frames, or '-bf 0' to omit B-frames.
- input_format: Format of input images (default 'rgb'). If 'rgb', each image has shape=(height, width, 3) or (height, width). If 'yuv', each image has shape=(height, width, 3) with Y, U, V values. If 'gray', each image has shape=(height, width).
- dtype: Expected data type for input images (any float input images are converted to dtype). The default is np.uint8. Use of np.uint16 is necessary when encoding >8 bits/channel.
- encoded_format: Pixel format as defined by ffmpeg -pix_fmts, e.g., 'yuv420p' (2x2-subsampled chroma), 'yuv444p' (full-res chroma), 'yuv420p10le' (10-bit per channel), etc. The default (None) selects 'yuv420p' if all shape dimensions are even, else 'yuv444p'.
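How the fps and encoded_format defaults interact with the frame shape and codec can be sketched without ffmpeg. The function resolve_defaults is a hypothetical helper that follows the constructor logic in the class source; it omits the metadata fallback for fps.

```python
def resolve_defaults(shape, codec='h264', fps=None, encoded_format=None):
  """Returns (fps, encoded_format) after applying the documented defaults."""
  if fps is None:
    fps = 25.0 if codec == 'gif' else 60.0
  all_even = all(dim % 2 == 0 for dim in shape)
  if encoded_format is None:
    # Chroma subsampling needs even dimensions; otherwise keep full-res chroma.
    encoded_format = 'yuv420p' if all_even else 'yuv444p'
  if not all_even and encoded_format.startswith(('yuv42', 'yuvj42')):
    raise ValueError(
        f'With encoded_format {encoded_format}, video dimensions must be'
        f' even, but shape is {shape}.'
    )
  if codec == 'gif':
    encoded_format = 'pal8'  # GIF output always uses a palette format.
  return fps, encoded_format


print(resolve_defaults((480, 640)))
print(resolve_defaults((481, 640)))
```

An odd-sized video therefore encodes without error by default, but explicitly requesting 'yuv420p' for it raises ValueError.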
  def __init__(
      self,
      path: _Path,
      shape: tuple[int, int],
      *,
      codec: str = 'h264',
      metadata: VideoMetadata | None = None,
      fps: float | None = None,
      bps: int | None = None,
      qp: int | None = None,
      crf: float | None = None,
      ffmpeg_args: str | Sequence[str] = '',
      input_format: str = 'rgb',
      dtype: _DTypeLike = np.uint8,
      encoded_format: str | None = None,
  ) -> None:
    _check_2d_shape(shape)
    if fps is None and metadata:
      fps = metadata.fps
    if fps is None:
      fps = 25.0 if codec == 'gif' else 60.0
    if fps <= 0.0:
      raise ValueError(f'Frame-per-second value {fps} is invalid.')
    if bps is None and metadata:
      bps = metadata.bps
    bps = int(bps) if bps is not None else None
    if bps is not None and bps <= 0:
      raise ValueError(f'Bitrate value {bps} is invalid.')
    if qp is not None and (not isinstance(qp, int) or qp < 0):
      raise ValueError(
          f'Quantization parameter {qp} cannot be negative. It must be a'
          ' non-negative integer.'
      )
    num_rate_specifications = sum(x is not None for x in (bps, qp, crf))
    if num_rate_specifications > 1:
      raise ValueError(
          f'Must specify at most one of bps, qp, or crf ({bps}, {qp}, {crf}).'
      )
    ffmpeg_args = (
        shlex.split(ffmpeg_args)
        if isinstance(ffmpeg_args, str)
        else list(ffmpeg_args)
    )
    if input_format not in {'rgb', 'yuv', 'gray'}:
      raise ValueError(f'Input format {input_format} is not rgb, yuv, or gray.')
    dtype = np.dtype(dtype)
    if dtype.type not in (np.uint8, np.uint16):
      raise ValueError(f'Type {dtype} is not np.uint8 or np.uint16.')
    self.path = pathlib.Path(path)
    self.shape = shape
    all_dimensions_are_even = all(dim % 2 == 0 for dim in shape)
    if encoded_format is None:
      encoded_format = 'yuv420p' if all_dimensions_are_even else 'yuv444p'
    if not all_dimensions_are_even and encoded_format.startswith(
        ('yuv42', 'yuvj42')
    ):
      raise ValueError(
          f'With encoded_format {encoded_format}, video dimensions must be'
          f' even, but shape is {shape}.'
      )
    self.fps = fps
    self.codec = codec
    self.bps = bps
    self.qp = qp
    self.crf = crf
    self.ffmpeg_args = ffmpeg_args
    self.input_format = input_format
    self.dtype = dtype
    self.encoded_format = encoded_format
    if num_rate_specifications == 0 and not ffmpeg_args:
      qp = 20 if math.prod(self.shape) <= 640 * 480 else 28
    self._bitrate_args = (
        (['-vb', f'{bps}'] if bps is not None else [])
        + (['-qp', f'{qp}'] if qp is not None else [])
        + (['-vb', '0', '-crf', f'{crf}'] if crf is not None else [])
    )
    if self.codec == 'gif':
      if self.path.suffix != '.gif':
        raise ValueError(f"File '{self.path}' does not have a .gif suffix.")
      self.encoded_format = 'pal8'
      self._bitrate_args = []
      video_filter = 'split[s0][s1];[s0]palettegen[p];[s1][p]paletteuse'
      # Less common (and likely less useful) is a per-frame color palette:
      # video_filter = ('split[s0][s1];[s0]palettegen=stats_mode=single[p];'
      #                 '[s1][p]paletteuse=new=1')
      self.ffmpeg_args = ['-vf', video_filter, '-f', 'gif'] + self.ffmpeg_args
    self._write_via_local_file: Any = None
    self._popen: subprocess.Popen[bytes] | None = None
    self._proc: subprocess.Popen[bytes] | None = None
  def add_image(self, image: _NDArray) -> None:
    """Writes a video frame.

    Args:
      image: Array whose dtype and first two dimensions must match the `dtype`
        and `shape` specified in `VideoWriter` initialization. If
        `input_format` is 'gray', the image must be 2D. For the 'rgb'
        input_format, the image may be either 2D (interpreted as grayscale) or
        3D with three (R, G, B) channels. For the 'yuv' input_format, the image
        must be 3D with three (Y, U, V) channels.

    Raises:
      RuntimeError: If there is an error writing to the output file.
    """
    assert self._proc, 'Error: writing to an already closed context.'
    if issubclass(image.dtype.type, (np.floating, np.bool_)):
      image = to_type(image, self.dtype)
    if image.dtype != self.dtype:
      raise ValueError(f'Image type {image.dtype} != {self.dtype}.')
    if self.input_format == 'gray':
      if image.ndim != 2:
        raise ValueError(f'Image dimensions {image.shape} are not 2D.')
    else:
      if image.ndim == 2 and self.input_format == 'rgb':
        image = np.dstack((image, image, image))
      if not (image.ndim == 3 and image.shape[2] == 3):
        raise ValueError(f'Image dimensions {image.shape} are invalid.')
    if image.shape[:2] != self.shape:
      raise ValueError(
          f'Image dimensions {image.shape[:2]} do not match'
          f' those of the initialized video {self.shape}.'
      )
    if self.input_format == 'yuv':  # Convert from per-pixel YUV to planar YUV.
      image = np.moveaxis(image, 2, 0)
    data = image.tobytes()
    stdin = self._proc.stdin
    assert stdin is not None
    if stdin.write(data) != len(data):
      self._proc.wait()
      stderr = self._proc.stderr
      assert stderr is not None
      s = stderr.read().decode('utf-8')
      raise RuntimeError(f"Error writing '{self.path}': {s}")
Writes a video frame.
Arguments:
- image: Array whose dtype and first two dimensions must match the dtype and shape specified in VideoWriter initialization. If input_format is 'gray', the image must be 2D. For the 'rgb' input_format, the image may be either 2D (interpreted as grayscale) or 3D with three (R, G, B) channels. For the 'yuv' input_format, the image must be 3D with three (Y, U, V) channels.
Raises:
- RuntimeError: If there is an error writing to the output file.
  def close(self) -> None:
    """Finishes writing the video. (Called automatically at end of context.)"""
    if self._popen:
      assert self._proc, 'Error: closing an already closed context.'
      stdin = self._proc.stdin
      assert stdin is not None
      stdin.close()
      if self._proc.wait():
        stderr = self._proc.stderr
        assert stderr is not None
        s = stderr.read().decode('utf-8')
        raise RuntimeError(f"Error writing '{self.path}': {s}")
      self._popen.__exit__(None, None, None)
      self._popen = None
      self._proc = None
    if self._write_via_local_file:
      # pylint: disable-next=no-member
      self._write_via_local_file.__exit__(None, None, None)
      self._write_via_local_file = None
Finishes writing the video. (Called automatically at end of context.)
class VideoMetadata(NamedTuple):
  """Represents the data stored in a video container header.

  Attributes:
    num_images: Number of frames that is expected from the video stream. This
      is estimated from the framerate and the duration stored in the video
      header, so it might be inexact. We set the value to -1 if number of
      frames is not found in the header.
    shape: The dimensions (height, width) of each video frame.
    fps: The framerate in frames per second.
    bps: The estimated bitrate of the video stream in bits per second, retrieved
      from the video header.
  """

  num_images: int
  shape: tuple[int, int]
  fps: float
  bps: int | None
Represents the data stored in a video container header.
Attributes:
- num_images: Number of frames that is expected from the video stream. This is estimated from the framerate and the duration stored in the video header, so it might be inexact. We set the value to -1 if number of frames is not found in the header.
- shape: The dimensions (height, width) of each video frame.
- fps: The framerate in frames per second.
- bps: The estimated bitrate of the video stream in bits per second, retrieved from the video header.
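Because the metadata is a NamedTuple, it can be unpacked positionally or accessed by field name, and a modified copy is made with _replace. The class Metadata below is a hypothetical stand-in with the same four fields, and the sample values are invented for illustration.

```python
from typing import NamedTuple, Optional, Tuple


class Metadata(NamedTuple):
  """Hypothetical stand-in with the same four fields as VideoMetadata."""

  num_images: int
  shape: Tuple[int, int]
  fps: float
  bps: Optional[int]


metadata = Metadata(num_images=-1, shape=(480, 640), fps=30.0, bps=None)
num_images, shape, fps, bps = metadata  # Positional unpacking, in field order.
frame_count_known = num_images >= 0     # -1 marks a missing frame count.
faster = metadata._replace(fps=60.0)    # Tuples are immutable; copy to modify.
```

Checking for -1 before using num_images (for example to size a progress bar) avoids treating a missing count as a real one.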
def compress_image(
    image: _ArrayLike, *, fmt: str = 'png', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed image.

  Args:
    image: Array in a format supported by `PIL`, e.g. np.uint8 or np.uint16.
    fmt: Desired compression encoding, e.g. 'png'.
    **kwargs: Options for `PIL.Image.save()`, e.g. `optimize=True` for greater
      compression.
  """
  image = _as_valid_media_array(image)
  with io.BytesIO() as output:
    _pil_image(image).save(output, format=fmt, **kwargs)
    return output.getvalue()
Returns a buffer containing a compressed image.
Arguments:
- image: Array in a format supported by PIL, e.g. np.uint8 or np.uint16.
- fmt: Desired compression encoding, e.g. 'png'.
- **kwargs: Options for PIL.Image.save(), e.g. optimize=True for greater compression.
def decompress_image(
    data: bytes, dtype: _DTypeLike = None, apply_exif_transpose: bool = True
) -> _NDArray:
  """Returns an image from a compressed data buffer.

  Decoding is performed using `PIL`, which supports `uint8` images with 1, 3,
  or 4 channels and `uint16` images with a single channel.

  Args:
    data: Buffer containing compressed image.
    dtype: Data type of the returned array. If None, `np.uint8` or `np.uint16`
      is inferred automatically.
    apply_exif_transpose: If True, rotate image according to EXIF orientation.
  """
  pil_image: PIL.Image.Image = PIL.Image.open(io.BytesIO(data))
  if apply_exif_transpose:
    tmp_image = PIL.ImageOps.exif_transpose(pil_image)  # Future: in_place=True.
    assert tmp_image
    pil_image = tmp_image
  if dtype is None:
    dtype = np.uint16 if pil_image.mode.startswith('I') else np.uint8
  return np.array(pil_image, dtype=dtype)
Returns an image from a compressed data buffer.
Decoding is performed using PIL, which supports uint8 images with 1, 3,
or 4 channels and uint16 images with a single channel.
Arguments:
- data: Buffer containing compressed image.
- dtype: Data type of the returned array. If None, np.uint8 or np.uint16 is inferred automatically.
- apply_exif_transpose: If True, rotate image according to EXIF orientation.
def compress_video(
    images: Iterable[_NDArray], *, codec: str = 'h264', **kwargs: Any
) -> bytes:
  """Returns a buffer containing a compressed video.

  The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec,
  and mp4 otherwise.

  >>> video = read_video('/tmp/river.mp4')
  >>> data = compress_video(video, bps=10_000_000)
  >>> print(len(data))

  >>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)

  Args:
    images: Iterable over video frames.
    codec: Compression algorithm as defined by `ffmpeg -codecs` (e.g., 'h264',
      'hevc', 'vp9', or 'gif').
    **kwargs: Additional parameters for `VideoWriter`.

  Returns:
    A bytes buffer containing the compressed video.
  """
  suffix = _filename_suffix_from_codec(codec)
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / f'file{suffix}'
    write_video(tmp_path, images, codec=codec, **kwargs)
    return tmp_path.read_bytes()
Returns a buffer containing a compressed video.
The video container is 'gif' for 'gif' codec, 'webm' for 'vp9' codec, and mp4 otherwise.
>>> video = read_video('/tmp/river.mp4')
>>> data = compress_video(video, bps=10_000_000)
>>> print(len(data))
>>> data = compress_video(moving_circle((100, 100), num_images=10), fps=10)
Arguments:
- images: Iterable over video frames.
- codec: Compression algorithm as defined by ffmpeg -codecs (e.g., 'h264', 'hevc', 'vp9', or 'gif').
- **kwargs: Additional parameters for VideoWriter.
Returns:
A bytes buffer containing the compressed video.
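The codec-to-container rule stated above ('gif' for 'gif', 'webm' for 'vp9', mp4 otherwise) can be written as a small lookup. The helper name `container_suffix` is hypothetical; the library's private `_filename_suffix_from_codec` is not shown here, so this only mirrors the documented behavior.

```python
def container_suffix(codec: str) -> str:
  """Hypothetical helper: maps a codec name to a container file suffix."""
  special_cases = {'gif': '.gif', 'vp9': '.webm'}
  return special_cases.get(codec, '.mp4')  # Every other codec uses mp4.


suffixes = {codec: container_suffix(codec) for codec in ('h264', 'vp9', 'gif')}
```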
def decompress_video(data: bytes, **kwargs: Any) -> _NDArray:
  """Returns video images from an MP4-compressed data buffer."""
  with tempfile.TemporaryDirectory() as directory_name:
    tmp_path = pathlib.Path(directory_name) / 'file.mp4'
    tmp_path.write_bytes(data)
    return read_video(tmp_path, **kwargs)
Returns video images from an MP4-compressed data buffer.
def html_from_compressed_image(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    pixelated: bool = True,
    fmt: str = 'png',
) -> str:
  """Returns an HTML string with an image tag containing encoded data.

  Args:
    data: Compressed image bytes.
    width: Width of HTML image in pixels.
    height: Height of HTML image in pixels.
    title: Optional text shown centered above image.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
    fmt: Compression encoding.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  s_pixelated = 'pixelated' if pixelated else 'auto'
  s = (
      f'<img width="{width}" height="{height}"'
      f' style="{border}image-rendering:{s_pixelated}; object-fit:cover;"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s
Returns an HTML string with an image tag containing encoded data.
Arguments:
- data: Compressed image bytes.
- width: Width of HTML image in pixels.
- height: Height of HTML image in pixels.
- title: Optional text shown centered above image.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- pixelated: If True, sets the CSS style to 'image-rendering: pixelated;'.
- fmt: Compression encoding.
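The core of the returned HTML is a base64 data URI placed in the src attribute of an img tag. The sketch below isolates that step using only the standard library. `image_tag` is a hypothetical helper that omits the title, border, and pixelated styling, and the payload is just the 8-byte PNG file signature rather than a complete image.

```python
import base64


def image_tag(data: bytes, width: int, height: int, fmt: str = 'png') -> str:
  """Hypothetical helper: embeds compressed image bytes as a data URI."""
  b64 = base64.b64encode(data).decode('utf-8')
  return (
      f'<img width="{width}" height="{height}"'
      f' src="data:image/{fmt};base64,{b64}"/>'
  )


# Placeholder payload: the signature that starts every PNG file.
png_signature = b'\x89PNG\r\n\x1a\n'
html = image_tag(png_signature, width=64, height=64)
```

Embedding the bytes inline keeps the notebook output self-contained, at the cost of roughly a one-third size increase from base64 encoding.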
def html_from_compressed_video(
    data: bytes,
    width: int,
    height: int,
    *,
    title: str | None = None,
    border: bool | str = False,
    loop: bool = True,
    autoplay: bool = True,
) -> str:
  """Returns an HTML string with a video tag containing H264-encoded data.

  Args:
    data: MP4-compressed video bytes.
    width: Width of HTML video in pixels.
    height: Height of HTML video in pixels.
    title: Optional text shown centered above the video.
    border: If `bool`, whether to place a black boundary around the image, or if
      `str`, the boundary CSS style.
    loop: If True, the playback repeats forever.
    autoplay: If True, video playback starts without having to click.
  """
  b64 = base64.b64encode(data).decode('utf-8')
  if isinstance(border, str):
    border = f'{border}; '
  elif border:
    border = 'border:1px solid black; '
  else:
    border = ''
  options = (
      f'controls width="{width}" height="{height}"'
      f' style="{border}object-fit:cover;"'
      f'{" loop" if loop else ""}'
      f'{" autoplay muted" if autoplay else ""}'
  )
  s = f"""<video {options}>
      <source src="data:video/mp4;base64,{b64}" type="video/mp4"/>
      This browser does not support the video tag.
      </video>"""
  if title is not None:
    s = f"""<div style="display:flex; align-items:left;">
      <div style="display:flex; flex-direction:column; align-items:center;">
      <div>{title}</div><div>{s}</div></div></div>"""
  return s
Returns an HTML string with a video tag containing H264-encoded data.
Arguments:
- data: MP4-compressed video bytes.
- width: Width of HTML video in pixels.
- height: Height of HTML video in pixels.
- title: Optional text shown centered above the video.
- border: If bool, whether to place a black boundary around the image, or if str, the boundary CSS style.
- loop: If True, the playback repeats forever.
- autoplay: If True, video playback starts without having to click.
def resize_image(image: _ArrayLike, shape: tuple[int, int]) -> _NDArray:
  """Resizes image to specified spatial dimensions using a Lanczos filter.

  Args:
    image: Array-like 2D or 3D object, where dtype is uint or floating-point.
    shape: 2D spatial dimensions (height, width) of output image.

  Returns:
    A resampled image whose spatial dimensions match `shape`.
  """
  image = _as_valid_media_array(image)
  if image.ndim not in (2, 3):
    raise ValueError(f'Image shape {image.shape} is neither 2D nor 3D.')
  _check_2d_shape(shape)

  # A PIL image can be multichannel only if it has 3 or 4 uint8 channels,
  # and it can be resized only if it is uint8 or float32.
  supported_single_channel = (
      np.issubdtype(image.dtype, np.floating) or image.dtype == np.uint8
  ) and image.ndim == 2
  supported_multichannel = (
      image.dtype == np.uint8 and image.ndim == 3 and image.shape[2] in (3, 4)
  )
  if supported_single_channel or supported_multichannel:
    return np.array(
        _pil_image(image).resize(
            shape[::-1], resample=PIL.Image.Resampling.LANCZOS
        ),
        dtype=image.dtype,
    )
  if image.ndim == 2:
    # We convert to floating-point for resizing and convert back.
    return to_type(resize_image(to_float01(image), shape), image.dtype)
  # We resize each image channel individually.
  return np.dstack(
      [resize_image(channel, shape) for channel in np.moveaxis(image, -1, 0)]
  )
Resizes image to specified spatial dimensions using a Lanczos filter.
Arguments:
- image: Array-like 2D or 3D object, where dtype is uint or floating-point.
- shape: 2D spatial dimensions (height, width) of output image.
Returns:
A resampled image whose spatial dimensions match shape.
def resize_video(video: Iterable[_NDArray], shape: tuple[int, int]) -> _NDArray:
  """Resizes `video` to specified spatial dimensions using a Lanczos filter.

  Args:
    video: Iterable of images.
    shape: 2D spatial dimensions (height, width) of output video.

  Returns:
    A resampled video whose spatial dimensions match `shape`.
  """
  _check_2d_shape(shape)
  return np.array([resize_image(image, shape) for image in video])
Resizes video to specified spatial dimensions using a Lanczos filter.
Arguments:
- video: Iterable of images.
- shape: 2D spatial dimensions (height, width) of output video.
Returns:
A resampled video whose spatial dimensions match shape.
def to_rgb(
    array: _ArrayLike,
    *,
    vmin: float | None = None,
    vmax: float | None = None,
    cmap: str | Callable[[_ArrayLike], _NDArray] = 'gray',
) -> _NDArray:
  """Maps scalar values to RGB using value bounds and a color map.

  Args:
    array: Scalar values, with arbitrary shape.
    vmin: Explicit min value for remapping; if None, it is obtained as the
      minimum finite value of `array`.
    vmax: Explicit max value for remapping; if None, it is obtained as the
      maximum finite value of `array`.
    cmap: A `pyplot` color map or callable, to map from 1D value to 3D or 4D
      color.

  Returns:
    A new array in which each element is affinely mapped from [vmin, vmax]
    to [0.0, 1.0] and then color-mapped.
  """
  a = _as_valid_media_array(array)
  del array
  # For future numpy version 1.7.0:
  # vmin = np.amin(a, where=np.isfinite(a)) if vmin is None else vmin
  # vmax = np.amax(a, where=np.isfinite(a)) if vmax is None else vmax
  vmin = np.amin(np.where(np.isfinite(a), a, np.inf)) if vmin is None else vmin
  vmax = np.amax(np.where(np.isfinite(a), a, -np.inf)) if vmax is None else vmax
  a = (a.astype('float') - vmin) / (vmax - vmin + np.finfo(float).eps)
  if isinstance(cmap, str):
    if hasattr(matplotlib, 'colormaps'):
      rgb_from_scalar: Any = matplotlib.colormaps[cmap]  # Newer version.
    else:
      rgb_from_scalar = matplotlib.pyplot.cm.get_cmap(cmap)  # pylint: disable=no-member
  else:
    rgb_from_scalar = cmap
  a = cast(_NDArray, rgb_from_scalar(a))
  # If there is a fully opaque alpha channel, remove it.
  if a.shape[-1] == 4 and np.all(to_float01(a[..., 3])) == 1.0:
    a = a[..., :3]
  return a
Maps scalar values to RGB using value bounds and a color map.
Arguments:
- array: Scalar values, with arbitrary shape.
- vmin: Explicit min value for remapping; if None, it is obtained as the minimum finite value of array.
- vmax: Explicit max value for remapping; if None, it is obtained as the maximum finite value of array.
- cmap: A pyplot color map or callable, to map from 1D value to 3D or 4D color.
Returns:
A new array in which each element is affinely mapped from [vmin, vmax] to [0.0, 1.0] and then color-mapped.
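The affine remapping step can be illustrated on a plain list of floats. This is a pure-Python sketch of the normalization only: `normalize` is a hypothetical helper, the color-map lookup is omitted, and the epsilon guard follows the same idea as the listing (avoiding division by zero when vmax equals vmin).

```python
import math
import sys


def normalize(values, vmin=None, vmax=None):
  """Affinely maps values from [vmin, vmax] to approximately [0.0, 1.0]."""
  finite = [v for v in values if math.isfinite(v)]
  vmin = min(finite) if vmin is None else vmin  # Ignores inf and nan.
  vmax = max(finite) if vmax is None else vmax
  eps = sys.float_info.epsilon  # Guards against vmax == vmin.
  return [(v - vmin) / (vmax - vmin + eps) for v in values]


normalized = normalize([2.0, 4.0, 6.0, math.inf])
constant = normalize([5.0, 5.0])  # Degenerate range: no ZeroDivisionError.
```

Non-finite inputs do not influence the inferred bounds, but they are still passed through the mapping, so an infinite input stays infinite.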
def to_type(array: _ArrayLike, dtype: _DTypeLike) -> _NDArray:
  """Returns media array converted to specified type.

  A "media array" is one in which the dtype is either a floating-point type
  (np.float32 or np.float64) or an unsigned integer type. The array values are
  assumed to lie in the range [0.0, 1.0] for floating-point values, and in the
  full range for unsigned integers, e.g. [0, 255] for np.uint8.

  Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to
  1.0. The input array may also be of type bool, whereby True maps to
  uint(MAX) or 1.0. The values are scaled and clamped as appropriate during
  type conversions.

  Args:
    array: Input array-like object (floating-point, unsigned int, or bool).
    dtype: Desired output type (floating-point or unsigned int).

  Returns:
    Array `a` if it is already of the specified dtype, else a converted array.
  """
  a = np.asarray(array)
  dtype = np.dtype(dtype)
  del array
  if a.dtype != bool:
    _as_valid_media_type(a.dtype)  # Verify that 'a' has a valid dtype.
  if a.dtype == bool:
    result = a.astype(dtype)
    if np.issubdtype(dtype, np.unsignedinteger):
      result = result * dtype.type(np.iinfo(dtype).max)
  elif a.dtype == dtype:
    result = a
  elif np.issubdtype(dtype, np.unsignedinteger):
    if np.issubdtype(a.dtype, np.unsignedinteger):
      src_max: float = np.iinfo(a.dtype).max
    else:
      a = np.clip(a, 0.0, 1.0)
      src_max = 1.0
    dst_max = np.iinfo(dtype).max
    if dst_max <= np.iinfo(np.uint16).max:
      scale = np.array(dst_max / src_max, dtype=np.float32)
      result = (a * scale + 0.5).astype(dtype)
    elif dst_max <= np.iinfo(np.uint32).max:
      result = (a.astype(np.float64) * (dst_max / src_max) + 0.5).astype(dtype)
    else:
      # https://stackoverflow.com/a/66306123/
      a = a.astype(np.float64) * (dst_max / src_max) + 0.5
      dst = np.atleast_1d(a)
      values_too_large = dst >= np.float64(dst_max)
      with np.errstate(invalid='ignore'):
        dst = dst.astype(dtype)
      dst[values_too_large] = dst_max
      result = dst if a.ndim > 0 else dst[0]
  else:
    assert np.issubdtype(dtype, np.floating)
    result = a.astype(dtype)
    if np.issubdtype(a.dtype, np.unsignedinteger):
      result = result / dtype.type(np.iinfo(a.dtype).max)
  return result
Returns media array converted to specified type.
A "media array" is one in which the dtype is either a floating-point type (np.float32 or np.float64) or an unsigned integer type. The array values are assumed to lie in the range [0.0, 1.0] for floating-point values, and in the full range for unsigned integers, e.g. [0, 255] for np.uint8.
Conversion between integers and floats maps uint(0) to 0.0 and uint(MAX) to 1.0. The input array may also be of type bool, whereby True maps to uint(MAX) or 1.0. The values are scaled and clamped as appropriate during type conversions.
Arguments:
- array: Input array-like object (floating-point, unsigned int, or bool).
- dtype: Desired output type (floating-point or unsigned int).
Returns:
Array a if it is already of the specified dtype, else a converted array.
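The scaling and clamping rules are easiest to see on single values. The following scalar sketch covers only the float-to-uint8 and uint8-to-float cases. `float_to_uint8` and `uint8_to_float` are hypothetical names, and Python floats are used here where the listing works on NumPy arrays.

```python
def float_to_uint8(value: float) -> int:
  """Clamps to [0.0, 1.0], scales to [0, 255], and rounds to nearest."""
  clamped = min(max(value, 0.0), 1.0)
  return int(clamped * 255 + 0.5)  # Adding 0.5 before truncation rounds.


def uint8_to_float(value: int) -> float:
  """Maps 0 to 0.0 and 255 to 1.0."""
  return value / 255


converted = [float_to_uint8(v) for v in (-0.3, 0.0, 0.5, 1.0, 1.7)]
```

Out-of-range floats are clamped rather than wrapped, so a value such as 1.7 saturates at the maximum instead of overflowing.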
def to_float01(a: _ArrayLike, dtype: _DTypeLike = np.float32) -> _NDArray:
  """If array has unsigned integers, rescales them to the range [0.0, 1.0].

  Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See
  `to_type`.

  Args:
    a: Input array.
    dtype: Desired floating-point type if rescaling occurs.

  Returns:
    A new array of dtype values in the range [0.0, 1.0] if the input array `a`
    contains unsigned integers; otherwise, array `a` is returned unchanged.
  """
  a = np.asarray(a)
  dtype = np.dtype(dtype)
  if not np.issubdtype(dtype, np.floating):
    raise ValueError(f'Type {dtype} is not floating-point.')
  if np.issubdtype(a.dtype, np.floating):
    return a
  return to_type(a, dtype)
If array has unsigned integers, rescales them to the range [0.0, 1.0].
Scaling is such that uint(0) maps to 0.0 and uint(MAX) maps to 1.0. See to_type.
Arguments:
- a: Input array.
- dtype: Desired floating-point type if rescaling occurs.
Returns:
A new array of dtype values in the range [0.0, 1.0] if the input array a contains unsigned integers; otherwise, array a is returned unchanged.
def to_uint8(a: _ArrayLike) -> _NDArray:
  """Returns array converted to uint8 values; see `to_type`."""
  return to_type(a, np.uint8)
Returns array converted to uint8 values; see to_type.
def set_output_height(num_pixels: int) -> None:
  """Overrides the height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = f'google.colab.output.setIframeHeight("{num_pixels}px")'
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass
Overrides the height of the current output cell, if using Colab.
def set_max_output_height(num_pixels: int) -> None:
  """Sets the maximum height of the current output cell, if using Colab."""
  try:
    # We want to fail gracefully for non-Colab IPython notebooks.
    output = importlib.import_module('google.colab.output')
    s = (
        'google.colab.output.setIframeHeight('
        f'0, true, {{maxHeight: {num_pixels}}})'
    )
    output.eval_js(s)
  except (ModuleNotFoundError, AttributeError):
    pass
Sets the maximum height of the current output cell, if using Colab.
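Both height functions rely on the same pattern: import the Colab module lazily and do nothing if it is missing. The sketch below isolates that pattern. `colab_output_or_none` is a hypothetical helper that returns None outside Colab instead of evaluating any JavaScript.

```python
import importlib


def colab_output_or_none():
  """Returns the google.colab.output module, or None outside Colab."""
  try:
    return importlib.import_module('google.colab.output')
  except (ModuleNotFoundError, AttributeError):
    return None  # Not running in Colab: callers can skip the adjustment.


output_module = colab_output_or_none()
in_colab = output_module is not None
```

Importing by name at call time keeps Colab an optional dependency, so the same notebook code runs unchanged in Jupyter.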
def color_ramp(
    shape: tuple[int, int] = (64, 64), *, dtype: _DTypeLike = np.float32
) -> _NDArray:
  """Returns an image of a red-green color gradient.

  This is useful for quick experimentation and testing. See also
  `moving_circle` to generate a sample video.

  Args:
    shape: 2D spatial dimensions (height, width) of generated image.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = _as_valid_media_type(dtype)
  yx = (np.moveaxis(np.indices(shape), 0, -1) + 0.5) / shape
  image = np.insert(yx, 2, 0.0, axis=-1)
  return to_type(image, dtype)
Returns an image of a red-green color gradient.
This is useful for quick experimentation and testing. See also
moving_circle to generate a sample video.
Arguments:
- shape: 2D spatial dimensions (height, width) of generated image.
- dtype: Type (uint or floating) of resulting pixel values.
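Each pixel of the ramp stores its normalized pixel-center coordinates: red grows with the row, green grows with the column, and blue is zero. A per-pixel sketch of that formula for the floating-point case follows, where `ramp_pixel` is a hypothetical helper name.

```python
def ramp_pixel(y: int, x: int, shape: tuple) -> tuple:
  """Returns the (red, green, blue) value at row y, column x of the ramp."""
  height, width = shape
  # The +0.5 offset samples at the pixel center rather than its corner.
  return ((y + 0.5) / height, (x + 0.5) / width, 0.0)


top_left = ramp_pixel(0, 0, (64, 64))
bottom_right = ramp_pixel(63, 63, (64, 64))
```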
def moving_circle(
    shape: tuple[int, int] = (256, 256),
    num_images: int = 10,
    *,
    dtype: _DTypeLike = np.float32,
) -> _NDArray:
  """Returns a video of a circle moving in front of a color ramp.

  This is useful for quick experimentation and testing. See also `color_ramp`
  to generate a sample image.

  >>> show_video(moving_circle((480, 640), 60), fps=60)

  Args:
    shape: 2D spatial dimensions (height, width) of generated video.
    num_images: Number of video frames.
    dtype: Type (uint or floating) of resulting pixel values.
  """
  _check_2d_shape(shape)
  dtype = np.dtype(dtype)

  def generate_image(image_index: int) -> _NDArray:
    """Returns a video frame image."""
    image = color_ramp(shape, dtype=dtype)
    yx = np.moveaxis(np.indices(shape), 0, -1)
    center = shape[0] * 0.6, shape[1] * (image_index + 0.5) / num_images
    radius_squared = (min(shape) * 0.1) ** 2
    inside = np.sum((yx - center) ** 2, axis=-1) < radius_squared
    white_circle_color = 1.0, 1.0, 1.0
    if np.issubdtype(dtype, np.unsignedinteger):
      white_circle_color = to_type([white_circle_color], dtype)[0]
    image[inside] = white_circle_color
    return image

  return np.array([generate_image(i) for i in range(num_images)])
Returns a video of a circle moving in front of a color ramp.
This is useful for quick experimentation and testing. See also color_ramp
to generate a sample image.
>>> show_video(moving_circle((480, 640), 60), fps=60)
Arguments:
- shape: 2D spatial dimensions (height, width) of generated video.
- num_images: Number of video frames.
- dtype: Type (uint or floating) of resulting pixel values.
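The circle's position in each frame follows directly from the frame index: its row stays at 60% of the height while its column sweeps from left to right. The sketch below reproduces that geometry in plain Python. `circle_geometry` and `is_inside` are hypothetical helper names.

```python
def circle_geometry(shape, image_index, num_images):
  """Returns ((center_y, center_x), radius) of the circle in one frame."""
  height, width = shape
  center = (height * 0.6, width * (image_index + 0.5) / num_images)
  radius = min(shape) * 0.1  # 10% of the smaller image dimension.
  return center, radius


def is_inside(y, x, center, radius):
  """Tests whether the pixel at integer index (y, x) is within the circle."""
  return (y - center[0]) ** 2 + (x - center[1]) ** 2 < radius ** 2


first_center, first_radius = circle_geometry((100, 100), 0, 10)
last_center, _ = circle_geometry((100, 100), 9, 10)
```

With 10 frames of width 100, the center column moves from 5 to 95 in steps of 10, so the circle is partly clipped by the left and right edges in the first and last frames.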
class set_show_save_dir:  # pylint: disable=invalid-name
  """Save all titled output from `show_*()` calls into files.

  If the specified `directory` is not None, all titled images and videos
  displayed by `show_image`, `show_images`, `show_video`, and `show_videos` are
  also saved as files within the directory.

  It can be used either to set the state or as a context manager:

  >>> set_show_save_dir('/tmp')
  >>> show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  >>> show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  >>> set_show_save_dir(None)

  >>> with set_show_save_dir('/tmp'):
  ...   show_image(color_ramp(), title='image1')  # Creates /tmp/image1.png.
  ...   show_video(moving_circle(), title='video2')  # Creates /tmp/video2.mp4.
  """

  def __init__(self, directory: _Path | None):
    self._old_show_save_dir = _config.show_save_dir
    _config.show_save_dir = directory

  def __enter__(self) -> None:
    pass

  def __exit__(self, *_: Any) -> None:
    _config.show_save_dir = self._old_show_save_dir
Save all titled output from show_*() calls into files.
If the specified directory is not None, all titled images and videos
displayed by show_image, show_images, show_video, and show_videos are
also saved as files within the directory.
It can be used either to set the state or as a context manager:
>>> set_show_save_dir('/tmp')
>>> show_image(color_ramp(), title='image1') # Creates /tmp/image1.png.
>>> show_video(moving_circle(), title='video2') # Creates /tmp/video2.mp4.
>>> set_show_save_dir(None)
>>> with set_show_save_dir('/tmp'):
... show_image(color_ramp(), title='image1') # Creates /tmp/image1.png.
... show_video(moving_circle(), title='video2') # Creates /tmp/video2.mp4.
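The dual use works because the setting is applied in the constructor and only restored on exit from a with block. A self-contained sketch of that pattern follows. `ConfigSketch` and `set_save_dir_sketch` are hypothetical stand-ins for the module's private configuration and for set_show_save_dir.

```python
class ConfigSketch:
  """Hypothetical stand-in for the module-level configuration object."""

  show_save_dir = None


config = ConfigSketch()


class set_save_dir_sketch:
  """Sets the directory immediately; restores the old value on exit."""

  def __init__(self, directory):
    self._old = config.show_save_dir
    config.show_save_dir = directory  # Takes effect at construction.

  def __enter__(self):
    pass

  def __exit__(self, *_):
    config.show_save_dir = self._old


set_save_dir_sketch('/tmp/a')  # Plain call: the setting persists.
with set_save_dir_sketch('/tmp/b'):  # Context manager: temporary override.
  inside = config.show_save_dir
after = config.show_save_dir
```

One consequence of this design is that a plain call leaves the directory set until it is changed again, for example with `set_show_save_dir(None)`.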
def set_ffmpeg(name_or_path: _Path) -> None:
  """Specifies the name or path for the `ffmpeg` external program.

  The `ffmpeg` program is required for compressing and decompressing video.
  (It is used in `read_video`, `write_video`, `show_video`, `show_videos`,
  etc.)

  Args:
    name_or_path: Either a filename within a directory of `os.environ['PATH']`
      or a filepath. The default setting is 'ffmpeg'.
  """
  _config.ffmpeg_name_or_path = name_or_path
Specifies the name or path for the ffmpeg external program.
The ffmpeg program is required for compressing and decompressing video.
(It is used in read_video, write_video, show_video, show_videos,
etc.)
Arguments:
- name_or_path: Either a filename within a directory of os.environ['PATH'] or a filepath. The default setting is 'ffmpeg'.
def video_is_available() -> bool:
  """Returns True if the program `ffmpeg` is found.

  See also `set_ffmpeg`.
  """
  return _search_for_ffmpeg_path() is not None
Returns True if the program ffmpeg is found.
See also set_ffmpeg.
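A check of this kind can be approximated with the standard library, which resolves both bare program names on PATH and explicit file paths. The sketch below assumes a lookup based on `shutil.which`. The private `_search_for_ffmpeg_path` is not shown here, so its actual logic may differ, and `program_is_available` is a hypothetical name.

```python
import shutil


def program_is_available(name_or_path: str = 'ffmpeg') -> bool:
  """Returns True if the program resolves to an executable file."""
  # shutil.which accepts a bare name (searched on PATH) or a file path.
  return shutil.which(name_or_path) is not None


have_ffmpeg = program_is_available()
have_missing = program_is_available('definitely-not-a-real-program-xyz')
```

Guarding video calls with such a check lets a notebook fall back to showing still frames when ffmpeg is not installed.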