koladata


kd.testing API

A front-end module for kd.testing.*.

kd.testing.assert_allclose(actual_value: DataSlice, expected_value: DataSlice, *, rtol: float | None = None, atol: float = 0.0)

Koda variant of NumPy's allclose predicate.

See the NumPy documentation for numpy.testing.assert_allclose.

The main difference from NumPy is that assert_allclose works with Koda
DataSlice(s) and checks that the underlying values of actual_value and
expected_value are close.

It also supports sparse array types.

Args:
  actual_value: DataSlice.
  expected_value: DataSlice.
  rtol: Relative tolerance.
  atol: Absolute tolerance.

Raises:
  AssertionError: If the values of actual_value and expected_value are not
    close within the given tolerance, or if their shapes or DataBags are not
    equivalent and that check was requested.
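
The closeness criterion that NumPy documents for numpy.testing.assert_allclose is |actual - expected| <= atol + rtol * |expected|. The sketch below models that criterion on plain Python lists. It is an illustration only, not Koda's implementation: the helper names is_close and all_close are hypothetical, and rtol is passed explicitly because the value used when rtol is None is not stated here.

```python
def is_close(actual, expected, rtol, atol=0.0):
    """Returns True if one pair of numbers satisfies the NumPy-style criterion."""
    # The relative term is scaled by the expected value, so the check is
    # not symmetric in its two arguments.
    return abs(actual - expected) <= atol + rtol * abs(expected)


def all_close(actual, expected, rtol, atol=0.0):
    """Applies is_close element-wise to two flat lists of equal length."""
    if len(actual) != len(expected):
        return False
    return all(is_close(a, e, rtol, atol) for a, e in zip(actual, expected))


# A tiny relative error passes; a 10% error does not.
within_tolerance = all_close([1.0, 2.0], [1.0, 2.0000001], rtol=1e-6)
outside_tolerance = all_close([1.0], [1.1], rtol=1e-6)
```

Because rtol multiplies the expected value, comparisons against an expected value of zero pass only through atol.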

kd.testing.assert_dicts_keys_equal(dicts: DataSlice, expected_keys: DataSlice)

Koda check for Dict keys equality.

Koda Dict keys are stored and returned in arbitrary order. When they are also
not flat, it is difficult to compare them using other assertion primitives.

This assertion verifies that dicts.get_keys() and expected_keys have the same
shapes and schemas, and that their contents hold the same values with the
same counts.

NOTE: This assertion method ignores DataBag(s) associated with the inputs.

Args:
  dicts: DataSlice.
  expected_keys: DataSlice.

Raises:
  AssertionError: If dicts.get_keys() and expected_keys cannot represent the
    keys of the same dict.
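
The order-insensitive comparison described above can be modeled in plain Python. This is a sketch under stated assumptions, not Koda's implementation: a list of Python dicts stands in for a 1-dimensional DataSlice of Dicts, a list of lists stands in for expected_keys, and the helper name keys_match is hypothetical. Schema checks are left out.

```python
from collections import Counter


def keys_match(dicts, expected_keys):
    """Compares each dict's keys with the matching row, ignoring key order."""
    # The outer dimension must line up: one row of expected keys per dict.
    if len(dicts) != len(expected_keys):
        return False
    # Within a row, only the multiset of keys matters, not their order.
    return all(
        Counter(d.keys()) == Counter(row)
        for d, row in zip(dicts, expected_keys)
    )


dicts = [{'a': 1, 'b': 2}, {'c': 3}]
same_keys = keys_match(dicts, [['b', 'a'], ['c']])
missing_key = keys_match(dicts, [['a'], ['c']])
```

Here same_keys is True even though the first row lists the keys in a different order, while missing_key is False because the first row omits 'b'.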

kd.testing.assert_dicts_values_equal(dicts: DataSlice, expected_values: DataSlice)

Koda check for Dict values equality.

Koda Dict values are stored and returned in arbitrary order. When they are
also not flat, it is difficult to compare them using other assertion
primitives.

This assertion verifies that dicts.get_values() and expected_values have the
same shapes and schemas, and that their contents hold the same values with
the same counts.

NOTE: This assertion method ignores DataBag(s) associated with the inputs.

Args:
  dicts: DataSlice.
  expected_values: DataSlice.

Raises:
  AssertionError: If dicts.get_values() and expected_values cannot represent
    the values of the same dict.
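
Unlike keys, values within one dict may repeat, which is why the count of each value matters. A plain-Python sketch of that point, assuming a list of Python dicts as a stand-in for a DataSlice of Dicts and a hypothetical helper named values_match (this is a model of the described behavior, not Koda's implementation):

```python
from collections import Counter


def values_match(dicts, expected_values):
    """Compares each dict's values with the matching row as multisets."""
    if len(dicts) != len(expected_values):
        return False
    # Counter keeps duplicates, so [1, 1] and [1] are different multisets.
    return all(
        Counter(d.values()) == Counter(row)
        for d, row in zip(dicts, expected_values)
    )


dicts = [{'a': 1, 'b': 1}]
same_counts = values_match(dicts, [[1, 1]])
different_counts = values_match(dicts, [[1]])
```

The second comparison fails because the value 1 appears twice in the dict but only once in the expected row.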

kd.testing.assert_equal(actual_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, expected_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, *, msg: str | None = None) -> None

Koda equality check.

Compares the arguments by their fingerprints:
* 2 DataSlice(s) are equal if their contents and JaggedShape(s) are
  equal / equivalent and they reference the same DataBag instance.
* 2 DataBag(s) are equal if they are the same DataBag instance.
* 2 JaggedShape(s) are equal if they have the same number of dimensions and
  all "sizes" in each dimension are equal.

NOTE: For JaggedShape, equality and equivalence are the same thing.

Args:
  actual_value: DataSlice, DataBag or JaggedShape.
  expected_value: DataSlice, DataBag or JaggedShape.
  msg: A custom error message.

Raises:
  AssertionError: If actual_value and expected_value are not equal.
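
The DataBag condition in the list above is an identity check, not a content check. A toy model in plain Python makes the distinction concrete. Everything here is an assumption for illustration: ToySlice and toy_equal are hypothetical names, and a Python dict stands in for a DataBag. None of it is Koda API.

```python
from dataclasses import dataclass


@dataclass
class ToySlice:
    items: list  # stands in for the DataSlice contents
    bag: dict    # stands in for the attached DataBag


def toy_equal(a, b):
    """Equal: same contents AND the very same bag object."""
    return a.items == b.items and a.bag is b.bag


shared_bag = {'x': 1}
copied_bag = dict(shared_bag)  # same contents, different object

first = ToySlice([1, 2], shared_bag)
second = ToySlice([1, 2], shared_bag)
third = ToySlice([1, 2], copied_bag)

same_instance = toy_equal(first, second)
copied_instance = toy_equal(first, third)
```

In this model same_instance is True and copied_instance is False, even though the two bags hold identical contents.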

kd.testing.assert_equivalent(actual_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, expected_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, *, partial: bool | None = None, ids_equality: bool | None = None, schemas_equality: bool | None = None, msg: str | None = None)

Koda equivalency check.

* 2 DataSlice(s) are equivalent if their contents and JaggedShape(s) are
  equivalent and their DataBag(s) have the same contents (including the
  distribution of data in fallback DataBag(s)).
* 2 DataBag(s) are equivalent if their contents are the same (including the
  distribution of data in fallback DataBag(s)).
* 2 JaggedShape(s) are equivalent if they are equal, i.e. if sizes / edges
  across all their dimensions are the same.

Args:
  actual_value: DataSlice, DataBag or JaggedShape.
  expected_value: DataSlice, DataBag or JaggedShape.
  partial: (default: False) Whether to check only the attributes present in
    the expected_value (affects only DataSlice case).
  ids_equality: (default: False) Whether to check ids equality (affects only
    DataSlice case).
  schemas_equality: (default: True) Whether to check schema ids equality
    (affects only DataSlice case).
  msg: A custom error message.

Raises:
  AssertionError: If actual_value and expected_value are not equivalent.
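
The effect of the partial flag can be sketched with plain Python dicts standing in for the attributes of two items. This is a simplified model of the behavior described in Args, not Koda's implementation; attrs_equivalent is a hypothetical helper, and ids and schemas are ignored.

```python
def attrs_equivalent(actual, expected, partial=False):
    """Compares attribute dicts, optionally checking only expected's keys."""
    if partial:
        # Only attributes present in `expected` are checked; extra
        # attributes on `actual` are ignored.
        return all(
            name in actual and actual[name] == value
            for name, value in expected.items()
        )
    # Without `partial`, both sides must carry exactly the same attributes.
    return actual == expected


actual = {'a': 1, 'b': 2}
full_check = attrs_equivalent(actual, {'a': 1})
partial_check = attrs_equivalent(actual, {'a': 1}, partial=True)
```

Here full_check is False because actual carries an extra attribute 'b', while partial_check is True because only 'a' is inspected.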

kd.testing.assert_non_deterministic_exprs_equal(actual_expr: Expr, expected_expr: Expr)

Koda check for Expr equality that accounts for non-deterministic Expr(s).

Args:
  actual_expr: Expr.
  expected_expr: Expr.

Raises:
  AssertionError: If actual_expr and expected_expr do not represent equal Koda
    expressions modulo their non-deterministic properties.

kd.testing.assert_not_equal(actual_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, expected_value: DataBag | DataSlice | JaggedShape | QValue | Slice | Expr | None, *, msg: str | None = None) -> None

Koda inequality check.

Compares the arguments by their fingerprints:
* 2 DataSlice(s) are equal if their contents and JaggedShape(s) are
  equal / equivalent and they reference the same DataBag instance.
* 2 DataBag(s) are equal if they are the same DataBag instance.
* 2 JaggedShape(s) are equal if they have the same number of dimensions and
  all "sizes" in each dimension are equal.

NOTE: For JaggedShape, equality and equivalence are the same thing.

Args:
  actual_value: DataSlice, DataBag or JaggedShape.
  expected_value: DataSlice, DataBag or JaggedShape.
  msg: A custom error message.

Raises:
  AssertionError: If actual_value and expected_value are equal.

kd.testing.assert_traced_exprs_equal(actual_expr: Expr, expected_expr: Expr)

Asserts that exprs are equal, skipping annotations added during tracing.

kd.testing.assert_traced_non_deterministic_exprs_equal(actual_expr: Expr, expected_expr: Expr)

Asserts that exprs are equal, skipping non-determinism and annotations added during tracing.

kd.testing.assert_unordered_equal(actual_value: DataSlice, expected_value: DataSlice)

Checks DataSlices are equal ignoring the ordering in the last dimension.

This assertion verifies that actual_value and expected_value have the same
shapes, schemas and DataBags, and that their items in the last dimension are
equal ignoring the order.

Args:
  actual_value: DataSlice.
  expected_value: DataSlice.

Raises:
  AssertionError: If DataSlices are not equal.
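
A plain-Python model of "equal ignoring the ordering in the last dimension" uses nested lists in place of a DataSlice. It is a sketch, not Koda's implementation: unordered_equal is a hypothetical helper, schemas and DataBags are not modeled, and treating the last dimension as a multiset (duplicates counted) is an assumption.

```python
from collections import Counter


def unordered_equal(actual, expected):
    """Compares nested lists, ignoring order only in the innermost lists."""
    if len(actual) != len(expected):
        return False
    # Innermost level: no element is itself a list, so compare as multisets.
    if all(not isinstance(x, list) for x in actual + expected):
        return Counter(actual) == Counter(expected)
    # Outer levels: order matters, so recurse position by position.
    return all(
        isinstance(a, list) and isinstance(e, list) and unordered_equal(a, e)
        for a, e in zip(actual, expected)
    )


inner_reordered = unordered_equal([[1, 2], [3]], [[2, 1], [3]])
outer_reordered = unordered_equal([[1, 2], [3]], [[3], [1, 2]])
```

In this model inner_reordered is True, while outer_reordered is False because only the last dimension is order-insensitive.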