Optimization Passes List
This is an automatically generated list of all optimization passes available
with the default opt_main. This is generated automatically based on comments
in the header files.
If the opt level is set below 'Min opt level' the pass will act as a no-op.
If the opt level is set above 'Cap opt level' the pass (or passes within the compound pass) will be executed with the opt level capped to the specified value.
Warning: Many of these passes have descriptions generated by the Gemini LLM and may not accurately reflect the behavior of the passes. Over time, manual verification and editing of the pass descriptions may improve their accuracy.
default_pipeline - The default pipeline.
Invoked Passes
arith_simp - Arithmetic Simplifications
This pass performs various arithmetic optimizations such as replacement of divide by a constant with non-divide operations.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ArithSimplificationPass is an optimization pass in XLS that applies a
wide range of arithmetic transformations to simplify expressions, reduce
hardware complexity, and improve overall efficiency. It utilizes a
StatelessQueryEngine to identify constant values and bit patterns, which
enables many of its powerful transformations. This pass operates in an
iterative fixed-point manner, meaning it repeatedly applies simplifications
until no further changes can be made, ensuring a thorough optimization.
Key optimizations performed by this pass include:
- Division and Modulo by Constants:
  * Division by Power of Two: `udiv(x, 2^K)` or `sdiv(x, 2^K)` is replaced with a logical or arithmetic right shift by `K` bits, respectively. This is a fundamental hardware optimization.
  * Modulo by Power of Two: `umod(x, 2^K)` is replaced with a bit slice extracting the `K` least significant bits. `smod(x, 2^K)` is handled with a `select` operation based on the sign of `x` to preserve correct signed modulo semantics.
  * Division/Modulo by Other Constants (Unsigned): `udiv(x, K)`, where `K` is a constant that is not a power of two, is transformed using "magic multiplication" algorithms, which replace division with a combination of shifts and multiplies. This is often significantly more efficient in hardware than a dedicated division unit.
  * Division/Modulo by Other Constants (Signed): `sdiv(x, K)`, where `K` is a constant, also employs magic multiplication, with additional logic to correctly handle signed semantics and rounding towards zero.
  * Constant Dividend Optimization: If the dividend is a small constant (e.g., `udiv(13, x)` or `sdiv(13, x)`), the operation is replaced by a lookup table (a `PrioritySelect` over a `Literal` array). This is particularly effective for small constant dividends, where a full division circuit would be overkill.
- Shift Operations:
  * Logical Shift by Constant: `shll(x, K)` or `shrl(x, K)`, where `K` is a constant, is replaced by a combination of `bit_slice` and `concat` operations. This avoids generating a potentially larger barrel shifter for fixed-amount shifts.
    ```
    // Original:  shrl(x, literal(2, bits=N_shift_amount_width))  // Example for shift by 2
    // Optimized: concat(literal(0, bits=2), bit_slice(x, start=2, width=N-2))
    ```
  * Arithmetic Shift Right by Constant: `shra(x, K)` is replaced by a `sign_extend` of a `bit_slice` of `x`, effectively performing an arithmetic right shift without a barrel shifter.
  * Removal of Zero-Extended Shift Amounts: If a shift amount is the result of a `concat` with leading zeros (a common pattern for zero-extension), the leading zeros are removed from the shift amount, simplifying the shift input.
  * Removal of Shift Guards: If a shift amount is clamped by a `select` operation (e.g., `select(UGt(amt, LIMIT), {LIMIT, amt})`) where `LIMIT` is greater than or equal to the bit width of the value being shifted, the clamping logic is removed as redundant: any shift amount at or beyond the bit width produces the same result.
- Comparisons with Negated Operands:
  * `eq(-lhs, -rhs)` => `eq(lhs, rhs)` (and similarly for `ne`).
  * Signed comparisons like `slt(-lhs, -rhs)`: These are transformed into an `xor` of the reversed comparison (`sgt(lhs, rhs)`) and additional terms that correctly handle `MIN_VALUE` edge cases (where `neg(MIN_VALUE) == MIN_VALUE`).
  * Signed comparison of a negation to a literal, `eq(-expr, K)`: These are transformed into `eq(expr, -K)`, with similar `xor` logic for inequalities to correctly handle `MIN_VALUE`. These transformations are only applied if the `negate` operation has no other users, to avoid introducing logic replication.
- Multiply Operations:
  * Multiply by Power of Two: `umul(x, 2^K)` or `smul(x, 2^K)` is replaced with a left shift by `K` bits.
  * Multiply by Small Constant (Sum-of-Shifts): `mul(x, K)`, where `K` is a small constant with a few set bits, can be replaced by a sum of shifted `x` values (e.g., `x * 5` => `x + (x << 2)`). This is applied if the number of adders required is below `kAdderLimit`. A complementary optimization uses subtraction from a power-of-two shift for constants like 7 (e.g., `x * 7` => `(x << 3) - x`).
  * Zero-Width Multiply Operands: Multiplies where one or both operands have a zero bit width are replaced by a literal zero, as the result is always zero.
  * Multiply Used by Narrowing Slice: If a multiply's result is only used by `bit_slice` operations that extract a narrower result, the multiply itself can be performed at the required output width, reducing the size of the multiplier hardware.
- Decode Operations:
  * 1-bit Wide Decode: A `decode` operation with a 1-bit output is simplified to an `eq(index, 0)` comparison, as only index 0 can produce a true output.
  * 1-bit Operand Decode: A `decode` operation with a 1-bit input and a 2-bit output is simplified to `concat(operand, not(operand))`, which is its direct one-hot representation.
  * Removal of Zero-Extended Index: Similar to shifts, if a decode index is formed by a `concat` with leading zeros (zero-extension), the leading zeros are removed from the index.
- `decode(N) - 1` Pattern: The common pattern for creating a mask with the `N` least-significant bits set (e.g., `sub(decode(N), 1)`) is rewritten to `not(shll(all_ones, N))`, eliminating an adder and often resulting in more efficient hardware.
- Comparison of Injective Operations: The pass identifies patterns where the result of an injective arithmetic operation (such as `add` or `sub`) is compared against a constant (e.g., `(X + C0) == C1`). These are simplified by performing the inverse arithmetic operation on the constants to isolate `X` (e.g., `X == C1 - C0`).
The ArithSimplificationPass operates to a fixed point, meaning it
repeatedly applies these simplifications until no further changes are
possible. This iterative approach ensures that all possible arithmetic
optimizations are applied, resulting in a more efficient and optimized IR for
hardware synthesis.
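The shift, mask, and sum-of-shifts rewrites above are ordinary integer identities, which makes them easy to sanity-check. Below is a hedged Python sketch (a value-level model, not XLS IR and not the pass's implementation) that exhaustively verifies a few of them on unsigned 8-bit values:

```python
# Python model of several arith_simp rewrites on unsigned N-bit values.
# Illustrative only: the pass rewrites IR nodes; this checks the underlying math.
N = 8
MASK = (1 << N) - 1  # all-ones value for N bits

for x in range(1 << N):
    # udiv(x, 2^K) => shrl(x, K)
    assert x // 8 == x >> 3
    # umod(x, 2^K) => bit_slice of the K least significant bits
    assert x % 8 == x & 0b111
    # mul(x, 5) => x + (x << 2)   (sum-of-shifts)
    assert (x * 5) & MASK == (x + (x << 2)) & MASK
    # mul(x, 7) => (x << 3) - x   (subtraction from a power-of-two shift)
    assert (x * 7) & MASK == ((x << 3) - x) & MASK

# decode(n) - 1 pattern: a mask with the n least-significant bits set
# equals not(shll(all_ones, n)).
for n in range(N):
    assert (1 << n) - 1 == ~(MASK << n) & MASK
```

Masking with `MASK` models the fixed-width wraparound of hardware arithmetic; without it, Python's unbounded integers would hide overflow behavior.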
array_simp - Array Simplification
Pass which simplifies or eliminates some array-type operations such as ArrayIndex.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ArraySimplificationPass is a comprehensive optimization pass designed
to simplify and, where possible, eliminate various array-related operations
within the XLS IR. The primary objective is to reduce the complexity of array
manipulation, leading to more efficient and compact hardware implementations.
This pass employs a combination of pattern matching and range analysis to
identify and transform redundant or inefficient array constructs.
The pass operates in an iterative, fixed-point manner, continually applying
transformations until no further simplifications can be made. It focuses on
Array, ArrayIndex, ArrayUpdate, ArraySlice, and Select (or
PrioritySelect) operations when they involve array types.
Key simplifications performed by this pass include:
- Clamping Out-of-Bounds `ArrayIndex` Indices: If range analysis (via the `QueryEngine`) can definitively determine that an `ArrayIndex` operation accesses an out-of-bounds location, the index is clamped to the maximum in-bounds index (i.e., the last element of the array). This ensures correct behavior in the generated hardware and can simplify subsequent logic by removing checks for illegal accesses.
- Simplifying `ArrayIndex` Operations:
  * Empty Indices: An `array_index(A, {})` operation (where the index list is empty) is replaced directly by the array `A` itself, as it logically references the entire array.
  * `ArrayIndex` of `ArrayUpdate` with Matching Indices: If an `array_index` accesses the same index that was previously updated by an `array_update` (e.g., `array_index(array_update(A, V, {idx}), {idx})`), and the index is known to be in bounds, it is replaced directly with the updated value `V`. This bypasses the array storage entirely.
  * Trivial Dimensions: If an `ArrayIndex` accesses an array dimension of size 1 (e.g., `bits[32][1]`), the index for that dimension is replaced with a literal `0`, as there is only one possible element to access. This simplifies the indexing logic.
  * `ArrayIndex` of `Array` with Constant First Index: An `array_index(array(E0, E1, ...), {C, ...})` operation, where `C` is a constant index, is transformed to directly index into the corresponding element `Ec`. This bypasses the array construction, reducing IR nodes.
  * `ArrayIndex` of `ArrayConcat` with Constant First Index: Similarly, if an `ArrayIndex` accesses an `array_concat` with a constant first index, it bypasses the `array_concat` and directly indexes into the appropriate sub-array within the concatenated structure.
  * Consecutive `ArrayIndex` Operations: A sequence of nested `array_index` operations (e.g., `array_index(array_index(A, {idx0}), {idx1})`) is flattened into a single `array_index(A, {idx0, idx1})`, combining the indices for greater efficiency.
  * `ArrayIndex` of `Select` (or `PrioritySelect`): An operation of the form `array_index(select(P, cases=[A0, A1]), {idx})` is transformed into `select(P, cases=[array_index(A0, {idx}), array_index(A1, {idx})])`. This pushes the `select` inside the array access, potentially reducing the overall multiplexer (mux) width if the array elements are smaller than the entire array. This is particularly beneficial for "small" arrays, or when the outer `array_index` is the sole user of the `select`.
- Simplifying `ArrayUpdate` Operations:
  * Empty Indices: An `array_update(A, V, {})` operation (with no indices) is replaced directly by the update value `V`, as it logically replaces the entire array.
  * Redundant `ArrayUpdate`: If an `array_update` updates an element with its current value (e.g., `array_update(A, array_index(A, {idx}), {idx})`) and the index is known to be in bounds, it is replaced with the original array `A`, as no actual change occurs.
  * `ArrayUpdate` on Unit-Dimension Arrays: An `array_update(A[1], V, {idx})` (updating a 1-element array, possibly within a multi-dimensional array) is replaced with a `select` operation. The `select` conditionally chooses between an array-packed version of the update value `V` and the original array `A`, based on whether the index `idx` matches the single valid index. This allows explicit conditional updates for these small arrays.
  * Hoisting `ArrayUpdate` above `Array`: An `array_update` applied to an `array` literal (e.g., `array_update(array(A, B, C), V, {idx})`) can be transformed into a new `array` in which the element at `idx` is replaced by `V`, potentially after applying sub-updates if the array is multi-dimensional. This is particularly beneficial for constant arrays.
  * Flattening Sequential `ArrayUpdate` Chains: The pass identifies and flattens sequences of `array_update` operations that modify the same array. If a later update necessarily overwrites elements written by an earlier `array_update`, the earlier, redundant update can be elided.
  * `ArrayUpdate` of `ArrayIndex`: An `array_update` of the form `array_update(A, array_update(array_index(A, {i}), V, {j}), {i})` can be simplified to `array_update(A, V, {i, j})`, effectively merging nested update operations.
- Simplifying `Array` Operations:
  * Decomposition and Recomposition Elimination: The pass can detect a pattern where an array is first decomposed into its individual elements (using `array_index` operations) and then immediately recomposed into an array (using an `array` operation). If the indices are sequential and cover the entire range, this redundant decomposition-recomposition pair is eliminated, replacing the `array` operation with the original source array or a simple `array_index` on the source.
- Simplifying `ArraySlice` Operations:
  * Literal Start Index: An `array_slice` operation with a literal start index is replaced by an `array` constructed from the directly indexed elements of the source array. This eliminates dynamic slicing operations in hardware.
- Simplifying `Select` of Array-Typed Values:
  * Conditional Array Assignment: A `select` of the form `select(P, cases=[A, array_update(A, V, {idx})])` is transformed into `array_update(A, select(P, cases=[array_index(A, {idx}), V]), {idx})`. This hoists the `select` inside the `array_update`, reducing the multiplexer width to operate only on the single array element being conditionally updated.
  * `Select` of Arrays to Array of `Select`s: A `select` operation whose cases (and optional default value) are all array-typed (e.g., `select(P, cases=[array_0, array_1])`) can be converted to an `array` of `select` operations (e.g., `array(select(P, {array_0[0], array_1[0]}), select(P, {array_0[1], array_1[1]}), ...)`). This can expose smaller `select` operations for further optimization but can also increase logic replication, so it is applied judiciously based on heuristics.
This pass is essential for translating high-level array abstractions into efficient hardware structures by aggressively simplifying and optimizing array access and manipulation patterns, contributing significantly to improved hardware QoR.
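Several of the `ArrayIndex`/`ArrayUpdate` rewrites above are pure dataflow identities that hold for any functional array. A hedged Python sketch, modeling XLS arrays as Python lists (the helper name `array_update` is illustrative, not the pass's API):

```python
# Python model of array_simp identities; names are illustrative only.

def array_update(a, v, idx):
    """Functional update: returns a copy of list a with a[idx] = v."""
    out = list(a)
    out[idx] = v
    return out

A = [10, 20, 30, 40]

# array_index(array_update(A, V, {idx}), {idx}) => V   (index known in bounds)
assert array_update(A, 99, 2)[2] == 99

# Redundant update: array_update(A, array_index(A, {idx}), {idx}) => A
assert array_update(A, A[1], 1) == A

# Consecutive ArrayIndex flattening on a 2-D array:
# array_index(array_index(B, {idx0}), {idx1}) == array_index(B, {idx0, idx1})
B = [[1, 2], [3, 4]]
assert B[1][0] == 3
```

The functional-update helper mirrors the IR semantics: `array_update` produces a new array value rather than mutating storage, which is what makes these local rewrites sound.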
array_untuple - Array UnTuple
Pass which changes any (non-external) array-of-tuples into a tuple-of-arrays. We can see through tuples quite well but can't see through arrays to anywhere near the same extent, so the struct-of-arrays representation is always superior.
Note that this pass makes no attempt to unpack or repack arrays which escape the function base. This means that anything which comes in through a function param or a proc's recv, or escapes through a function return or a proc's send, is not untupled.
TODO(allight): We could do this at the cost of a significant number of IR nodes. We should experiment to see if this is worth doing.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ArrayUntuplePass is an optimization pass in XLS that transforms arrays
of tuples into tuples of arrays. This conversion, often referred to as the
"struct-of-arrays" (SoA) transformation, is generally superior for hardware
implementation. The XLS IR and its optimization passes can "see through"
tuples much more effectively than through arrays. By converting an array of
structs into a struct of arrays, this pass exposes significantly more
opportunities for subsequent optimizations (such as bit-slicing, dead code
elimination, and common subexpression elimination), ultimately leading to
more efficient and compact hardware designs.
Core Principle:
The pass aims to convert a data structure of the form Array<Tuple<A, B, C>>
into Tuple<Array<A>, Array<B>, Array<C>>.
How it Works:
- FindUntupleGroups (Union-Find for Related Nodes): This function uses a `UnionFind` data structure to group nodes that represent parts of the same array-of-tuples structure and should be transformed together. It essentially identifies equivalence classes of nodes that are interconnected within the array-of-tuples dataflow. For instance, if an `array_update` modifies an array-of-tuples, both the original array and the `array_update` result are considered part of the same equivalence group. Operations considered for grouping include: `ArrayUpdate`, `ArraySlice`, `ArrayConcat`, `Eq`, `Ne`, `Sel` (and its variants `PrioritySel`, `OneHotSel`), `Gate`, and `Next`.
- FindExternalGroups (Identifying External Visibility): This is a crucial preliminary step that identifies "external" uses of arrays-of-tuples. If an array-of-tuples either originates from outside the current function/proc (e.g., as a function parameter or a proc's `receive` operand) or escapes its boundary (e.g., as a function return value or a proc's `send` operand), it cannot be untupled, because doing so would alter the external interface of the component. Such groups of nodes are marked as excluded from transformation. Additional exclusions include:
  * Arrays of empty tuples (to avoid potential infinite loops in fuzzer-generated code).
  * Arrays of arrays (this pass currently does not recursively unwrap them).
  * The `update_value` operand of an `array_update` (as it represents an external input to the update operation).
  * Operands of unhandled operations that are arrays of tuples.
- UntupleVisitor (Applying Transformations): This `DfsVisitorWithDefault` traverses the function or proc nodes in topological order and applies the necessary transformations when a node represents an array-of-tuples and its equivalence group is not excluded. The visitor handles various IR operations:
  * `Literal`: Converts an array-of-tuples literal into a tuple of array literals.
  * `StateRead`: For procs, untuples a `StateRead` operation by creating new `StateElement`s for each tuple element, with each new element holding an array corresponding to that specific tuple field.
  * `Next`: Transforms a `next` operation on an array-of-tuples state element into a series of `next` operations, one for each tuple-of-arrays state element.
  * `ArrayIndex`: Converts an `array_index` operation on an array-of-tuples into a `tuple` of `array_index` operations, each accessing the corresponding element from the untupled array structure.
  * `Array`: Transforms an `array` construction that creates an array-of-tuples into a `tuple` of `array` constructions, effectively building the tuple-of-arrays representation.
  * `ArrayUpdate`, `ArraySlice`, `ArrayConcat`: Similar transformations are applied to these operations, distributing them across the individual tuple elements to operate on the corresponding arrays.
  * `Gate`, `Select` (and its variants `PrioritySel`, `OneHotSel`): These operations are distributed across the elements of the tuple, creating new operations for each element of the untupled structure.
  * `Eq`, `Ne` (Comparisons): Comparisons on arrays of tuples are distributed across the tuple elements. For `eq`, the individual comparison results are `and`-reduced; for `ne`, they are `or`-reduced.
Benefits:
- Improved Optimization Opportunities: By converting arrays of tuples to tuples of arrays, the pass exposes the individual array components. This allows subsequent passes (e.g., `BitSliceSimplificationPass`, `DcePass`, `CsePass`) to operate more effectively, as they are generally more adept at optimizing bit-level operations and tuple structures.
- Reduced Hardware Complexity: Can lead to a more modular and efficient hardware implementation by breaking down complex array-of-tuple structures into simpler, parallel arrays.
- Enhanced IR Clarity: The transformed IR can be easier to understand and reason about, as logical components are grouped more coherently and less opaque data structures are used.
Example (tuple_index(array_index(array_of_tuples, idx), element_idx)):
Consider an array of tuples where we first index into the array, and then
extract an element from the resulting tuple:
```
// Original IR snippet
fn my_func(my_array: (bits[32], bits[16])[4], idx: bits[2]) -> bits[16] {
  array_element: (bits[32], bits[16]) = array_index(my_array, indices=[idx])
  ret tuple_element: bits[16] = tuple_index(array_element, index=1)
}
```
After ArrayUntuplePass, my_array would conceptually be transformed into
separate arrays for each tuple element, e.g., (my_array_elem0: bits[32][4],
my_array_elem1: bits[16][4]). The access pattern would then become
simplified:
```
// Optimized IR (conceptually, after ArrayUntuplePass and DCE)
fn my_func(my_array_elem0: bits[32][4],
           my_array_elem1: bits[16][4],
           idx: bits[2]) -> bits[16] {
  ret array_index_elem1: bits[16] = array_index(my_array_elem1,
                                                indices=[idx])
}
```
The pass effectively streamlines data access by creating direct arrays for each tuple element, allowing for more direct and optimized indexing into the relevant array, eliminating the intermediate tuple operations.
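The AoS-to-SoA layout change in the example above can be illustrated with plain Python data. This is a sketch of the data-layout transformation only; the actual pass rewrites IR nodes, not runtime values:

```python
# Array-of-tuples (AoS) vs. tuple-of-arrays (SoA), as plain Python data.
# Conceptually: (bits[32], bits[16])[4] becomes (bits[32][4], bits[16][4]).
aos = [(1, 10), (2, 20), (3, 30), (4, 40)]

# The untupled form: one array per tuple element.
soa = tuple(list(col) for col in zip(*aos))
# soa == ([1, 2, 3, 4], [10, 20, 30, 40])

idx = 2
# tuple_index(array_index(aos, idx), 1) becomes array_index(soa[1], idx):
assert aos[idx][1] == soa[1][idx]
```

Note how the SoA form lets an access to element 1 touch only the `bits[16]` array, which is exactly the narrowing the optimized IR above exploits.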
basic_simp - Basic Simplifications
This pass does simple pattern-matching optimizations which are ~always a good idea to do (replacing a node with a constant, removing operands of nodes, etc). They improve QoR, do not increase the number of nodes in the graph, preserve the same abstraction level, and do not impede later optimizations via obfuscation. These optimizations require no analyses beyond looking at the node and its operands. Examples include: not(not(x)) => x, x + 0 => x, etc.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BasicSimplificationPass applies a collection of straightforward,
pattern-matching optimizations to the XLS IR. These optimizations are
designed to improve the Quality of Results (QoR) without significantly
increasing the number of nodes in the graph, preserving the current
abstraction level, and avoiding obfuscation that might hinder subsequent
optimization passes. The pass primarily relies on local analysis of a node
and its immediate operands.
The pass iteratively traverses the nodes in the function (or proc) and attempts to apply various simplification patterns until a fixed point is reached (no further changes can be made).
Key simplifications performed by this pass include:
- Identity Operations:
  * `add(x, 0)` => `x` (and similar for `sub`, `shll`, `shrl`, `shra` with 0)
  * `and(x, x)` => `x` (and `or(x, x)` => `x`)
  * `nand(x, x, x)` => `not(x)`
  * `xor(x, x)` => `0` (for an even number of identical `x` operands)
  * `xor(x, x, x)` => `x` (for an odd number of identical `x` operands)
  * `not(not(x))` => `x`
  * `neg(neg(x))` => `x`
  * `and_reduce(x)` => `x` (if `x` is 1-bit)
  * `or_reduce(x)` => `x` (if `x` is 1-bit)
  * `xor_reduce(x)` => `x` (if `x` is 1-bit)
  * `priority_sel(bits[0], cases=[], default=X)` => `X` (a priority select with a zero-width selector always selects its default value).
- Constant Propagation and Folding:
  * `umul(x, 0)` or `smul(x, 0)` => `0`
  * `or(x, -1)` => `-1` (all-ones literal)
  * `nor(x, -1)` => `0`
  * `and(x, 0)` => `0`
  * `nand(x, 0)` => `-1`
  * `xor(x, -1)` => `not(x)`
  * `and_reduce([])` => `1` (identity for AND reduction on an empty set)
  * `or_reduce([])` => `0` (identity for OR reduction on an empty set)
  * `xor_reduce([])` => `0` (identity for XOR reduction on an empty set)
  * Consolidation of multiple literal operands in associative/commutative N-ary operations (e.g., `and(x, literal_A, y, literal_B)` => `and(x, y, literal_A_and_B)`).
- Boolean Algebra Simplifications:
  * `and(x, not(x))` => `0`
  * `or(x, not(x))` => `-1`
  * `add(x, not(x))` => `-1` (for `bits` types, effectively `2^N - 1`)
  * `eq(x: bits[1], 1)` => `x`
  * `eq(x: bits[1], 0)` => `not(x)`
- Redundant Operand Removal:
  * `or(x, 0, y)` => `or(x, y)`
  * `and(x, -1, y)` => `and(x, y)`
  * Removal of duplicate operands in N-ary `and`, `or`, `nand`, `nor`, and `xor` operations (e.g., `and(x, y, y, x)` => `and(x, y)`).
- Flattening Nested Associative N-ary Operations: If an outer N-ary operation has an inner N-ary operation of the same opcode as one of its operands, and the inner operation has only one user (the outer operation), the inner operation's operands are flattened directly into the outer operation.
  Example:
  ```
  or.5: bits[8] = or(w, x)
  or.6: bits[8] = or(y, z)
  ret result: bits[8] = or(or.5, or.6)
  ```
  becomes:
  ```
  ret result: bits[8] = or(w, x, y, z)
  ```
- Comparison Simplifications:
  * `eq(x, x)` => `1`
  * `ne(x, x)` => `0`
  * `eq(x, y)` => `1` (if `x` and `y` are zero-width values)
  * `ne(x, y)` => `0` (if `x` and `y` are zero-width values)
This pass is designed to be efficient and broadly applicable, serving as a foundational optimization before more complex analyses and transformations are applied.
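Because these identities are local and bit-width-bounded, they can be checked exhaustively on small widths. A hedged Python sketch over 8-bit values (fixed-width semantics modeled by masking; the pass itself matches IR patterns rather than evaluating values):

```python
# Exhaustive check of a few basic_simp identities on 8-bit values.
N = 8
MASK = (1 << N) - 1  # the all-ones value, i.e. the "-1" literal for bits[8]

for x in range(1 << N):
    assert (x + 0) & MASK == x                # add(x, 0) => x
    assert x & x == x                         # and(x, x) => x
    assert x ^ x == 0                         # xor(x, x) => 0
    assert ~(~x) & MASK == x                  # not(not(x)) => x
    assert x & (~x & MASK) == 0               # and(x, not(x)) => 0
    assert (x | (~x & MASK)) == MASK          # or(x, not(x)) => all ones
    assert (x + (~x & MASK)) & MASK == MASK   # add(x, not(x)) => 2^N - 1
```

The last assertion is why the folded literal is the all-ones value: `x + ~x` sets every bit regardless of `x`.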
bdd_cse - BDD-based Common Subexpression Elimination
Pass which commons equivalent expressions in the graph using binary decision diagrams.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BddCsePass (BDD-based Common Subexpression Elimination) is an
optimization pass in XLS that identifies and merges logically equivalent
expressions within a function or proc by leveraging Binary Decision Diagrams
(BDDs). Unlike the purely syntactic CsePass, BddCsePass can detect
equivalences even when the expressions have different syntactic forms, as it
relies on the canonical representation of Boolean functions provided by BDDs.
This capability leads to more aggressive common subexpression elimination and
significantly reduces hardware redundancy.
How it Works:
- BDD Representation: The core of this pass is the `BddQueryEngine`. For each bit-typed node in the function, the `BddQueryEngine` constructs a Binary Decision Diagram. A BDD is a directed acyclic graph that represents a Boolean function. Crucially, for any given Boolean function, its BDD is canonical (unique) if a fixed variable ordering is used. This property allows the pass to determine whether two different expressions compute the same Boolean function simply by comparing their BDDs.
- Node Ordering for Determinism and Performance: The pass first determines a specific order in which to process the nodes (`GetNodeOrder`). This order is a topological sort (necessary to avoid introducing cycles) and prioritizes nodes on less critical paths. This prioritization helps ensure that CSE replacements do not inadvertently increase the overall critical path delay of the design.
- Equivalence Bucketing: The pass iterates through the nodes in the determined order. For each bit-typed node, it computes a hash value based on the BDD node indices of all its constituent bits. Nodes with the same hash are grouped together into "node buckets," as they are potentially logically equivalent.
- Logical Equivalence Check: Within each bucket, the pass performs a pairwise comparison of nodes using the `is_same_value` function, which directly compares the BDDs of the two nodes. If their BDDs are identical, the nodes are confirmed to be logically equivalent.
- Replacement: When two or more logically equivalent nodes are found (e.g., `node_A` and `node_B`, where `node_A` was encountered earlier in the node order and is chosen as the representative), the later node (`node_B`) is replaced by the earlier node (`node_A`): all users of `node_B` are rewired to use `node_A` instead, and the pass records that a change occurred.
- Post-DCE: After `BddCsePass` has completed, a `DeadCodeEliminationPass` is typically run. This removes any nodes that became dead code as a result of being replaced by an equivalent node, further cleaning up the IR.
Benefits:
- Semantic Equivalence: `BddCsePass` excels at identifying and commoning subexpressions that are semantically (logically) equivalent, even if their textual representation in the IR differs. This is a significant advantage over purely syntactic CSE approaches.
- Hardware Reduction: By eliminating redundant computations, the pass directly reduces the amount of hardware required, leading to smaller area, lower power consumption, and potentially faster circuits.
- Improved IR Quality: The resulting IR is more compact, canonical, and easier for subsequent optimization and synthesis tools to process.
Example:
Consider two expressions that compute the same Boolean function but have
different IR structures: `x_eq_42: bits[1] = eq(x, literal(42))` and
`forty_two_not_ne_x: bits[1] = not(ne(literal(42), x))`.
Syntactically, these expressions are different.
```
// Original IR snippet
fn FuzzTest(x: bits[16]) -> (bits[1], bits[1]) {
  forty_two: bits[16] = literal(value=42)
  x_eq_42: bits[1] = eq(x, forty_two)
  forty_two_not_ne_x: bits[1] = not(ne(forty_two, x))
  ret tuple.0: (bits[1], bits[1]) = tuple(x_eq_42, forty_two_not_ne_x)
}
```
BddCsePass would build BDDs for both x_eq_42 and forty_two_not_ne_x.
Since these two expressions are logically equivalent, their BDDs would be
identical. The pass would then replace one with the other (e.g.,
forty_two_not_ne_x would be replaced by x_eq_42), resulting in shared
logic:
```
// Optimized IR (after BddCsePass and a subsequent DCE pass)
fn FuzzTest(x: bits[16]) -> (bits[1], bits[1]) {
  forty_two: bits[16] = literal(value=42)
  x_eq_42: bits[1] = eq(x, forty_two)
  ret tuple.0: (bits[1], bits[1]) = tuple(x_eq_42, x_eq_42)
}
```
This effectively eliminates the redundant computation of the second comparison, leading to a more optimized design.
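The key idea, keying CSE on a canonical representation instead of on syntax, can be mimicked for tiny input spaces with exhaustive truth tables. The Python sketch below is only an illustrative stand-in for BDDs (real BDDs are canonical without enumerating all inputs and scale far better); it commons the two expressions from the example above:

```python
# CSE keyed on a canonical form: here an exhaustive truth table over a small
# input space stands in for a BDD. Illustrative only, not the pass's code.

def truth_table(fn, bits=8):
    """Canonical form of a 1-bit function of an unsigned `bits`-wide input."""
    return tuple(fn(x) for x in range(1 << bits))

exprs = {
    "x_eq_42": lambda x: int(x == 42),
    "forty_two_not_ne_x": lambda x: int(not (42 != x)),
}

canonical = {}    # canonical form -> representative expression name
replacement = {}  # expression name -> representative name
for name, fn in exprs.items():
    key = truth_table(fn)
    # First expression with this canonical form becomes the representative.
    replacement[name] = canonical.setdefault(key, name)

# The syntactically different second expression maps onto the first.
assert replacement["forty_two_not_ne_x"] == "x_eq_42"
```

As in the pass, earlier-encountered nodes act as representatives and later equivalent nodes are rewired onto them; DCE would then remove the orphaned computation.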
bdd_simp - BDD-based Simplification
Runs BDD-based simplifications on the function.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BddSimplificationPass leverages Binary Decision Diagrams (BDDs) to
perform advanced logical simplifications on the IR. BDDs provide a canonical
representation of Boolean functions, enabling powerful, logic-level
optimizations that are challenging for traditional syntactic or
dataflow-based passes.
The pass currently focuses on the following key areas:
- Replacement of Statically Known Values with Literals: Through BDD analysis, if a node's value (or specific bits of it) can be determined to be constant under all reachable conditions, it is replaced with a literal.
  Example: If `my_and: bits[1] = and(param_x, param_y)` and BDD analysis proves `param_x` is always `0b0`, then `my_and` can be replaced with `literal(0, bits=1)`.
- Removal of Redundant Operands in Boolean Operations: For N-ary Boolean operations (`and`, `nand`, `or`, `nor`), the pass identifies and removes operands that are logically redundant given the values or relationships of other operands. For instance, in `and(A, B, C)`, if `C` is always true whenever `A` and `B` are true, `C` is redundant.
  Example: If `C` is the predicate `a_and_b`:
  ```
  a_and_b: bits[1] = and(A, B)
  ret my_and: bits[1] = and(A, B, a_and_b)
  ```
  This can be simplified to:
  ```
  ret my_and: bits[1] = and(A, B)
  ```
- Collapse of Select Chains into OneHotSelects (at `opt_level >= 2`): Chains of binary `select` operations, particularly where one case of a `select` feeds into another `select` (e.g., `sel(pred_a, {val_a, sel(pred_b, {val_b, default_val})})`), are transformed into a single `one_hot_select` if the selectors are provably disjoint (at most one can be true at any given time). This simplifies the IR structure and can lead to more efficient hardware implementations.
- Replacement of Two-Way OneHotSelect with Select (at `opt_level >= 2`): A `one_hot_select` with a 2-bit selector that is known to be one-hot can be replaced with a simpler `select` operation, as the one-hot property means the decision effectively depends on a single bit. This reduces the complexity of the select operation.
- Simplification of PrioritySelect Operations (at `opt_level >= 1`): If BDD analysis determines that the selector of a `priority_select` is guaranteed to have at least one bit set within a certain range, the `priority_select` can be narrowed by trimming higher-priority cases that are guaranteed not to be selected, effectively reducing the number of cases the hardware needs to evaluate.
As BDD-based analysis can be computationally intensive, some optimizations
are applied conditionally based on the optimization level (opt_level).
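The select-chain collapse hinges on the selectors being provably disjoint. A hedged Python sketch of the value-level equivalence only (the `one_hot_select` helper uses simplified XLS-like OR-of-selected-cases semantics; the BDD disjointness proof itself is not modeled):

```python
# Value-level model: a chain of binary selects with disjoint predicates
# equals a single one-hot select. Illustrative, not the pass's implementation.

def one_hot_select(selector_bits, cases):
    """OR together every case whose selector bit is set (XLS-like semantics)."""
    acc = 0
    for i, case in enumerate(cases):
        if (selector_bits >> i) & 1:
            acc |= case
    return acc

val_a, val_b, default_val = 0xA0, 0x0B, 0x55

for pred_a in (0, 1):
    for pred_b in (0, 1):
        if pred_a and pred_b:
            continue  # the collapse requires provably disjoint selectors
        # Chain: pick val_a if pred_a, else val_b if pred_b, else default_val.
        chained = val_a if pred_a else (val_b if pred_b else default_val)
        # Collapsed: one-hot selector over {pred_a, pred_b, neither}.
        neither = int(not (pred_a or pred_b))
        selector = (neither << 2) | (pred_b << 1) | pred_a
        collapsed = one_hot_select(selector, [val_a, val_b, default_val])
        assert chained == collapsed
```

With disjoint predicates exactly one selector bit is set, so the OR degenerates to a plain case pick; if two predicates could be true simultaneously, the OR would merge cases and the rewrite would be unsound, which is why the BDD disjointness proof is required.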
bdd_simp(2) - BDD-based Simplification with opt_level <= 2
Runs BDD-based simplifications on the function.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BddSimplificationPass leverages Binary Decision Diagrams (BDDs) to
perform advanced logical simplifications on the IR. BDDs provide a canonical
representation of Boolean functions, enabling powerful, logic-level
optimizations that are challenging for traditional syntactic or
dataflow-based passes.
The pass currently focuses on the following key areas:
- Replacement of Statically Known Values with Literals: Through BDD analysis, if a node's value (or specific bits of it) can be determined to be constant under all reachable conditions, it is replaced with a literal.
  Example: If `my_and: bits[1] = and(param_x, param_y)` and BDD analysis proves `param_x` is always `0b0`, then `my_and` can be replaced with `literal(0, bits=1)`.
- Removal of Redundant Operands in Boolean Operations: For N-ary Boolean operations (`and`, `nand`, `or`, `nor`), the pass identifies and removes operands that are logically redundant given the values or relationships of other operands. For instance, in `and(A, B, C)`, if `C` is always true whenever `A` and `B` are true, `C` is redundant.
  Example: If `C` is the predicate `a_and_b`:
  ```
  a_and_b: bits[1] = and(A, B)
  ret my_and: bits[1] = and(A, B, a_and_b)
  ```
  This can be simplified to:
  ```
  ret my_and: bits[1] = and(A, B)
  ```
- Collapse of Select Chains into OneHotSelects (at `opt_level >= 2`): Chains of binary `select` operations, particularly where one case of a `select` feeds into another `select` (e.g., `sel(pred_a, {val_a, sel(pred_b, {val_b, default_val})})`), are transformed into a single `one_hot_select` if the selectors are provably disjoint (at most one can be true at any given time). This simplifies the IR structure and can lead to more efficient hardware implementations.
- Replacement of Two-Way OneHotSelect with Select (at `opt_level >= 2`): A `one_hot_select` with a 2-bit selector that is known to be one-hot can be replaced with a simpler `select` operation, as the one-hot property means the decision effectively depends on a single bit. This reduces the complexity of the select operation.
- Simplification of PrioritySelect Operations (at `opt_level >= 1`): If BDD analysis determines that the selector of a `priority_select` is guaranteed to have at least one bit set within a certain range, the `priority_select` can be narrowed by trimming higher-priority cases that are guaranteed not to be selected, effectively reducing the number of cases the hardware needs to evaluate.
As BDD-based analysis can be computationally intensive, some optimizations
are applied conditionally based on the optimization level (opt_level).
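The redundant-operand rule can be sanity-checked by brute force. The Python sketch below is a truth-table stand-in for the BDD implication proof the pass performs; the choice `C = or(A, B)` is a hypothetical predicate that happens to be true whenever both `A` and `B` are.

```python
from itertools import product

# C = or(A, B) is true whenever A and B are both true, so C is redundant
# in and(A, B, C). Enumerate all input combinations to confirm:
for A, B in product([0, 1], repeat=2):
    C = A | B
    assert (A & B & C) == (A & B)  # and(A, B, C) == and(A, B)
```

A BDD reaches the same conclusion symbolically, without enumerating inputs, which is what makes the analysis scale past a handful of variables.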
bdd_simp(3) - BDD-based Simplification with opt_level <= 3
Runs BDD-based simplifications on the function.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BddSimplificationPass leverages Binary Decision Diagrams (BDDs) to
perform advanced logical simplifications on the IR. BDDs provide a canonical
representation of Boolean functions, enabling powerful, logic-level
optimizations that are challenging for traditional syntactic or
dataflow-based passes.
The pass currently focuses on the following key areas:
- Replacement of Statically Known Values with Literals: Through BDD analysis, if a node's value (or specific bits of it) can be determined to be constant under all reachable conditions, it is replaced with a literal.

  Example: If `my_and: bits[1] = and(param_x, param_y)` and BDD analysis proves `param_x` is always `0b0`, then `my_and` can be replaced with `literal(0, bits=1)`.

- Removal of Redundant Operands in Boolean Operations: For N-ary Boolean operations (`and`, `nand`, `or`, `nor`), the pass identifies and removes operands that are logically redundant given the values or relationships of other operands. For instance, in `and(A, B, C)`, if `C` is always true whenever `A` and `B` are true, `C` is redundant.

  Example: If `C` is such a predicate:

  ```
  a_and_b: bits[1] = and(A, B)
  ret my_and: bits[1] = and(A, B, C)
  ```

  This can be simplified to:

  ```
  ret my_and: bits[1] = and(A, B)
  ```

- Collapse of Select Chains into OneHotSelects (at `opt_level >= 2`): Chains of binary `select` operations, particularly where one case of a `select` feeds into another `select` (e.g., `sel(pred_a, {val_a, sel(pred_b, {val_b, default_val})})`), are transformed into a single `one_hot_select` if the selectors are provably disjoint (at most one can be true at any given time). This simplifies the IR structure and can lead to more efficient hardware implementations.

- Replacement of Two-Way OneHotSelect with Select (at `opt_level >= 2`): A `one_hot_select` with a 2-bit selector that is known to be one-hot can be replaced with a simpler `select` operation, as the one-hot property means the decision effectively depends on a single bit. This reduces the complexity of the select operation.

- Simplification of PrioritySelect Operations (at `opt_level >= 1`): If BDD analysis determines that the selector of a `priority_select` is guaranteed to have at least one bit set within a certain range, the `priority_select` can be narrowed. This involves trimming higher-priority cases that are guaranteed not to be selected, effectively reducing the number of cases the hardware needs to evaluate.
As BDD-based analysis can be computationally intensive, some optimizations
are applied conditionally based on the optimization level (opt_level).
bitslice_simp - Bit-slice simplification
Pass which simplifies bit-slices. This includes collapsing sequential bit-slices, eliminating degenerate full-width slices, and others.
Note: What follows was generated by the Gemini LLM. Not human verified.
The BitSliceSimplificationPass is a crucial optimization pass that targets
and simplifies operations related to bit manipulation: bit_slice,
dynamic_bit_slice, and bit_slice_update. The overarching goal is to
reduce logical complexity, enhance hardware efficiency, and generate cleaner
IR by eliminating redundancies and transforming patterns into more canonical
forms that are easier for hardware synthesis tools to process.
The pass employs a combination of local pattern matching and advanced query
engine analysis (utilizing LazyTernaryQueryEngine or
PartialInfoQueryEngine for higher optimization levels) to identify and
capitalize on simplification opportunities.
Key simplifications performed by this pass include:
- Simplifying `BitSlice` Operations:
  * Full-Width Slices: A `bit_slice` that extracts the entire bit width of its operand (e.g., `bit_slice(x, start=0, width=N)`) is a no-operation and is directly replaced by its operand `x`.
  * Consecutive `BitSlice` Operations: Nested `bit_slice` operations (e.g., `bit_slice(bit_slice(x, s0, w0), s1, w1)`) are collapsed into a single `bit_slice(x, s0 + s1, w1)`, effectively combining the slicing operations.
  * Hoisting `BitSlice` above `Concat`: A `bit_slice` of a `concat` operation can often be rewritten as a `concat` of smaller `bit_slice`s of the original `concat` operands. This allows for more precise extraction of relevant bits and can eliminate unnecessary concatenations.
  * Hoisting `BitSlice` above Bitwise/Arithmetic Operations: If a `bit_slice` is the sole consumer of a bitwise operation (e.g., `and`, `or`, `xor`) or targets the low bits of an arithmetic operation (e.g., `add`, `sub`, `neg`), the `bit_slice` can be pushed down into the operands. This transforms `bit_slice(op(A, B), ...)` into `op(bit_slice(A, ...), bit_slice(B, ...))`, reducing the width of the intermediate `op` result.
  * Hoisting `BitSlice` above `SignExt`: Handles various cases where a `bit_slice` operates on a `sign_ext` operation. It simplifies this by transforming it into a direct slice of the original value or a `sign_ext` of a smaller slice, depending on where the slice falls relative to the sign bit of the extended value.
  * Hoisting `BitSlice` above Shifts and Decodes: If all users of a shift or decode operation are bit slices entirely contained within a larger slice, the `bit_slice` can be hoisted above the shift/decode. This effectively performs the shift/decode on a smaller bit-width operand, reducing computation.
  * Hoisting `BitSlice` above Selects: If all users of a `select` (or `one_hot_select`/`priority_select`) are bit slices entirely contained within a larger slice, the `bit_slice` can be hoisted above the `select`. This results in the `select` operating on smaller, sliced operands, which can reduce the multiplexer complexity.

- Simplifying `DynamicBitSlice` Operations:
  * Out-of-Bounds Slices: A `dynamic_bit_slice` whose start index is provably out of bounds (meaning it attempts to slice entirely beyond the operand's bit width) is replaced with a literal zero, as no data can be extracted.
  * Literal Start Index: A `dynamic_bit_slice` with a known constant start index is converted into a static `bit_slice` operation. This eliminates dynamic addressing logic in the hardware.
  * Hoisting `DynamicBitSlice` above `Select` of Literals: If the start index of a `dynamic_bit_slice` is the result of a `select` operation where all the cases are literal constant values, the `dynamic_bit_slice` can be hoisted. This replaces the original `select` with a new `select` whose cases are static `bit_slice` operations, each corresponding to one of the literal start indices.

    ```
    // Original IR
    start1: bits[32] = literal(value=5)
    start2: bits[32] = literal(value=25)
    p: bits[32] = select(x, cases=[start1, start2])
    q: bits[45] = dynamic_bit_slice(to_slice, p, width=45)

    // Optimized IR (conceptually)
    slice1: bits[45] = bit_slice(to_slice, start=5, width=45)
    slice2: bits[45] = bit_slice(to_slice, start=25, width=45)
    q: bits[45] = select(x, cases=[slice1, slice2])
    ```
  * Scaled Dynamic Bit Slices: Optimizes `dynamic_bit_slice` operations where the start index is a known multiple of the slice width. This involves conceptually treating the `to_slice` operand as an array of smaller bit elements and then using a `select` operation to choose the correct element from this conceptual array.

- Simplifying `BitSliceUpdate` Operations:
  * Out-of-Bounds Updates: A `bit_slice_update` whose start index is provably out of bounds is replaced by its `to_update` operand, as it would have no effect on the target bit vector.
  * Literal Start Index: A `bit_slice_update` with a known constant start index is replaced by an equivalent `concat` and `bit_slice` structure. This eliminates the need for dynamic update logic in the hardware.
  * Hoisting `BitSliceUpdate` above `Select` of Literals: As with `DynamicBitSlice` above a `select` of literals, if the start index of a `bit_slice_update` is derived from a `select` of literals, the `bit_slice_update` can be hoisted, resulting in static `bit_slice_update` operations for each case of the original `select`.
  * Scaled Bit Slice Updates: Optimizes `bit_slice_update` operations where the start index is a known multiple of the update value's width. This involves converting the `to_update` operand into an array of smaller bit elements, performing an `array_update` on this conceptual array, and then flattening the result back into a bit vector using `concat`.
The pass operates in a worklist-driven fashion, particularly for bit_slice
nodes, ensuring that simplifications are iteratively applied until a fixed
point is reached. This comprehensive approach guarantees a thorough
optimization of bit manipulation logic, crucial for high-quality hardware
generation.
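The two simplest rules above, full-width slices and consecutive slices, can be illustrated with plain integer arithmetic. This Python sketch is an illustrative model of the rewrite, not the pass implementation; the `bit_slice` helper is an invented stand-in for the IR operation.

```python
def bit_slice(value: int, start: int, width: int) -> int:
    """Model of bit_slice on a non-negative integer bit vector."""
    return (value >> start) & ((1 << width) - 1)

x = 0b1101_0110_1011  # a 12-bit value

# Consecutive slices collapse:
# bit_slice(bit_slice(x, s0, w0), s1, w1) == bit_slice(x, s0 + s1, w1)
nested = bit_slice(bit_slice(x, 2, 8), 3, 4)
collapsed = bit_slice(x, 2 + 3, 4)
assert nested == collapsed

# A full-width slice is a no-op:
assert bit_slice(x, 0, 12) == x
```

(The collapse is valid when the inner slice covers the bits requested by the outer one, which the pass checks before rewriting.)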
bool_simp - boolean simplification
Attempts to simplify bitwise / boolean expressions (e.g. of multiple variables).
Note: What follows was generated by the Gemini LLM. Not human verified.
The BooleanSimplificationPass focuses on optimizing logical expressions
within the IR. It achieves this by analyzing the truth table generated from
a local cluster of Boolean operations and replacing them with a simpler,
logically equivalent expression. This is particularly effective for reducing
complex combinations of and, or, not, nand, nor, and xor
operations into more fundamental and efficient forms.
The core mechanism involves:
- Truth Table Generation: For a given Boolean expression, the pass traces its inputs up to a small, manageable set of "frontier" nodes (typically parameters or results of non-Boolean operations). It then conceptually "flows" all possible combinations of true/false values from these frontier nodes through the intermediate Boolean operations to determine the resulting truth table of the entire expression. This is currently performed for up to 3 frontier nodes (X, Y, Z), resulting in an 8-bit truth table.

  Example (conceptual truth table for inputs X, Y, Z):

  ```
  X: 00001111 // Represents the value of X for all 8 input combinations
  Y: 00110011 // Represents the value of Y for all 8 input combinations
  Z: 01010101 // Represents the value of Z for all 8 input combinations
  // The pass computes the output bit vector based on the logic of the
  // expression; e.g., for and(X, Y), the result would be 00000011
  ```

- Truth Table Matching and Replacement: The computed N-bit truth table is then compared against a pre-computed set of canonical truth tables for various simple Boolean operations (e.g., `and`, `or`, `not`, `nand`). If a match is found and the canonical replacement is simpler (based on a cost heuristic), the original complex Boolean expression is replaced by this simpler form. This can also detect expressions that are always true or always false and replace them with literal values.

  Example (original expression involving `not` and `and`):

  ```
  fn f(x: bits[42], y: bits[42]) -> bits[42] {
    and_op: bits[42] = and(x, y)
    ret not_op: bits[42] = not(and_op)
  }
  ```

  After the BooleanSimplificationPass, this could be transformed into:

  ```
  fn f(x: bits[42], y: bits[42]) -> bits[42] {
    ret nand_op: bits[42] = nand(x, y)
  }
  ```

- Handling of Redundancy and Symmetry: The pass incorporates knowledge of Boolean algebra rules, including symmetric properties of operations (e.g., `and(A, B)` is equivalent to `and(B, A)`) and the identification of redundant inputs (e.g., in `and(X, X, Y)`, one `X` is redundant).

This pass is fundamental for logic optimization, as it can significantly reduce the complexity of Boolean logic, leading to more compact and faster hardware implementations.
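The truth-table mechanism can be reproduced with plain integer bitwise operations. This Python sketch is illustrative only (the pass operates on IR nodes, not integers); it flows the 8-bit input tables from the conceptual example above through `not(and(x, y))` and shows the result matches the canonical `nand` table.

```python
X, Y = 0b00001111, 0b00110011  # 8-row truth tables of frontier inputs X and Y
MASK = 0xFF                    # keep results to 8 bits

# Flow the input tables through the expression not(and(x, y)):
and_op = X & Y            # table for and(x, y): 0b00000011
not_op = ~and_op & MASK   # table for not(and(x, y)): 0b11111100

# Canonical truth table for nand(x, y):
nand_table = ~(X & Y) & MASK

# Match found: the expression can be replaced with a single nand(x, y).
assert not_op == nand_table
```

In the real pass this comparison runs against a pre-computed set of canonical tables, and the cheapest matching form wins.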
canon - Canonicalization
IR Canonicalization.
Note: What follows was generated by the Gemini LLM. Not human verified.
The CanonicalizationPass performs various transformations to bring the IR
into a standardized, canonical form. This simplifies subsequent optimization
passes by ensuring consistent patterns for common operations. By having a
predictable structure, other passes can focus on more complex logical
transformations without needing to handle numerous syntactic variations of
the same underlying operation.
Key canonicalizations include:
- Ordering of Literals in Commutative/Comparison Operations: For commutative operations (like `add`, `mul`, `and`, `or`, `xor`) and comparison operations, literals are consistently moved to the right-hand side. This means an expression like `add(constant, x)` becomes `add(x, constant)`. For comparisons, `ult(constant, x)` becomes `ugt(x, constant)`.

  Example:

  ```
  one: bits[2] = literal(value=1)
  ret addval: bits[2] = add(one, x)
  ```

  becomes:

  ```
  ret addval: bits[2] = add(x, one)
  ```

- Conversion of Subtraction to Addition with Negation: `sub(x, literal)` is transformed into `add(x, negate(literal))`.

  Example:

  ```
  one: bits[2] = literal(value=1)
  ret subval: bits[2] = sub(x, one)
  ```

  becomes (in 2-bit arithmetic, where -1 wraps around to 3):

  ```
  ret addval: bits[2] = add(x, literal(value=3))
  ```

- Removal of No-op Zero-extends: If a `zero_ext` operation extends to the same bit width as its operand, it is a no-op and is removed.

  Example:

  ```
  zero_ext: bits[16] = zero_ext(x, new_bit_count=16)
  ```

  becomes:

  ```
  x
  ```

- Replacement of Zero-extend with Concat: A `zero_ext` operation is replaced by a `concat` operation with a zero literal to explicitly show the bit extension.

  Example:

  ```
  zero_ext: bits[42] = zero_ext(x: bits[33], new_bit_count=42)
  ```

  becomes:

  ```
  concat: bits[42] = concat(literal(0, bits=9), x)
  ```

- Removal of Useless Bitwise Reductions: Bitwise reductions (`and_reduce`, `or_reduce`, `xor_reduce`) on single-bit operands are replaced by the operand itself.

  Example:

  ```
  and_reduce.1: bits[1] = and_reduce(x: bits[1])
  ```

  becomes:

  ```
  x
  ```

- Canonicalization of Clamp Operations: Various forms of clamping operations (min/max) implemented with `select` and comparison operations are canonicalized to a consistent `select` form.

- Inversion of Selector for Select Operations: If a 2-way `select` has a `not` operation as its selector, the `not` is removed and the cases are swapped.

  Example:

  ```
  not.1: bits[1] = not(p)
  ret sel.2: bits[8] = sel(not.1, cases=[x, y])
  ```

  becomes:

  ```
  ret sel.2: bits[8] = sel(p, cases=[y, x])
  ```

- Simplification of Select with Full Case Coverage: A `select` operation with a `default` value is transformed into a `select` without a `default` if the number of `cases` plus the `default` covers all possible selector values.

- Simplification of Next/StateRead Predicates: For `next_value` and `state_read` operations, predicates that are always true are removed, making the operation unconditional. If a `next_value` predicate is always false, the `next_value` is removed.
These canonicalizations aim to simplify the graph, make patterns more recognizable for other passes, and improve the overall efficiency of downstream optimizations.
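The subtraction-to-addition rewrite is correct because bit-vector arithmetic wraps around. A quick Python check at 2 bits (illustrative only; the `negate` helper is an invented model of two's-complement negation):

```python
WIDTH = 2
MASK = (1 << WIDTH) - 1  # 0b11

def negate(v: int) -> int:
    """Two's-complement negation at WIDTH bits; negate(1) == 3 when WIDTH == 2."""
    return (-v) & MASK

for x in range(1 << WIDTH):
    sub_result = (x - 1) & MASK          # sub(x, literal(value=1))
    add_result = (x + negate(1)) & MASK  # add(x, literal(value=3))
    assert sub_result == add_result
```

Because `sub` becomes `add` of a (constant-folded) negated literal, later passes only need to reason about one addition pattern instead of two.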
channel_legalization - Legalize multiple send/recvs per channel
Pass that legalizes multiple send/receive operations per channel.
This pass adds cross-activation tokens to guarantee that later activations of a proc cannot send or receive on a channel until all previous activations have completed working with that channel.
The ChannelLegalizationPass is an optimization pass designed to ensure the
correctness and determinism of hardware designs that contain multiple send
or receive operations targeting the same channel within a single Proc
activation. In hardware, simultaneous, unordered access to a shared channel
can lead to non-deterministic behavior, race conditions, or resource
conflicts. This pass addresses these issues by introducing explicit token
dependencies to enforce a well-defined ordering of these operations.
The pass operates by analyzing the communication patterns within Procs and
enforcing strictness rules based on the channel's strictness attribute.
This attribute, defined in the channel configuration, dictates how access to
a shared channel should be managed:
- `kTotalOrder`: All operations on the channel must adhere to a strict total order. The pass verifies whether the existing token graph already enforces this order; if not, it introduces mechanisms (cross-activation tokens) to explicitly enforce it.
- `kRuntimeOrdered`: Similar to `kTotalOrder`, but allows for dynamic ordering at runtime as long as correctness is maintained. The pass inserts assertions that fail at runtime if ordering violations occur.
- `kProvenMutuallyExclusive`: Operations on the channel are expected to be mutually exclusive at compile time (i.e., only one operation can be active in any given activation). The pass uses a Z3 solver to formally prove this mutual exclusivity; if it cannot be proven, a compile-time error is raised.
- `kRuntimeMutuallyExclusive`: Operations are expected to be mutually exclusive at runtime. The pass adds runtime assertions that will trigger an error during simulation or execution if simultaneous access occurs.
- `kArbitraryStaticOrder`: Allows for any static ordering of operations. The pass will introduce token dependencies to impose an arbitrary but deterministic static order if multiple operations are detected.
The core mechanism for enforcing ordering when multiple operations on a
channel exist is by introducing "cross-activation tokens." For each send or
receive operation that is part of a multiple-access group on a given
channel:
- New State Elements for Tokens: A new token-typed `StateElement` is added to the Proc for each `send` or `receive` operation that requires legalization. These new state elements act as implicit tokens, tracking the completion of operations across different activations of the Proc.

- `Next` Operations for New Tokens: For each newly added token state element, a `next_value` operation is introduced. This `next_value` is configured to update the token state with the token output of the corresponding `send` or `receive` operation, conditioned on the `send`/`receive` predicate (if it exists). This ensures that the token for the next activation is only available after the current activation's operation has completed.

- Modifying Original Tokens: The original token input to each `send` or `receive` operation is replaced with an `after_all` operation. This `after_all` operation combines the original incoming token with all the newly created implicit tokens from other operations on the same channel. This effectively creates a dependency chain, ensuring that a given `send`/`receive` cannot proceed until all other `send`/`receive` operations on that channel from the previous activation have completed, and its own incoming token is ready.
This entire process ensures that even when multiple send/receive
operations target the same channel, their execution is properly sequenced,
preventing data hazards and ensuring deterministic behavior in the generated
hardware. The pass also includes checks to ensure that channels of type
kStreaming (the default) are legalized, while other channel kinds are
skipped.
Example (simplified conceptual view for multiple receives on channel in
with kTotalOrder strictness, where no explicit token dependency exists
initially):
// Original IR snippet for a proc with multiple receives on 'in'
chan in(bits[32], id=0, kind=streaming, ops=receive_only,
flow_control=ready_valid, strictness=total_order)
top proc my_proc() {
tok: token = literal(value=token)
recv0: (token, bits[32]) = receive(tok, channel=in)
recv0_tok: token = tuple_index(recv0, index=0)
recv0_data: bits[32] = tuple_index(recv0, index=1)
recv1: (token, bits[32]) = receive(tok, channel=in)
// Also uses original 'tok', creating a conflict
recv1_tok: token = tuple_index(recv1, index=0)
recv1_data: bits[32] = tuple_index(recv1, index=1)
// ... other logic and next state
}
For this scenario, where recv0 and recv1 initially use the same tok
without explicit ordering, ChannelLegalizationPass would perform
transformations similar to:
// Optimized IR snippet (simplified, showing key changes)
chan in(bits[32], id=0, kind=streaming, ops=receive_only,
flow_control=ready_valid, strictness=total_order)
top proc my_proc() {
original_tok_input: token = literal(value=token) // Original token input
// New state elements to track tokens across activations
implicit_token__recv0_state: token = state_element(init=token)
implicit_token__recv1_state: token = state_element(init=token)
// recv0 now waits on implicit_token__recv1_state from previous activation
recv0: (token, bits[32]) =
receive(
after_all(
original_tok_input,
state_read(implicit_token__recv1_state)),
channel=in)
recv0_tok: token = tuple_index(recv0, index=0)
recv0_data: bits[32] = tuple_index(recv0, index=1)
// Update implicit_token__recv0_state
next (implicit_token__recv0_state, recv0_tok)
// recv1 now waits on implicit_token__recv0_state from previous activation
recv1: (token, bits[32]) =
receive(
after_all(
original_tok_input,
state_read(implicit_token__recv0_state)),
channel=in)
recv1_tok: token = tuple_index(recv1, index=0)
recv1_data: bits[32] = tuple_index(recv1, index=1)
// Update implicit_token__recv1_state
next (implicit_token__recv1_state, recv1_tok)
// ... other logic and next state
}
This introduces a circular dependency through the state, ensuring that
recv0 in activation N-1 must complete before recv1 in activation N
can proceed, and vice-versa, thereby enforcing a total order of access across
activations and preventing simultaneous access.
comparison_simp - Comparison Simplification
Simplifies logical operations on the results of comparison operations. For example:
```
eq(x, 0) && ne(x, 1)  =>  eq(x, 0)
eq(x, 0) && ne(x, 0)  =>  0
eq(x, 0) || ne(x, 0)  =>  1
```
Note: What follows was generated by the Gemini LLM. Not human verified.
The ComparisonSimplificationPass is an optimization pass that focuses on
simplifying logical operations that consume the results of comparison
operations. Its primary goal is to reduce redundant comparisons and
streamline complex Boolean expressions derived from equality, inequality, and
ordering checks, thereby leading to more efficient hardware implementations.
The pass operates in an iterative, fixed-point manner, continuously applying simplifications until no further changes can be made. It performs two main types of optimizations:
- Range-Based Simplification of Comparison/Logical Trees:
  * `ComputeRangeEquivalence`: This function analyzes single-bit nodes (which typically represent the results of comparisons or logical operations) and determines their equivalent range of values. A `RangeEquivalence` struct stores the original `Node` and an `IntervalSet` representing the precise set of values for which the node evaluates to true.
    * For basic comparison operations (e.g., `eq(x, 42)`, `ult(x, 10)`), it directly computes the `IntervalSet` that makes the comparison true.
    * For logical operations (`and`, `or`, `not`, `nand`, `nor`) whose operands are also comparison results (or other logical operations with known ranges), it combines their `IntervalSet`s using set intersection (for `and`), set union (for `or`), or set complement (for `not`, `nand`, `nor`).
  * Replacing with Constants or Simpler Comparisons: Once the `RangeEquivalence` for a node is accurately determined, the pass attempts to simplify the node itself:
    * Maximal Range (Always True): If the computed `IntervalSet` covers the entire possible range of values for `x`, the node (e.g., a complex Boolean expression) is replaced with a literal `1` (representing true).
    * Empty Range (Always False): If the `IntervalSet` is empty, indicating that the condition can never be met, the node is replaced with a literal `0` (representing false).
    * Precise Value: If the `IntervalSet` reduces to a single, precise value `C` (e.g., `x` must be `42`), the node is replaced with an `eq(x, C)` operation.
    * Complementary Precise Value: If the `IntervalSet` represents all values except a single value `C`, the node is replaced with a `ne(x, C)` operation.
    * ULt/UGt Interval: If the `IntervalSet` corresponds to a contiguous range starting from 0 up to `C` (e.g., `[0, 42]`), it is replaced with `ult(x, C+1)`. Conversely, if it is a range from `C` to the maximum value (e.g., `[42, MAX]`), it is replaced with `ugt(x, C-1)`.
  * Guards against De-optimization: The pass incorporates checks to prevent replacing existing simple comparison operations (like `eq`, `ne`, `ult`, `ugt`, etc.) or inversions of multi-user comparisons with new, potentially more complex, comparisons. This guards against unintended logic duplication and an increase in hardware area.

- `TransformDerivedComparisons`: This function identifies and eliminates redundant comparison operations by recognizing fundamental Boolean identities and relationships:
  * Commuted Operands: If a comparison `ult(x, y)` exists in the IR and a redundant `ugt(y, x)` is found, the latter is replaced by the former.
  * Inverted Comparison: If `ult(x, y)` exists and `uge(x, y)` is found, the latter can be replaced by `not(ult(x, y))`, reusing the existing comparison.
  * Inverted and Commuted Comparison: If `ult(x, y)` exists and `ule(y, x)` is found, it can be replaced by `not(ult(x, y))`, combining both the inversion and commutation optimizations.

By aggressively simplifying logical operations over comparisons, this pass significantly reduces the complexity of control logic, which is crucial for generating smaller and faster hardware designs.
Example: Consider the expression eq(x, 0) && ne(x, 1).
// Original IR snippet
fn f(x: bits[32]) -> bits[1] {
literal.0: bits[32] = literal(value=0)
literal.1: bits[32] = literal(value=1)
x_eq_0: bits[1] = eq(x, literal.0)
x_ne_1: bits[1] = ne(x, literal.1)
ret and.final: bits[1] = and(x_eq_0, x_ne_1)
}
The ComputeRangeEquivalence would determine that:
* x_eq_0 is equivalent to x being in the range [[0, 0]].
* x_ne_1 is equivalent to x being in [[0, 0], [2, MAX]].
When these two ranges are combined by the and operation, the resulting
effective range for x is [[0, 0]]. Since this represents x being
precisely equal to 0, the pass will simplify and.final to eq(x, 0):
// Optimized IR (simplified)
fn f(x: bits[32]) -> bits[1] {
literal.0: bits[32] = literal(value=0)
ret eq.final: bits[1] = eq(x, literal.0)
}
This effectively removes the redundant ne(x, 1) comparison, leading to
simpler logic.
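The interval reasoning in the example can be mimicked with a toy interval-set intersection. This Python sketch is not the pass's `IntervalSet` implementation; the `intersect` helper and the list-of-pairs representation are invented for illustration.

```python
# Interval sets are lists of inclusive (lo, hi) pairs over a 32-bit domain.
MAX = 2**32 - 1

def intersect(a, b):
    """Pairwise intersection of two interval sets."""
    out = []
    for lo1, hi1 in a:
        for lo2, hi2 in b:
            lo, hi = max(lo1, lo2), min(hi1, hi2)
            if lo <= hi:
                out.append((lo, hi))
    return out

x_eq_0 = [(0, 0)]             # values of x for which eq(x, 0) is true
x_ne_1 = [(0, 0), (2, MAX)]   # values of x for which ne(x, 1) is true

combined = intersect(x_eq_0, x_ne_1)  # range for and(eq(x, 0), ne(x, 1))
assert combined == [(0, 0)]           # a single precise value: rewrite to eq(x, 0)
```

Because the combined set collapses to the single value 0, the "Precise Value" rule fires and the whole `and` tree becomes one `eq` comparison.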
concat_simp - Concat simplification
Pass which simplifies concats. This includes removing single-operand concats, flattening trees of dependent concats, and others.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ConcatSimplificationPass is an optimization pass in XLS that focuses
on simplifying concat operations in the Intermediate Representation (IR).
Concatenation is a fundamental operation in hardware description languages,
used to combine smaller bit vectors into larger ones. This pass aims to
reduce the complexity, improve the readability, and optimize the hardware
implementation of concat operations and related patterns by applying
various rewrite rules.
The pass operates by iteratively processing concat nodes using a worklist
algorithm. This ensures that newly created or modified concat nodes are
re-evaluated for further simplification. Additionally, it performs a second
pass over other operations that interact with concats to identify further
optimization opportunities.
Here's a detailed breakdown of the key optimizations performed:
-
Removing Trivial Concats (Single Operand): If a
concatoperation has only one operand, it effectively acts as an identity operation. The pass replaces such aconcatwith its single operand, eliminating the redundant node and simplifying the dataflow.concat(x) => x -
Flattening Trees of Concats: Nested
concatoperations (e.g.,concat(A, concat(B, C))) are flattened into a singleconcatoperation with all the original leaf operands. This reduces the depth of the IR graph, simplifies subsequent analyses, and can lead to more compact hardware.concat(a, b, concat(c, concat(d, e))) => concat(a, b, c, d, e) -
Merging Consecutive Literal Operands: If a
concatoperation has two or more consecutive operands that are known literals, these literals are merged into a single, wider literal. This reduces both the number of operands to theconcatand the total number of literal nodes in the graph.concat(literal_1, literal_2, x, literal_3, literal_4) // => concat(merged_literal_1_2, x, merged_literal_3_4) -
Eliminating Zero-Bit Operands: Any operand to a
concatthat has a bit width of zero (e.g.,bits[0]) is functionally a no-op. The pass identifies and removes such operands from theconcat's operand list, cleaning up the IR. -
Hoisting
ReverseOperations aboveConcats: If aconcatoperation is used by a singlereverseoperation (and has no other non-reducible users), thereverseoperation is "hoisted" above theconcat. This means thereverseis applied individually to each operand of theconcat, and then those reversed operands are concatenated in reverse order. This can expose more optimization opportunities for the individualreverseoperations.reverse(concat(a, b, c)) => concat(reverse(c), reverse(b), reverse(a)) -
Merging Consecutive Bit Slices: If a
concathas consecutive operands that arebit_sliceoperations from the same source node and slice consecutive bits, these twobit_slices are merged into a single, widerbit_slice. This simplifies the slicing logic and reduces the number of intermediate nodes.concat(bit_slice(x, start=2, width=2), bit_slice(x, start=0, width=2)) // => concat(bit_slice(x, start=0, width=4)) -
Hoisting Bitwise Operations above
Concats (TryHoistBitWiseOperation): If a bitwise operation (e.g.,and,or,xor) has all its operands asconcatoperations, the bitwise operation can be "hoisted" above the concatenations. This means the operands of the originalconcats are sliced into corresponding bit ranges, the bitwise operation is applied to these slices, and then the results are re-concatenated. This often exposes opportunities for common subexpression elimination or further simplification on the smaller bitwise operations.xor(concat(a, b), concat(c, d)) => concat(xor(a, c), xor(b, d)) -
Hoisting Bitwise Operations with Constants above
Concats (TryHoistBitWiseWithConstant): This is a specialized version of hoisting bitwise operations. If a binary bitwise operation has one constant operand and oneconcatoperand, the bitwise operation can be distributed across theconcat's operands and combined with corresponding slices of the constant. This effectively narrows the operations.// Given x: bits[8], y: bits[8] and(0b10101011, concat(x, y)) // => concat(and(x, 0b1010), and(y, 0b1011)) -
Narrowing and Hoisting Bitwise Operations (
TryNarrowAndHoistBitWiseOperation): This optimization specifically targets scenarios where aconcatis used as an operand to a bitwise operation (likeand,or,xor), and theconcatitself has one constant operand (0 or all ones) and one variable operand. The goal is to narrow the bitwise operation to only the "interesting" bits (those from the variable operand) and hoist it above theconcat. This can significantly reduce the bit width of the affected operations.// Given x: bits[10], b: bits[4] or(x, concat(0b111111, b)) // concat(6 ones, b) // => concat(0b111111, or(bit_slice(x, 0, 4), b)) -
Bypassing Reduction of Concatenation (
TryBypassReductionOfConcatenation): If a bitwise reduction operation (e.g.,or_reduce,and_reduce,xor_reduce) is applied to aconcatoperation, it can be bypassed. The reduction is distributed to each operand of theconcat, and then a new non-reductive bitwise operation (e.g.,or,and,xor) combines the results. This can simplify complex reduction paths.or_reduce(concat(a, b)) => or(or_reduce(a), or_reduce(b)) -
Distributing Reducible Operations (
TryDistributeReducibleOperation): This optimization distributes eq and ne operations across concats. If eq(concat(A, B), C) is found, it is transformed into and(eq(A, slice(C)), eq(B, slice(C))). For ne, the transformation uses an or of nes. This helps "see through" concatenations to simplify comparisons with composite values. // Given x: bits[5], y: bits[10] eq(concat(x, y), literal(0, bits=15)) // => and(eq(x, literal(0, bits=5)), eq(y, literal(0, bits=10)))
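The eq-over-concat distribution can be sketched in Python on plain integers (an illustrative model, not the XLS implementation; eq_over_concat and eq_direct are hypothetical names):

```python
def eq_over_concat(x, y, x_bits, y_bits, k):
    """Compare concat(x, y) against constant k piecewise, mirroring
    TryDistributeReducibleOperation: each operand is compared with the
    matching slice of k."""
    k_hi = (k >> y_bits) & ((1 << x_bits) - 1)  # slice aligned with x
    k_lo = k & ((1 << y_bits) - 1)              # slice aligned with y
    return x == k_hi and y == k_lo

def eq_direct(x, y, x_bits, y_bits, k):
    """Reference semantics: materialize the concatenation, then compare."""
    return ((x << y_bits) | y) == k
```

The two functions agree for all in-range inputs; the distributed form replaces one wide comparison with two narrower ones that downstream passes can often simplify further.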
The ConcatSimplificationPass is a powerful and multi-faceted pass that
significantly contributes to the overall quality of the generated hardware by
producing a more compact, efficient, and readable IR.
cond_spec(Bdd) - Conditional specialization
Pass which specializes arms of select operations based on their selector
value. When analyzing a particular arm of a sel, the value of the
selector is known. This knowledge can be used to simplify the expression
within the arm.
Note: What follows was generated by the Gemini LLM. Not human verified.
For example, consider the following IR:
fn f(a: bits[1], b: bits[31], z: bits[32]) -> bits[32] {
concat: bits[32] = concat(a, b)
ret sel.2: bits[32] = sel(a, cases=[z, concat])
}
In the sel operation, the selector is a. When considering case[1] (the
concat operation), we know that a must have the value 1. Therefore, the
concat(a, b) can be simplified to concat(1, b). The pass will transform
the sel to:
sel.2: bits[32] = sel(a, cases=[z, concat(literal(1), b)])
This optimization applies to sel, priority_sel, and one_hot_sel. It can
also use Binary Decision Diagrams (BDDs) for more powerful analysis of the
conditions.
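The specialization described above can be mimicked in Python (an illustrative model with hypothetical names, not the pass itself):

```python
def sel(selector, cases):
    """Model of the XLS sel operation: pick a case by selector value."""
    return cases[selector]

def f(a, b, z):
    # a: 1 bit, b: 31 bits, z: 32 bits
    concat = (a << 31) | b            # concat(a, b)
    return sel(a, [z, concat])

def f_specialized(a, b, z):
    # Inside case[1] the selector a must be 1, so concat(a, b)
    # can be rewritten as concat(1, b).
    return sel(a, [z, (1 << 31) | b])
```

Both functions return identical results for every input, which is exactly the equivalence the pass relies on when it rewrites the arm.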
cond_spec(false) - Conditional specialization
Pass which specializes arms of select operations based on their selector
value. When analyzing a particular arm of a sel, the value of the
selector is known. This knowledge can be used to simplify the expression
within the arm.
Note: What follows was generated by the Gemini LLM. Not human verified.
For example, consider the following IR:
fn f(a: bits[1], b: bits[31], z: bits[32]) -> bits[32] {
concat: bits[32] = concat(a, b)
ret sel.2: bits[32] = sel(a, cases=[z, concat])
}
In the sel operation, the selector is a. When considering case[1] (the
concat operation), we know that a must have the value 1. Therefore, the
concat(a, b) can be simplified to concat(1, b). The pass will transform
the sel to:
sel.2: bits[32] = sel(a, cases=[z, concat(literal(1), b)])
This optimization applies to sel, priority_sel, and one_hot_sel. It can
also use Binary Decision Diagrams (BDDs) for more powerful analysis of the
conditions.
cond_spec(noBdd) - Conditional specialization
Pass which specializes arms of select operations based on their selector
value. When analyzing a particular arm of a sel, the value of the
selector is known. This knowledge can be used to simplify the expression
within the arm.
Note: What follows was generated by the Gemini LLM. Not human verified.
For example, consider the following IR:
fn f(a: bits[1], b: bits[31], z: bits[32]) -> bits[32] {
concat: bits[32] = concat(a, b)
ret sel.2: bits[32] = sel(a, cases=[z, concat])
}
In the sel operation, the selector is a. When considering case[1] (the
concat operation), we know that a must have the value 1. Therefore, the
concat(a, b) can be simplified to concat(1, b). The pass will transform
the sel to:
sel.2: bits[32] = sel(a, cases=[z, concat(literal(1), b)])
This optimization applies to sel, priority_sel, and one_hot_sel. It can
also use Binary Decision Diagrams (BDDs) for more powerful analysis of the
conditions.
cond_spec(true) - Conditional specialization
Pass which specializes arms of select operations based on their selector
value. When analyzing a particular arm of a sel, the value of the
selector is known. This knowledge can be used to simplify the expression
within the arm.
Note: What follows was generated by the Gemini LLM. Not human verified.
For example, consider the following IR:
fn f(a: bits[1], b: bits[31], z: bits[32]) -> bits[32] {
concat: bits[32] = concat(a, b)
ret sel.2: bits[32] = sel(a, cases=[z, concat])
}
In the sel operation, the selector is a. When considering case[1] (the
concat operation), we know that a must have the value 1. Therefore, the
concat(a, b) can be simplified to concat(1, b). The pass will transform
the sel to:
sel.2: bits[32] = sel(a, cases=[z, concat(literal(1), b)])
This optimization applies to sel, priority_sel, and one_hot_sel. It can
also use Binary Decision Diagrams (BDDs) for more powerful analysis of the
conditions.
const_fold - Constant folding
Pass which performs constant folding. Every op with only literal operands is replaced by an equivalent literal. Runs DCE after constant folding.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ConstantFoldingPass is an optimization pass in XLS that performs the
crucial task of constant folding across the Intermediate Representation (IR).
Its primary function is to identify and evaluate operations where all
operands are compile-time constants, and then replace these operations with a
single literal node that represents the computed result. This process
significantly simplifies the IR, reduces the number of active operations in
the data path, and ultimately contributes to smaller, faster, and more
efficient hardware.
The pass operates by traversing the nodes of a function or proc in topological order. For each node, it determines if it is "constant foldable" based on the following criteria:
-
Not Already a Literal: The node itself must not already be a
literal node, as no further simplification would be possible. -
Has Users (or Implicit Use): The node must be either consumed by at least one other node, or have an implicit use (e.g., being the return value of a function, a state element of a proc, or an I/O operation). Nodes without users are considered dead code and are typically removed by a subsequent Dead Code Elimination (DCE) pass.
-
Not Token-Typed: Nodes with a
token type cannot be folded into a literal, as tokens represent ordering constraints rather than data values. -
Not Side-Effecting (with exception for
gate): Operations that have side effects (e.g., send, receive) generally cannot be constant-folded, as their execution order and effects are crucial for correctness. An exception is made for gate operations, which can be folded if their condition and data inputs are constant, as the gate effectively becomes a simple pass-through or a zero. -
All Operands are Known Constants: Critically, all of the node's operands must either be
literal values themselves, or be other nodes whose values are provably constant at compile time (as determined by a StatelessQueryEngine).
If a node successfully meets these criteria, the pass proceeds to:
-
Collect the
Value of each of its constant operands. -
Interpret the operation using these constant
Values to compute its result. -
Replace the original operation node with a new
literal node containing the computed Value.
After the ConstantFoldingPass is applied, a DeadCodeEliminationPass is
typically run to remove any operations that became dead as a result of their
uses being replaced by literals.
Example:
Consider a function with an add operation whose operands are both literals:
// Original IR
fn IdenticalLiterals() -> bits[8] {
literal.1: bits[8] = literal(value=42)
literal.2: bits[8] = literal(value=123)
ret add.3: bits[8] = add(literal.1, literal.2)
}
The ConstantFoldingPass would identify add.3 as constant foldable. It
would evaluate 42 + 123 to 165 and replace add.3 with a new literal
node:
// Optimized IR (after ConstantFolding and a subsequent DCE pass)
fn IdenticalLiterals() -> bits[8] {
ret literal.new: bits[8] = literal(value=165)
}
This optimization is foundational for many other passes, as it reduces complex expressions to their simplest forms, making subsequent analyses and transformations more effective.
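The folding rule can be sketched in Python over a toy expression tree (a hedged model; the tuple encoding and 8-bit wrap are assumptions for illustration, not the XLS IR):

```python
def const_fold(node):
    """Fold ('add', lhs, rhs) nodes whose operands are all literals.
    Nodes are tuples: ('literal', value), ('param', name), or (op, *operands)."""
    op = node[0]
    if op in ('literal', 'param'):
        return node
    operands = [const_fold(o) for o in node[1:]]
    if op == 'add' and all(o[0] == 'literal' for o in operands):
        # bits[8] semantics: addition wraps modulo 2**8.
        return ('literal', (operands[0][1] + operands[1][1]) & 0xFF)
    return (op, *operands)

expr = ('add', ('literal', 42), ('literal', 123))
# const_fold(expr) -> ('literal', 165), matching the example above
```

Note that an add with a non-literal operand is left untouched, mirroring the "all operands are known constants" criterion.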
cse - Common subexpression elimination
Pass which performs common subexpression elimination. Equivalent ops with the same operands are commoned. The pass can find arbitrarily large common expressions.
Note: What follows was generated by the Gemini LLM. Not human verified.
The CsePass (Common Subexpression Elimination) is an optimization pass in
XLS that identifies and merges equivalent computations within a function or
proc. The primary goal is to eliminate redundant calculations, thereby
reducing the size of the generated hardware, improving performance, and
making the IR more compact.
The pass operates by building a canonical representation (CseNode) for each
operation in the IR. Two operations are considered equivalent if they have:
- The same opcode (Op).
- The same operand list (with a canonical ordering applied for commutative operations like add or and).
- The same additional configuration data (e.g., start and width for bit_slice, index for tuple_index).
- The same result type (Type).
Here's a breakdown of how the pass works:
-
CseNode Representation: A CseNode object is created for each Node in the IR. This CseNode captures the essential characteristics that define equivalence, including the opcode, a sorted list of its operands (canonicalized for commutative operations), any non-operand configuration data, and the result type. For side-effecting operations (other than gate) and certain control-flow-related operations (param, next, receive, send, assert, cover, etc.), a unique identifier is assigned to prevent them from being considered equivalent, even if their underlying data appears identical. -
CseNodeArena: A CseNodeArena manages the creation and storage of CseNode objects. It ensures that only one CseNode instance exists for each unique canonical operation. When a Node is processed, the arena either returns an existing CseNode (if an equivalent one has already been seen) or creates a new one. -
Equivalence Grouping: The pass traverses the IR in topological order. For each
Node, it obtains its CseNode representation. All Nodes that map to the same CseNode are grouped together into an "equivalence bucket." -
Replacement: After all nodes have been processed and grouped:
* For each equivalence bucket containing more than one
Node, a "representative" node is chosen from that bucket. The selection criteria for the representative prioritize nodes with shorter assigned names, and then lower node IDs for determinism. * All other
Nodes in the bucket (the "duplicates") are then replaced by the representative node. This means all users of the duplicate nodes are rewired to use the single representative node instead. * The pass can optionally common
literal nodes. If common_literals is false, each literal is treated as unique, preventing them from being merged. This can be useful if preserving distinct literal nodes is important for some later pass. -
Post-DCE: After
CsePass runs, a DeadCodeEliminationPass is typically executed. This is because replacing duplicate nodes with a representative leaves the original duplicate nodes with no remaining users, making them dead code that can then be safely removed from the IR.
Example:
Consider a function with two identical and operations, followed by
identical neg and or operations:
// Original IR snippet
fn nontrivial(x: bits[8], y: bits[8], z: bits[8]) -> bits[8] {
and.1: bits[8] = and(x, y)
neg.2: bits[8] = neg(and.1)
or.3: bits[8] = or(neg.2, z)
and.4: bits[8] = and(x, y) // Identical to and.1
neg.5: bits[8] = neg(and.4) // Identical to neg.2
or.6: bits[8] = or(neg.5, z) // Identical to or.3
ret add.7: bits[8] = add(or.3, or.6)
}
The CsePass would identify that and.1 and and.4 are equivalent; neg.2
and neg.5 are equivalent; and or.3 and or.6 are equivalent. It would
choose one of each pair as a representative and replace the duplicates.
After a subsequent DCE pass, the IR would be simplified to:
// Optimized IR (after CsePass and DCE)
fn nontrivial(x: bits[8], y: bits[8], z: bits[8]) -> bits[8] {
and.1: bits[8] = and(x, y)
neg.2: bits[8] = neg(and.1)
or.3: bits[8] = or(neg.2, z)
ret add.7: bits[8] = add(or.3, or.3) // or.6 replaced by or.3
}
This effectively shares the computation of the redundant and, neg, and
or operations, significantly reducing redundant logic and hardware
requirements.
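The bucketing by canonical key can be sketched in Python (an illustrative model; the node tuples and the cse name are assumptions, not the CseNode machinery):

```python
COMMUTATIVE = {'add', 'and', 'or', 'xor'}

def cse(nodes):
    """nodes: list of (name, op, operands) in topological order.
    Returns a map from each node name to its representative's name."""
    rep = {}    # node name -> representative name
    seen = {}   # canonical key -> first name seen with that key
    for name, op, operands in nodes:
        ops = tuple(rep.get(o, o) for o in operands)   # canonicalize operands
        if op in COMMUTATIVE:
            ops = tuple(sorted(ops))                   # canonical operand order
        rep[name] = seen.setdefault((op, ops), name)
    return rep
```

Running this over the example above yields rep['and.4'] == 'and.1', rep['neg.5'] == 'neg.2', and rep['or.6'] == 'or.3', so the duplicates cascade onto their representatives in one topological sweep.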
dataflow - Dataflow Optimization
An optimization which uses a lattice-based dataflow analysis to find equivalent nodes in the graph and replace them with a simpler form. The analysis traces through tuples, arrays, and select operations. Optimizations which can be performed by this pass:
tuple_index(tuple(x, y), index=1) => y
select(selector, {z, z}) => z
array_index(array_update(A, x, index={42}), index={42}) => x
Note: What follows was generated by the Gemini LLM. Not human verified.
The DataflowSimplificationPass is an optimization pass in XLS that
leverages a lattice-based dataflow analysis to identify and replace logically
equivalent nodes in the Intermediate Representation (IR) with simpler, more
direct forms. This pass is highly effective at "seeing through" structural
operations such as tuples, arrays, and select statements to eliminate
redundant computations and expose values that are directly available from
their original source.
The overarching goal is to reduce the overall complexity and size of the IR,
leading to more compact and efficient hardware designs.
How it Works:
-
NodeSourceAnalysis: The core mechanism of this pass is the NodeSourceAnalysis (implemented via NodeSourceDataflowVisitor). This analysis precisely determines the "source" of every elemental component (individual bits, tuple elements, array elements) within each node in the IR graph. A "source" is defined as the original node or literal from which a particular piece of data originates, tracing through intermediate operations like tuple_index, array_index, bit_slice, and concat. For example, if an operation is tuple_index(tuple(x, y), 1), the effective source of this operation's output is directly y. -
Equivalence Mapping (
source_map): The pass performs a topological sort of the function's nodes. During this traversal, it constructs a hash map (source_map). The keys of this map are LeafTypeTreeView<NodeSource> objects, which uniquely identify the "source" of a node's data. The values are pointers to the Node* that represents that canonical source. If two distinct nodes are found to have the same LeafTypeTreeView<NodeSource>, they are semantically equivalent. -
Simplification via Replacement: * Direct Replacement of Equivalent Nodes: As the pass traverses the graph in topological order, if the
NodeSource of the current node (node) is already present in the source_map (indicating that a logically equivalent node (it->second) has already been processed), then the current node is directly replaced by it->second. This effectively eliminates redundant computations where the same value is derived through different paths.
// Original:
tuple.1: (bits[2], bits[42]) = tuple(x, y)
ret tuple_index.2: bits[42] = tuple_index(tuple.1, index=1)
// After optimization:
ret y: bits[42] = param(name=y) // tuple_index.2 replaced by y
* Special Handling for Empty Types: Nodes representing empty types (e.g., zero-width bits, empty tuples) are handled with care. While they are technically equivalent to each other, replacing them indiscriminately can lead to infinite loops if, for example, two parameters of empty types repeatedly replace each other's uses. To prevent this, the pass explicitly skips replacing nodes of empty types with other equivalent empty-type nodes.
-
MaybeReplaceWithArrayOfExistingNodes (Array Optimization): This auxiliary function specifically targets array nodes. If an array node is constructed entirely from elements that are themselves existing nodes in the graph (and whose NodeSources are already in the source_map), the array node can be replaced by a more direct array literal or a reference to an existing array operation. This is useful for: * Unboxing Arrays: Simplifying array_index(array(x), index=0) to just x. * Reconstructing Arrays from Updates: If a series of
array_update operations effectively reconstructs an array from individual elements (some of which might be initial values), and the pass can track these sources, it can replace the array_update chain with a direct array operation. * Eliminating Redundant Array Construction: If an
array is constructed whose elements are simply array_index operations from another existing array, the entire array construction can be replaced by the original array.
// Original:
element_0: bits[32] = array_index(a, indices=[zero])
element_1: bits[32] = array_index(a, indices=[one])
element_2: bits[32] = array_index(a, indices=[two])
ret array: bits[32][3] = array(element_0, element_1, element_2)
// After optimization:
ret a: bits[32][3] = param(name=a) // Replaced by original array 'a'
Benefits:
- Semantic CSE: Performs a powerful form of common subexpression elimination that identifies logically equivalent computations even if their IR structures are syntactically different.
- IR Simplification: Reduces the number of nodes in the IR by eliminating redundant computations and simplifying dataflow paths through aggregate types (tuples and arrays).
- Hardware Efficiency: Leads directly to more compact and efficient hardware by removing redundant logic and simplifying complex data structures, which translates to better area and performance metrics.
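The source-tracing idea can be sketched for tuples in Python (a toy model; the node encoding and the source_of name are assumptions, not NodeSourceAnalysis itself):

```python
def source_of(node):
    """Trace a node to its originating source(s), seeing through
    tuple construction and tuple_index."""
    op = node[0]
    if op == 'tuple':
        return tuple(source_of(e) for e in node[1:])
    if op == 'tuple_index':
        return source_of(node[1])[node[2]]   # select the indexed element
    return node  # params and literals are their own source

x, y = ('param', 'x'), ('param', 'y')
ti = ('tuple_index', ('tuple', x, y), 1)
# source_of(ti) -> ('param', 'y'), so ti can be replaced by y directly
```

Two nodes with equal traced sources are semantically equivalent, which is what the source_map lookup exploits.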
dce - Dead Code Elimination
The DeadCodeEliminationPass iterates up from a function's result nodes and marks all visited nodes. After that, all unvisited nodes are considered dead.
Note: What follows was generated by the Gemini LLM. Not human verified.
The DeadCodeEliminationPass (DCE) is a fundamental optimization pass in XLS
that removes "dead" or "unreachable" code from the Intermediate
Representation (IR). Dead code refers to any operation whose result is not
used, either directly or indirectly, to produce the final output of a
function or any side-effecting operation. By removing such code, DCE
simplifies the IR graph, reduces the size of the generated hardware, and can
improve the performance of subsequent optimization passes by reducing the
amount of code they need to analyze.
How it Works:
The pass operates by identifying "live" code—operations that are essential for the computation—and then removing everything else. This is achieved through a simple and efficient graph traversal algorithm:
-
Identifying Live Roots: The pass starts by identifying the initial set of "live" nodes. These are nodes that are inherently live because they represent the essential outputs or side effects of the function or proc. This set includes: * The function's return value.
* Any side-effecting operations (e.g.,
assert, cover, trace, send). * Proc
next state update nodes. * Nodes with implicit uses (e.g., output ports in a block).
-
Worklist-based Traversal: A worklist is initialized with all nodes that are initially considered dead (i.e., those with no users and which are deletable). The pass then iteratively removes a node from the worklist and, for each of its operands, checks if that operand has become dead as a result. If so, the operand is added to the worklist.
-
Determining Deletability: A node is considered "deletable" if it is not a root and does not have any side effects. The pass includes a specific
is_deletable check which ensures that: * Nodes with implicit uses (like block ports) are not removed. * Most side-effecting operations (
assert, cover, trace, send, etc.) are not removed, as they are considered observable effects. gate is a special case that is considered removable. *
invoke nodes are generally not removed by DCE. This is a conservative approach, as the invoked function might have side effects that are not immediately apparent. InliningPass is responsible for handling the removal of invoke nodes. -
Node Removal: As the pass identifies dead nodes, it removes them from the function's IR graph. This process continues until the worklist is empty, at which point only live code remains.
Benefits:
- Reduced Hardware Size: The most direct benefit is a reduction in the amount of logic that needs to be synthesized, leading to smaller and more area-efficient hardware.
- Improved Performance: By removing unnecessary computations, the pass can sometimes shorten critical paths, leading to improved timing performance.
- Enhanced IR Quality: A cleaner, more compact IR is easier for both humans and subsequent optimization passes to analyze and transform, which can lead to better overall optimization results.
Example: Consider a function with several operations, some of which do not contribute to the final result:
// Original IR snippet
fn some_dead_code(x: bits[42], y: bits[42]) -> bits[42] {
neg.1: bits[42] = neg(x)
add.2: bits[42] = add(x, y) // Dead: not used by any live node
neg.3: bits[42] = neg(add.2) // Dead: only user is also dead
ret sub.4: bits[42] = sub(neg.1, y)
}
The DeadCodeEliminationPass would perform the following steps:
1. Start with the live root, sub.4 (the function's return value).
2. The live set becomes {sub.4, neg.1, y}. x is also live since it is used by neg.1.
3. Nodes add.2 and neg.3 are identified as having no users in the live set. Since they have no other users and are not side-effecting, they are considered dead.
4. The pass removes add.2 and neg.3 from the graph.
The final, optimized IR would be:
// Optimized IR (after DCE)
fn some_dead_code(x: bits[42], y: bits[42]) -> bits[42] {
neg.1: bits[42] = neg(x)
ret sub.4: bits[42] = sub(neg.1, y)
}
The pass has successfully eliminated the unused add and neg operations,
resulting in a more efficient implementation.
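The mark phase can be sketched as a worklist traversal in Python (a simplified model; side effects and implicit uses are folded into the roots argument, and live_nodes is an illustrative name):

```python
def live_nodes(operands, roots):
    """operands: dict mapping node name -> list of operand names.
    roots: initially live names (return value, side-effecting ops).
    Returns the set of live node names; everything else is dead."""
    live = set()
    worklist = list(roots)
    while worklist:
        n = worklist.pop()
        if n in live:
            continue
        live.add(n)
        worklist.extend(operands.get(n, []))  # operands of a live node are live
    return live

graph = {'neg.1': ['x'], 'add.2': ['x', 'y'], 'neg.3': ['add.2'],
         'sub.4': ['neg.1', 'y'], 'x': [], 'y': []}
# live_nodes(graph, ['sub.4']) -> {'sub.4', 'neg.1', 'x', 'y'};
# add.2 and neg.3 are dead, matching the example above
```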
dfe - Dead Function Elimination
Dead Function Elimination.
Note: What follows was generated by the Gemini LLM. Not human verified.
This pass removes unreachable functions, procs, and blocks (collectively
FunctionBases) from the package. It is crucial for reducing the size and
complexity of the IR by discarding any components that do not contribute to
the final output of the top-level entity.
The pass requires that a top entity (function, proc, or block) be set in
the package to determine the starting point of liveness analysis.
The process involves:
-
Identifying Live Roots: The pass first identifies the initial set of "live"
FunctionBases. This typically includes: * The explicitly designated top-level function, proc, or block. * For old-style procs, any proc that communicates with external channels (e.g.,
send_only or receive_only) is considered live. Liveness then propagates to other procs that communicate with these live procs over internal channels. * For new-style procs and blocks, liveness is determined by which other procs or blocks are instantiated by the top-level proc/block.
-
Transitive Closure Analysis: A depth-first search (DFS) is performed starting from these live roots. During this traversal, all directly or indirectly referenced
FunctionBases (e.g., functions invoked by other functions, body functions of counted_for loops, instantiated blocks) are marked as "reached" or "live". -
Elimination of Dead Code: After the traversal: * Any
FunctionBase that was not marked as reached is considered dead and is removed from the package. * Any channels that are no longer associated with any live proc are also removed.
This pass ensures that the final IR contains only the necessary logic, simplifying subsequent optimization and synthesis steps.
Example:
Consider a package with three functions: f_live, f_dead, and f_top.
f_top is set as the top entity and invokes f_live. f_dead is defined
but never invoked by f_top or f_live.
package my_package
fn f_dead() -> bits[32] {
ret literal.0: bits[32] = literal(value=42)
}
fn f_live(x: bits[32]) -> bits[32] {
ret add.1: bits[32] = add(x, literal(value=1))
}
top fn f_top(y: bits[32]) -> bits[32] {
invoke.0: bits[32] = invoke(y, to_apply=f_live)
ret invoke.0
}
After DeadFunctionEliminationPass runs, f_dead will be removed, and the
package will contain only f_live and f_top.
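The transitive-closure step can be sketched in Python (an illustrative model over a plain call-graph dict, not the XLS CallGraph API; reachable_functions is a hypothetical name):

```python
def reachable_functions(call_graph, top):
    """call_graph: dict mapping function name -> list of callees.
    Returns the set of functions reachable from top; the rest are dead."""
    reached = set()
    stack = [top]
    while stack:
        fn = stack.pop()
        if fn in reached:
            continue
        reached.add(fn)
        stack.extend(call_graph.get(fn, []))
    return reached

graph = {'f_top': ['f_live'], 'f_live': [], 'f_dead': []}
# reachable_functions(graph, 'f_top') -> {'f_top', 'f_live'};
# f_dead is never reached and is removed, matching the example above
```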
fixedpoint_proc_state_flattening - Proc State Flattening
Prepare proc state for further analysis by removing arrays and tuples.
Options Set
Run to a fixedpoint.
Invoked Passes
fixedpoint_simp - Fixed-point Simplification
Standard simplification pipeline.
This is run to a fixedpoint and avoids many time-consuming analyses.
Options Set
Run to a fixedpoint.
Invoked Passes
fixedpoint_simp(2) - Max-2 Fixed-point Simplification
Standard simplification pipeline.
Opt level is capped at 2
Options Set
Run to a fixedpoint.
Cap opt level: 2
Invoked Passes
fixedpoint_simp(3) - Max-3 Fixed-point Simplification
Standard simplification pipeline.
Opt level is capped at 3
Options Set
Run to a fixedpoint.
Cap opt level: 3
Invoked Passes
fixedpoint_simp(>=1,<=2) - Min-1 Max-2 Fixedpoint Simplification
Standard simplification pipeline.
Opt level is capped at 2 and skipped if less than 1
Options Set
Run to a fixedpoint.
Min opt level: 1
Cap opt level: 2
Invoked Passes
full-inlining - full function inlining passes
Fully inline all functions in a single step.
Invoked Passes
ident_remove - Identity Removal
Identity Removal - eliminate all identity() expressions.
Note: What follows was generated by the Gemini LLM. Not human verified.
The IdentityRemovalPass is an optimization pass in XLS that focuses on
simplifying the Intermediate Representation (IR) by eliminating all
identity() expressions. An identity() operation simply passes its single
operand through unchanged. While syntactically valid, these operations
introduce unnecessary nodes into the graph, increasing its complexity without
contributing any new computational value.
How it Works:
-
Traversal: The pass performs a forward traversal through all nodes in the function or proc.
-
Identification: For each node encountered, it checks if the node's opcode (
op()) is Op::kIdentity. -
Replacement: If an
identity() node is found, all its uses are directly replaced with its single operand. This means any node that was consuming the output of the identity() operation will now directly consume the input to the identity() operation. -
Removal (Implicitly by DCE): After all uses of an
identity() node have been replaced, that identity() node itself becomes dead code (it no longer has any consumers). A subsequent DeadCodeEliminationPass is typically run after IdentityRemovalPass to remove these now-unused identity() nodes, further cleaning up the IR.
Benefits:
- IR Simplification: Directly reduces the number of nodes in the IR, making the graph simpler and easier for subsequent passes to analyze and transform.
- Hardware Reduction: Eliminates unnecessary wires and buffers that
might otherwise be generated for
identity() operations in hardware, contributing to smaller area.
- Improved Performance: Removing redundant operations can slightly improve simulation and synthesis performance by reducing the total amount of logic.
Example:
Consider a function with a chain of identity() operations that pass through
intermediate results:
// Original IR snippet
fn simple_neg(x: bits[8]) -> bits[8] {
one: bits[8] = literal(value=1)
v1: bits[8] = identity(x)
add: bits[8] = add(v1, one)
v2: bits[8] = identity(add)
v3: bits[8] = identity(v2)
ret add2: bits[8] = sub(v3, one)
}
The IdentityRemovalPass would process v1, v2, and v3, replacing their
uses with their respective operands. After this pass and a subsequent
DeadCodeEliminationPass, the IR would be simplified to:
// Optimized IR (after IdentityRemovalPass and DCE)
fn simple_neg(x: bits[8]) -> bits[8] {
one: bits[8] = literal(value=1)
add: bits[8] = add(x, one) // v1 replaced by x
// v3 replaced by add (which was v2's replacement)
ret add2: bits[8] = sub(add, one)
}
This effectively removes all the intermediate identity() nodes,
streamlining the dataflow and simplifying the overall design.
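The rewiring can be sketched in Python (a toy model; the node-tuple encoding and the remove_identities name are assumptions for illustration):

```python
def remove_identities(nodes):
    """nodes: list of (name, op, operands) in topological order.
    Drops identity nodes and rewires their users to the forwarded operand."""
    fwd = {}   # identity node name -> the operand it forwards
    out = []
    for name, op, operands in nodes:
        operands = tuple(fwd.get(o, o) for o in operands)
        if op == 'identity':
            fwd[name] = operands[0]   # chains of identities collapse transitively
        else:
            out.append((name, op, operands))
    return out
```

Applied to the simple_neg example, the three identity nodes vanish and the add/sub nodes consume x and add directly, matching the optimized IR shown above.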
inlining - Inlines invocations
Inlines a package toward the top function/proc.
If full then all functions are inlined into the top.
If leaf then only leaf functions are inlined into their caller. This allows
other passes to optimize on smaller graphs.
Note: What follows was generated by the Gemini LLM. Not human verified.
The InliningPass is a fundamental optimization pass in XLS that transforms
invoke operations (function calls) by replacing them with the actual body
of the invoked function. This is a powerful compiler optimization that
eliminates the overhead associated with function calls, exposes more
optimization opportunities for subsequent passes (such as Common
Subexpression Elimination or Dead Code Elimination), and can ultimately lead
to more monolithic and potentially faster hardware designs.
The pass supports two primary modes of inlining, controlled by the
InlineDepth enum:
-
kFull (Full Inlining): In this mode, the pass attempts to recursively inline all functions into their callers, working its way up the call graph towards the top-level function or proc. The process continues until only the top-level entity remains (or until it encounters foreign functions, which are external hardware blocks and cannot be inlined). -
kLeafOnly (Leaf-Only Inlining): This mode selectively inlines only "leaf" functions, i.e., those that do not call any other functions themselves (or only call foreign functions). This can be beneficial when other optimization passes are more effective on smaller, already-inlined graphs, or when fine-grained control over inlining is desired. It also inlines functions that have a single caller and only call leaf functions, or another function which is already being inlined.
How it Works:
-
Call Graph Analysis: The pass initially constructs a
CallGraph for the entire package. This graph precisely depicts the calling relationships between different functions. -
Inlining Order: Functions are processed for inlining in a post-order traversal of the call graph (i.e., leaf functions are processed first). This strategic order ensures that when a function
Foo is inlined into its callers, any functions called by Foo have already been inlined (if eligible), thereby preventing redundant work and simplifying the overall inlining process. -
InlineInvokeFunction: This template function is responsible for the actual inlining of a single invoke operation: * Operand Mapping: It establishes a mapping from the parameters of the invoked function to the corresponding actual arguments (operands) of the invoke instruction. * Node Cloning and Rewiring: It traverses the body of the invoked function in topological order. For each node within the invoked function, it clones the node into the caller function. During this cloning process, any references to parameters of the invoked function are transparently replaced with their corresponding arguments from the
invoke operation. * Source Location Propagation: Source location information is meticulously merged from the original nodes and the
invoke instruction to maintain traceability and debugging information. * Name Propagation: Node names are propagated intelligently to preserve readability after inlining: * If the invoked function's parameter name is a prefix of a node's name, and the corresponding
invoke operand also has an assigned name, the inlined node's name is derived by substituting the parameter name with the operand's name. * If the invoke instruction itself has an assigned name, the return value of the inlined function inherits that name. * If neither of the above applies, and the inlined node had a name derived from a parameter, that derived name is used. * Cover and Assert Label Deduplication: For
cover and assert operations, if inlining would result in multiple instances with identical labels, the labels are modified to include a unique prefix (derived from the caller function and a unique inline_count) to ensure uniqueness in the generated Verilog. * Replacement and Removal: Finally, the original
invoke instruction is replaced by the return value of the inlined function (or a tuple if the function returns multiple values), and the invoke node itself is removed from the caller function as it is no longer needed.
By aggressively inlining functions, InliningPass creates larger, flatter
graphs that provide subsequent optimization passes with a broader scope for
analysis and transformation, ultimately leading to more optimized hardware
designs.
Example (Full Inlining):
Consider a caller function that invokes a callee function:
```
// Original IR
package some_package

fn callee(x: bits[32], y: bits[32]) -> bits[32] {
  ret add.1: bits[32] = add(x, y)
}

fn caller() -> bits[32] {
  literal.2: bits[32] = literal(value=2)
  ret invoke.3: bits[32] = invoke(literal.2, literal.2, to_apply=callee)
}
```
After InliningPass (with `kFull` depth), the `invoke.3` in `caller` would
be replaced by the body of `callee`, resulting in:

```
// Optimized IR (after Inlining and a subsequent DCE pass)
package some_package

fn caller() -> bits[32] {
  literal.2: bits[32] = literal(value=2)
  // callee's body is inlined here
  add.1: bits[32] = add(literal.2, literal.2)
  ret add.1
}
```

This effectively eliminates the function call overhead and exposes the
`add.1` to further optimizations within the caller function. For
`kLeafOnly`, `callee` would be inlined into `caller`, but if `caller` itself
were called by another function, that outer function would not have `caller`
inlined into it. If `caller` called another function `callee2` which itself
called a third function `callee3`, then `callee3` would be inlined into
`callee2`, but `callee2` would not be inlined into `caller`.
label-recovery - LabelRecovery
At the end of the pass pipeline (after inlining and the other optimizations have run), this pass attempts to recover the original names for coverpoints and assertions, to whatever degree possible, so that they are more human-readable. The names are mangled during inlining to ensure uniqueness, but the mangled names are often over-qualified.
Note: What follows was generated by the Gemini LLM. Not human verified.
The LabelRecoveryPass is an optimization pass in XLS designed to enhance
the human-readability of labels associated with cover points and assert
statements. These labels are crucial for debugging and formal verification.
During earlier optimization stages, particularly InliningPass, these labels
are deliberately mangled (e.g., by prepending caller function names and
invocation IDs). This mangling ensures their uniqueness in the flattened IR,
which is critical for generating correct Verilog. However, these mangled
names can become excessively long and difficult for users to interpret.
The primary purpose of LabelRecoveryPass is to revert these mangled labels
back to their original, more concise, user-defined forms wherever possible,
without reintroducing naming collisions that would break downstream tools.
How it Works:
1.  Collecting Original Labels: The pass iterates through all nodes in the function or proc. For each `cover` and `assert` node, it checks for the presence of an `original_label` attribute. This attribute stores the name that the user originally provided before any mangling occurred during inlining.
2.  Grouping by Original Label: It constructs a map where the keys are the `original_label` strings, and the values are lists of all `cover` or `assert` nodes that share that specific `original_label`. This grouping is essential for identifying potential naming collisions.
3.  Collision Check and Recovery: For each unique `original_label` found in the map:
    *   No Collision: If a particular `original_label` is associated with only one `cover` or `assert` node in the entire function, it means that no naming collision occurred (e.g., the original statement was inlined only once, or it was already unique). In this safe scenario, the pass restores the node's `label` attribute to its `original_label`.
    *   Collision Present: If an `original_label` is associated with multiple `cover` or `assert` nodes, it signifies that inlining has created multiple instances of what was originally a single, uniquely named statement. In this case, the pass does not attempt to recover the original label for these collided nodes. Their mangled names are retained to ensure uniqueness and prevent errors in downstream tools like Verilog generation.
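The grouping and collision check described above can be sketched in Python. This is a hypothetical model: the dict-based nodes stand in for the actual XLS node API.

```python
# Hypothetical sketch of label recovery: restore a node's original label
# only when no other cover/assert node shares that original label.
from collections import defaultdict


def recover_labels(nodes):
    """nodes: dicts with 'label' (mangled) and 'original_label' keys."""
    by_original = defaultdict(list)
    for node in nodes:
        if node.get("original_label"):
            by_original[node["original_label"]].append(node)

    for original, group in by_original.items():
        if len(group) == 1:
            # No collision: safe to restore the user's original label.
            group[0]["label"] = original
        # Collision: keep mangled labels so downstream tools stay unique.
    return nodes


nodes = [
    {"label": "caller_0_callee_my_cover", "original_label": "my_cover"},
    {"label": "caller_0_f_dup", "original_label": "dup"},
    {"label": "caller_1_f_dup", "original_label": "dup"},
]
recover_labels(nodes)
# "my_cover" is unique and gets restored; the "dup" pair stays mangled.
```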
Benefits:
- Improved Debuggability: By restoring original labels, the generated Verilog becomes significantly more understandable for hardware designers, making it easier to pinpoint issues related to `cover` points and `assert` statements during simulation and debugging.
- Enhanced Readability: Reduces the verbosity of the generated IR and Verilog code, contributing to overall better readability.
- Preserves Intent: Ensures that the designer's original intent and chosen naming convention for `cover` and `assert` labels are reflected in the final output wherever possible.
Limitations (as noted in the code):
- The current implementation only recovers labels if there are no collisions for a given original label. It does not attempt to find minimal distinguishing prefixes or other sophisticated renaming schemes for collided labels, although this is noted as a potential future improvement.
Example:
Consider a `callee` function with a cover point that is invoked only once
by a `caller` function:

```
// Original IR
package p

fn callee(p: bits[1]) -> () {
  ret cover.10: () = cover(p, label="my_cover_label")
}

top fn caller(p: bits[1]) -> () {
  ret invoke.20: () = invoke(p, to_apply=callee)
}
```
After InliningPass, the `cover.10` in `callee` might be inlined into
`caller` and its label mangled (e.g., to
`"caller_0_callee_my_cover_label"`).

The LabelRecoveryPass would then perform the following steps:

1.  It would locate the inlined `cover` node within the `caller` function.
2.  It would retrieve its `original_label`, which is `"my_cover_label"`.
3.  It would determine that no other `cover` or `assert` node in `caller` shares the `original_label` `"my_cover_label"`.
4.  Consequently, it would restore the label of the inlined `cover` node to `"my_cover_label"`.
```
// Optimized IR (after InliningPass, DCEPass, and LabelRecoveryPass)
package p

top fn caller(p: bits[1]) -> () {
  // Label recovered to original
  ret cover.new: () = cover(p, label="my_cover_label")
}
```
If `callee` were invoked multiple times, leading to multiple `cover` nodes
with the same `original_label` (e.g., in a loop), their labels would remain
mangled to ensure uniqueness, as per the pass's collision avoidance strategy.
leaf-inlining - Inlines invocations
Inlines a package toward the top function/proc.
If full, all functions are inlined into the top. If leaf, only leaf
functions are inlined into their callers. This allows other passes to
optimize on smaller graphs.
Note: What follows was generated by the Gemini LLM. Not human verified.
The InliningPass is a fundamental optimization pass in XLS that transforms
`invoke` operations (function calls) by replacing them with the actual body
of the invoked function. This is a powerful compiler optimization that
eliminates the overhead associated with function calls, exposes more
optimization opportunities for subsequent passes (such as Common
Subexpression Elimination or Dead Code Elimination), and can ultimately lead
to more monolithic and potentially faster hardware designs.
The pass supports two primary modes of inlining, controlled by the
`InlineDepth` enum:

-   `kFull` (Full Inlining): In this mode, the pass attempts to recursively inline all functions into their callers, working its way up the call graph towards the top-level function or proc. The process continues until only the top-level entity remains (or until it encounters foreign functions, which are external hardware blocks and cannot be inlined).
-   `kLeafOnly` (Leaf-Only Inlining): This mode selectively inlines only "leaf" functions, those that do not call any other functions themselves (or only call foreign functions). This can be beneficial when other optimization passes are more effective on smaller, already-inlined graphs, or when fine-grained control over inlining is desired. It also inlines functions that have a single caller and only call leaf functions, or another function which is already being inlined.
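The basic leaf test underlying `kLeafOnly` can be sketched as follows. This is a simplified model: the real pass also applies the single-caller rule described above and works on the actual call graph data structure.

```python
# Hypothetical sketch: a function is a "leaf" if it calls no other
# functions, or only calls foreign functions (which cannot be inlined).
def leaf_functions(call_graph, foreign=frozenset()):
    """call_graph: dict mapping each function name to its set of callees."""
    return {
        fn
        for fn, callees in call_graph.items()
        if all(callee in foreign for callee in callees)
    }


graph = {"caller": {"callee2"}, "callee2": {"callee3"}, "callee3": set()}
leaf_functions(graph)  # -> {"callee3"}
```

Under `kLeafOnly`, only the functions in this set (plus the single-caller cases) are inlined into their callers on a given pass invocation.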
How it Works:
1.  Call Graph Analysis: The pass initially constructs a `CallGraph` for the entire package. This graph precisely depicts the calling relationships between different functions.
2.  Inlining Order: Functions are processed for inlining in a post-order traversal of the call graph (i.e., leaf functions are processed first). This strategic order ensures that when a function `Foo` is inlined into its callers, any functions called by `Foo` have already been inlined (if eligible), thereby preventing redundant work and simplifying the overall inlining process.
3.  `InlineInvokeFunction`: This template function is responsible for the actual inlining of a single `invoke` operation:
    *   Operand Mapping: It establishes a mapping from the parameters of the invoked function to the corresponding actual arguments (operands) of the `invoke` instruction.
    *   Node Cloning and Rewiring: It traverses the body of the invoked function in topological order. For each node within the invoked function, it clones the node into the caller function. During this cloning process, any references to parameters of the invoked function are transparently replaced with their corresponding arguments from the `invoke` operation.
    *   Source Location Propagation: Source location information is meticulously merged from the original nodes and the `invoke` instruction to maintain traceability and debugging information.
    *   Name Propagation: Node names are propagated intelligently to preserve readability after inlining:
        *   If the invoked function's parameter name is a prefix of a node's name, and the corresponding `invoke` operand also has an assigned name, the inlined node's name is derived by substituting the parameter name with the operand's name.
        *   If the `invoke` instruction itself has an assigned name, the return value of the inlined function inherits that name.
        *   If neither of the above applies, and the inlined node had a name derived from a parameter, that derived name is used.
    *   Cover and Assert Label Deduplication: For `cover` and `assert` operations, if inlining would result in multiple instances with identical labels, the labels are modified to include a unique prefix (derived from the caller function and a unique `inline_count`) to ensure uniqueness in the generated Verilog.
    *   Replacement and Removal: Finally, the original `invoke` instruction is replaced by the return value of the inlined function (or a tuple if the function returns multiple values), and the `invoke` node itself is removed from the caller function as it is no longer needed.
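The operand-mapping and cloning steps can be illustrated with a toy IR model. The `inline_invoke` helper and the tuple-based node encoding are inventions for this sketch, not the XLS API.

```python
# Toy model of inlining: a function body is a list of
# (name, op, operand_names) tuples in topological order. Cloning walks the
# body, rewiring parameter references to the invoke's actual arguments.
def inline_invoke(callee_params, callee_body, invoke_args, prefix):
    env = dict(zip(callee_params, invoke_args))  # param -> actual argument
    cloned = []
    for name, op, operands in callee_body:
        new_operands = [env.get(o, o) for o in operands]  # rewire refs
        new_name = f"{prefix}_{name}"  # unique name in the caller
        env[name] = new_name  # later nodes refer to the clone
        cloned.append((new_name, op, new_operands))
    # The last node's clone stands in for the invoke's return value.
    return cloned, env[callee_body[-1][0]]


# Mirrors the example below: callee(x, y) { ret add(x, y) } invoked with
# (literal.2, literal.2).
body = [("add.1", "add", ["x", "y"])]
cloned_nodes, ret = inline_invoke(
    ["x", "y"], body, ["literal.2", "literal.2"], "inl0"
)
# cloned_nodes: [("inl0_add.1", "add", ["literal.2", "literal.2"])]
```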
By aggressively inlining functions, InliningPass creates larger, flatter
graphs that provide subsequent optimization passes with a broader scope for
analysis and transformation, ultimately leading to more optimized hardware
designs.
Example (Full Inlining):
Consider a caller function that invokes a callee function:
```
// Original IR
package some_package

fn callee(x: bits[32], y: bits[32]) -> bits[32] {
  ret add.1: bits[32] = add(x, y)
}

fn caller() -> bits[32] {
  literal.2: bits[32] = literal(value=2)
  ret invoke.3: bits[32] = invoke(literal.2, literal.2, to_apply=callee)
}
```
After InliningPass (with `kFull` depth), the `invoke.3` in `caller` would
be replaced by the body of `callee`, resulting in:

```
// Optimized IR (after Inlining and a subsequent DCE pass)
package some_package

fn caller() -> bits[32] {
  literal.2: bits[32] = literal(value=2)
  // callee's body is inlined here
  add.1: bits[32] = add(literal.2, literal.2)
  ret add.1
}
```

This effectively eliminates the function call overhead and exposes the
`add.1` to further optimizations within the caller function. For
`kLeafOnly`, `callee` would be inlined into `caller`, but if `caller` itself
were called by another function, that outer function would not have `caller`
inlined into it. If `caller` called another function `callee2` which itself
called a third function `callee3`, then `callee3` would be inlined into
`callee2`, but `callee2` would not be inlined into `caller`.
loop_unroll - Unroll counted loops
Note: What follows was generated by the Gemini LLM. Not human verified.
Pass which unrolls `counted_for` loops. Each iteration of the `counted_for`
loop is replaced with an explicit `invoke` operation.
For example, a loop like this:
```
fn body(i: bits[4], accum: bits[32], zero: bits[32]) -> bits[32] {
  zero_ext.3: bits[32] = zero_ext(i, new_bit_count=32)
  add.4: bits[32] = add(zero_ext.3, accum)
  ret add.5: bits[32] = add(add.4, zero)
}

fn unrollable() -> bits[32] {
  literal.1: bits[32] = literal(value=0)
  ret counted_for.2: bits[32] = counted_for(literal.1, trip_count=2,
                                            stride=2, body=body,
                                            invariant_args=[literal.1])
}
```
would be transformed into:
```
invoke.0: bits[32] = invoke(literal(0), literal(0), literal(0),
                            to_apply=body)
invoke.1: bits[32] = invoke(literal(2), invoke.0, literal(0), to_apply=body)
ret invoke.1
```
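Interpreting the `counted_for` semantics shown above as ordinary Python makes the unrolling explicit. This is a behavioral sketch of the loop semantics, not the pass implementation:

```python
# Behavioral model of counted_for: iteration i receives the induction value
# i * stride, the running accumulator, and the invariant args -- mirroring
# the explicit invoke chain in the unrolled IR above.
def unroll_counted_for(init, trip_count, stride, body_fn, invariant_args):
    accum = init
    for i in range(trip_count):
        accum = body_fn(i * stride, accum, *invariant_args)  # one "invoke"
    return accum


def body(i, accum, zero):
    # Mirrors the IR body: zero_ext(i) + accum, then + zero.
    return i + accum + zero


unroll_counted_for(0, trip_count=2, stride=2, body_fn=body,
                   invariant_args=(0,))
# -> body(2, body(0, 0, 0), 0) == 2
```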
lut_conversion - LUT Conversion
Pass which opportunistically converts nodes to lookup tables (selects) where we can prove it's beneficial.
Note: What follows was generated by the Gemini LLM. Not human verified.
The LutConversionPass is an optimization pass in XLS that intelligently
converts complex combinational logic into more efficient lookup tables
(LUTs). These LUTs are represented in the IR as `select` operations. This is
particularly beneficial for hardware synthesis, especially for FPGA targets
where LUTs are fundamental configurable logic blocks, capable of implementing
arbitrary Boolean functions very efficiently within a fixed-size resource.
The pass aims to improve hardware area and potentially delay by replacing
complex gate-level logic with a single, equivalent select operation.
How it Works:
1.  Identifying `Select` Operations and Selectors: The pass iterates through nodes in reverse topological order, focusing on `select` operations. For each `select`, it analyzes its `selector` input.
2.  Minimum Cut Analysis for Selector: The core of this pass involves a "minimum cut" analysis performed on the `selector` node, facilitated by `DataflowGraphAnalysis`. This analysis identifies the minimal set of ancestor nodes (`min_cut`) of the `selector` that, when their values are known, completely determine the value of the `selector`.
    *   A crucial parameter, `max_unknown_bits`, is used with `GetMinCutFor`. This sets an upper limit on the number of "unknown" bits (bits whose values are not fixed to 0 or 1) allowed in the `min_cut`. If the number of unknown bits exceeds this limit, the cut is considered too large to be efficiently represented as a LUT, and the optimization is not performed. This prevents the generation of excessively large LUTs that might consume too many hardware resources.
3.  Constructing the Lookup Table (Truth Table): If a sufficiently small `min_cut` is found, the pass then constructs a conceptual lookup table. This involves:
    *   Enumerating all `min_cut` values: It systematically iterates through all possible combinations of values that the nodes in the `min_cut` can take.
    *   Simulating the `selector`: For each combination of `min_cut` values, it uses an `IrInterpreter` to simulate the sub-graph that computes the `selector` and determines its resulting value.
    *   Mapping Selector Values to Case Results: With the `selector` values determined for each `min_cut` combination, the pass then knows precisely which case of the original `select` operation would be chosen. It fetches the corresponding output value from that case (or the default value of the original `select`). This effectively creates a mapping from each `min_cut` combination to the final output value of the original `select` operation.
4.  Replacing with a New `Select` (LUT): Finally, the original `select` operation is replaced by a new `select` operation that directly implements the lookup table:
    *   New Selector: A new selector for this LUT is constructed by concatenating the "unknown" bits of the `min_cut` nodes. These are the bits that actually differentiate the behavior of the `selector`.
    *   New Cases: The cases of the new `select` are the computed output values (from step 3) for each possible combination of the new selector bits. The order of these cases is determined by the enumeration of `min_cut` values.
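Steps 2-4 amount to building a truth table over the cut's unknown bits. A minimal Python sketch, with invented helper names standing in for the interpreter-driven simulation:

```python
# Hypothetical truth-table construction: enumerate all values of the cut's
# bits, evaluate the selector for each, and read off the chosen case (or
# the default) to form the cases of the replacement select.
def build_lut(selector_fn, cases, default, input_bits):
    table = []
    for value in range(2**input_bits):  # every combination of cut bits
        sel = selector_fn(value)  # stands in for IrInterpreter simulation
        table.append(cases[sel] if sel < len(cases) else default(value))
    return table  # replacement IR: sel(x, cases=table)


# Mirrors the worked example below: selector = 2*x on bits[3] (mod 8),
# cases are literals 0..3, and the default passes x through.
lut = build_lut(lambda x: (2 * x) % 8, [0, 1, 2, 3], lambda x: x, 3)
# lut == [0, 2, 2, 3, 0, 2, 6, 7]
```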
Benefits:
- Area Optimization: By replacing complex gate-level logic with efficient LUT structures, this pass can significantly reduce the overall hardware area, especially for FPGA-based designs.
- Performance Improvement: LUTs typically have predictable and often low delay characteristics, which can lead to improved timing performance for the converted logic.
- IR Simplification: Can simplify complex logical structures into a single `select` node, making the IR more abstract and easier to manage for subsequent optimization stages.
Example:
Consider a `select` operation whose selector is derived from a complex
arithmetic expression of `x`, for instance, `selector = add(x, x)`
(effectively `2*x`). The cases of the `select` are literals.
```
// Original IR snippet
fn simple_select(x: bits[3]) -> bits[3] {
  literal.0: bits[3] = literal(value=0)
  literal.1: bits[3] = literal(value=1)
  literal.2: bits[3] = literal(value=2)
  literal.3: bits[3] = literal(value=3)
  doubled_x: bits[3] = add(x, x)  // Selector for the final select
  selector: bits[3] = doubled_x
  ret result: bits[3] = sel(selector,
                            cases=[literal.0, literal.1, literal.2, literal.3],
                            default=x)
}
```
The LutConversionPass would identify `selector` (which is `doubled_x`) as a
target. It would perform a minimum cut on `selector` and find that `x` is the
minimal set of inputs. It would then enumerate all possible values for `x` (0
to 7 for `bits[3]`), compute the `selector` value and the `select` result
for each `x`, creating a new `select` with `x` as its selector and the
computed results as cases:
```
// Optimized IR (simplified after LutConversionPass and a DCE pass)
fn simple_select(x: bits[3]) -> bits[3] {
  // ... (literals 0-3 remain if still used elsewhere, otherwise removed by
  // DCE)
  ret result: bits[3] = sel(x, cases=[
      literal(value=0),  // x=0, doubled_x=0 -> cases[0]
      literal(value=2),  // x=1, doubled_x=2 -> cases[2]
      x,                 // x=2, doubled_x=4 -> default
      x,                 // x=3, doubled_x=6 -> default
      literal(value=0),  // x=4, doubled_x=0 -> cases[0]
      literal(value=2),  // x=5, doubled_x=2 -> cases[2]
      x,                 // x=6, doubled_x=4 -> default
      x                  // x=7, doubled_x=6 -> default
  ])  // Cases derived from evaluating the original select for each possible x
}
```
The resulting `select` directly maps `x` to the output, effectively
converting the `add` and original `select` into a single, more efficient LUT
implementation.
map_inlining - Inline map operations
A pass to convert `map` nodes to inline `invoke` nodes. We don't directly lower maps to Verilog.
Note: What follows was generated by the Gemini LLM. Not human verified.
The MapInliningPass is an optimization pass that transforms `map`
operations in the XLS IR into a series of explicit `invoke` operations. In
the XLS compilation flow, `map` operations represent a high-level construct
that applies a function to each element of an array. Since target hardware
description languages like Verilog do not directly support map operations,
this pass is crucial for lowering this abstraction to a more
hardware-friendly representation.
The pass operates as follows:
1.  Identification of `map` nodes: It traverses the IR to identify all `map` instructions.
2.  Transformation: For each `map` instruction, it replaces the `map` node with an `array` operation. This `array` operation is composed of multiple `invoke` operations, where each `invoke` corresponds to one application of the mapped function to an element of the input array.
    *   For an input array of size `N`, the `map` operation `map(input_array, to_apply=map_fn)` is replaced by an `array` of `N` elements.
    *   Each element of this newly constructed `array` is an `invoke` operation.
    *   Each `invoke` operation is explicitly passed a single element from the `input_array` (accessed via `array_index` using a literal index) and applies the `map_fn` to it.
This effectively "unrolls" the `map` operation, making each application of
the `map_fn` to an array element an explicit operation in the IR. This is
analogous to how a loop unrolling pass might transform a `counted_for` loop.
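The rewrite is behaviorally equivalent to this Python sketch. It is a toy model: real XLS arrays and invokes are IR nodes, not Python lists and calls.

```python
# Toy model of map inlining: one explicit call per array element replaces
# the high-level map, i.e.
#   map(a, to_apply=f)  =>  array(f(a[0]), f(a[1]), ..., f(a[N-1]))
def inline_map(input_array, map_fn):
    return [map_fn(input_array[i]) for i in range(len(input_array))]


def low16(x):
    # Mirrors the example body: bit_slice(x, start=0, width=16).
    return x & 0xFFFF


inline_map([0x12345678, 0xDEADBEEF], low16)  # -> [0x5678, 0xBEEF]
```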
Example:
Consider a `map` operation that applies `map_fn` to an array `a`:

```
fn map_fn(x: bits[32]) -> bits[16] {
  ret bit_slice.1: bits[16] = bit_slice(x, start=0, width=16)
}

fn main(a: bits[32][4]) -> bits[16][4] {
  ret result: bits[16][4] = map(a, to_apply=map_fn)
}
```
After MapInliningPass, the `map` operation would be replaced by an `array`
of four `invoke` operations, each applying `map_fn` to a specific element of
`a`:
```
fn map_fn(x: bits[32]) -> bits[16] {
  ret bit_slice.1: bits[16] = bit_slice(x, start=0, width=16)
}

fn main(a: bits[32][4]) -> bits[16][4] {
  array_index.0: bits[32] = array_index(a, indices=[literal(0)])
  invoke.0: bits[16] = invoke(array_index.0, to_apply=map_fn)
  array_index.1: bits[32] = array_index(a, indices=[literal(1)])
  invoke.1: bits[16] = invoke(array_index.1, to_apply=map_fn)
  array_index.2: bits[32] = array_index(a, indices=[literal(2)])
  invoke.2: bits[16] = invoke(array_index.2, to_apply=map_fn)
  array_index.3: bits[32] = array_index(a, indices=[literal(3)])
  invoke.3: bits[16] = invoke(array_index.3, to_apply=map_fn)
  ret array.4: bits[16][4] = array(invoke.0, invoke.1, invoke.2, invoke.3)
}
```
This transformation is crucial for enabling the subsequent stages of hardware synthesis.
narrow - Narrowing
A pass which reduces the width of operations, eliminating redundant or unused bits.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NarrowingPass is a powerful optimization pass in XLS designed to reduce
the bit width of bits-typed operations and values within a function or
proc. This is a critical optimization for hardware generation, as a
reduction in bit widths directly translates to smaller, more efficient, and
often faster hardware (fewer wires, smaller functional units, and smaller
registers). The pass employs various advanced analysis techniques to
determine the minimum required bit width for each value without altering its
functional correctness.
The pass is configured with an `AnalysisType` enum, which dictates the
sophistication of the analysis used:

-   `kTernary`: Utilizes a simpler ternary logic-based query engine. This analysis determines, for each bit, whether it is always zero, always one, or unknown.
-   `kRange`: Employs a more powerful range analysis to determine the possible numerical bounds (minimum and maximum values) that a bit-vector can take. This provides more precise information for narrowing.
-   `kRangeWithContext`: Extends range analysis to be context-sensitive. This means it considers predicates (conditions) of `select` operations and other conditional control flow to derive more precise ranges for values within specific execution paths.
-   `kRangeWithOptionalContext`: Allows the choice between `kRange` and `kRangeWithContext` based on the `options.use_context_narrowing_analysis` flag, providing flexibility in the trade-off between analysis time and precision.
The pass operates by iteratively traversing the IR and applying transformations to nodes where bit widths can be safely reduced. Key optimizations include:
1.  Replacing Precise Values with Literals: If an analysis (ternary or range) can definitively determine that a node's value is precisely known (i.e., it always evaluates to a constant), that node is replaced with a `literal` node. This simplifies the graph and can enable further optimizations.
2.  Narrowing Comparison Operations:
    *   Stripping Matched Leading/Trailing Bits: For comparison operations (`eq`, `ne`, `ult`, `ugt`, etc.), if leading or trailing bits of both operands are known to be identical, these matching bits can be logically removed, and the comparison is performed on the narrower remaining bits.

        ```
        // Original: UGt(0b0110_0XXX_0011, 0b0110_0YYY_0011)  // bit strings
        // Leading 0110 and trailing 0011 match for both operands.
        // Optimized: UGt(0bXXX, 0bYYY)
        ```

    *   Signed Comparison Simplification: Signed comparisons with zero-extended operands (or operands whose MSBs are known to be equal) can often be converted to unsigned comparisons on narrower bit widths, as the sign information becomes redundant.
3.  Narrowing `Negate` Operations:
    *   Trailing Zeros: If an operand to a `negate` operation has known trailing zeros, the `negate` can be performed on the more significant bits, and the trailing zeros re-concatenated, effectively reducing the width of the `negate` operation.
    *   Sign Extension Hoisting: For `negate(sign_ext(x))`, if the range analysis shows that `x` cannot be the minimum representable signed value, this can be simplified to `sign_ext(negate(x))`, reducing the width of the `negate` operation. More generally, it finds the minimal bit width for the inner `negate` and then re-extends.
4.  Narrowing Shift Operations (`shll`, `shra`, `shrl`):
    *   Zero Shift Amount: If a shift amount is provably zero, the shift operation is replaced directly by its value operand (a no-op).
    *   Known Leading/Trailing Zeros/Sign Bits: For shift operations, if parts of the `shift_value` are known to be zero or sign bits, the `shift_value` can be sliced to remove these redundant bits before the shift, and the result re-extended (zero-extend or sign-extend) if necessary. The `shift_amount` itself can also be narrowed if it has leading zeros.
    *   Out-of-Bounds Shifts: Shift operations that are guaranteed to shift all bits off the end of the operand (resulting in a zero value) are replaced by a literal zero.
5.  Narrowing `Add` and `Sub` Operations:
    *   Common Leading Zeros/Sign Bits: For `add` and `sub` operations, if both operands share a common number of leading zeros (for unsigned results) or sign bits (for signed results), the operation can be performed on a narrower bit width. The result is then zero-extended or sign-extended back to the original width. This includes careful handling of potential overflows for signed additions.
6.  Narrowing `Multiply` Operations (`umul`, `smul`, `umulp`, `smulp`):
    *   Result Wider Than Sum of Operands: If the declared result width of a multiply operation is greater than the sum of the bit widths of its operands, the multiplication can be performed at the sum-of-operand widths, and the result then extended.
    *   Operands Wider Than Result: If the operands are wider than the required result width, they can be narrowed (sliced) before the multiplication, reducing the complexity of the multiplier.
    *   Extended Operands (Zero-extend/Sign-extend): When operands are the results of `zero_ext` or `sign_ext` operations, they can often be narrowed back to their original non-extended width, with the multiplication type adjusted (e.g., `umul` or `smul`) if necessary to maintain correctness.
    *   Trailing Zeros: Multiplies where one or both operands have known trailing zeros can be optimized by performing the multiplication on the non-zero bits and concatenating the result with the appropriate number of trailing zeros.
    *   Partial Product Optimization: For `umulp`/`smulp` (partial product multiplies), if the two partial products are immediately summed (e.g., `add(tuple_index(mulp_op, 0), tuple_index(mulp_op, 1))`), the overall operation can be narrowed, and the final sum then extended.
7.  Narrowing `ArrayIndex` Operations:
    *   Literal Index Values: If an index to an `array_index` is a known literal, its bit width can be narrowed to the minimum required to represent the actual array size, saving bits in the index computation.
    *   Known Zero Index: If an index is provably zero, it's replaced with a literal zero of appropriate width.
    *   Known Out-of-Bounds Index: Similar to BitSliceSimplificationPass, if an index is provably out of bounds, it's clamped to the maximum valid index.
    *   Convert `ArrayIndex` to `Select`: For `array_index` operations with a small, discrete set of possible index values (based on range analysis), the operation can be converted to a `select` chain, making explicit checks for each possible index value. This can be beneficial for small arrays by removing complex indexing logic.
8.  Narrowing `Decode` Operations:
    *   Leading Zeros in Index: If the index of a `decode` operation has leading zeros, the index can be sliced to remove them, and the `decode` performed on the narrower index, with the result zero-extended if needed.
    *   Zero Index: A `decode` with a provably zero index is replaced by a literal one, as `decode(0)` always results in a one-hot encoding with the least significant bit set.
The NarrowingPass is typically run repeatedly within an optimization
pipeline until a fixed point is reached. This iterative process is crucial
for achieving maximal bit width reduction across the entire IR, as narrowing
one node can often enable further narrowing of its consumers or producers.
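As an illustration of the add/sub narrowing rule, the width arithmetic can be sketched with a hypothetical helper, assuming the one-extra-carry-bit handling described above:

```python
# Hypothetical width helper: with k common known leading zeros on both
# operands of an n-bit add, each value fits in n-k bits, so the sum fits in
# n-k+1 bits; the narrow sum is then zero-extended back to n bits.
def narrowed_add_width(n, leading_zeros_a, leading_zeros_b):
    common = min(leading_zeros_a, leading_zeros_b)
    return min(max(n - common + 1, 1), n)  # never wider than the original


# Two 32-bit operands each with >= 24 known leading zeros: a 9-bit adder
# suffices, and the result is zero-extended back to 32 bits.
narrowed_add_width(32, 24, 26)  # -> 9
```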
narrow(Context) - Narrowing
A pass which reduces the width of operations, eliminating redundant or unused bits.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NarrowingPass is a powerful optimization pass in XLS designed to reduce
the bit width of bits-typed operations and values within a function or
proc. This is a critical optimization for hardware generation, as a
reduction in bit widths directly translates to smaller, more efficient, and
often faster hardware (fewer wires, smaller functional units, and smaller
registers). The pass employs various advanced analysis techniques to
determine the minimum required bit width for each value without altering its
functional correctness.
The pass is configured with an `AnalysisType` enum, which dictates the
sophistication of the analysis used:

-   `kTernary`: Utilizes a simpler ternary logic-based query engine. This analysis determines, for each bit, whether it is always zero, always one, or unknown.
-   `kRange`: Employs a more powerful range analysis to determine the possible numerical bounds (minimum and maximum values) that a bit-vector can take. This provides more precise information for narrowing.
-   `kRangeWithContext`: Extends range analysis to be context-sensitive. This means it considers predicates (conditions) of `select` operations and other conditional control flow to derive more precise ranges for values within specific execution paths.
-   `kRangeWithOptionalContext`: Allows the choice between `kRange` and `kRangeWithContext` based on the `options.use_context_narrowing_analysis` flag, providing flexibility in the trade-off between analysis time and precision.
The pass operates by iteratively traversing the IR and applying transformations to nodes where bit widths can be safely reduced. Key optimizations include:
1.  Replacing Precise Values with Literals: If an analysis (ternary or range) can definitively determine that a node's value is precisely known (i.e., it always evaluates to a constant), that node is replaced with a `literal` node. This simplifies the graph and can enable further optimizations.
2.  Narrowing Comparison Operations:
    *   Stripping Matched Leading/Trailing Bits: For comparison operations (`eq`, `ne`, `ult`, `ugt`, etc.), if leading or trailing bits of both operands are known to be identical, these matching bits can be logically removed, and the comparison is performed on the narrower remaining bits.

        ```
        // Original: UGt(0b0110_0XXX_0011, 0b0110_0YYY_0011)  // bit strings
        // Leading 0110 and trailing 0011 match for both operands.
        // Optimized: UGt(0bXXX, 0bYYY)
        ```

    *   Signed Comparison Simplification: Signed comparisons with zero-extended operands (or operands whose MSBs are known to be equal) can often be converted to unsigned comparisons on narrower bit widths, as the sign information becomes redundant.
3.  Narrowing `Negate` Operations:
    *   Trailing Zeros: If an operand to a `negate` operation has known trailing zeros, the `negate` can be performed on the more significant bits, and the trailing zeros re-concatenated, effectively reducing the width of the `negate` operation.
    *   Sign Extension Hoisting: For `negate(sign_ext(x))`, if the range analysis shows that `x` cannot be the minimum representable signed value, this can be simplified to `sign_ext(negate(x))`, reducing the width of the `negate` operation. More generally, it finds the minimal bit width for the inner `negate` and then re-extends.
4.  Narrowing Shift Operations (`shll`, `shra`, `shrl`):
    *   Zero Shift Amount: If a shift amount is provably zero, the shift operation is replaced directly by its value operand (a no-op).
    *   Known Leading/Trailing Zeros/Sign Bits: For shift operations, if parts of the `shift_value` are known to be zero or sign bits, the `shift_value` can be sliced to remove these redundant bits before the shift, and the result re-extended (zero-extend or sign-extend) if necessary. The `shift_amount` itself can also be narrowed if it has leading zeros.
    *   Out-of-Bounds Shifts: Shift operations that are guaranteed to shift all bits off the end of the operand (resulting in a zero value) are replaced by a literal zero.
5.  Narrowing `Add` and `Sub` Operations:
    *   Common Leading Zeros/Sign Bits: For `add` and `sub` operations, if both operands share a common number of leading zeros (for unsigned results) or sign bits (for signed results), the operation can be performed on a narrower bit width. The result is then zero-extended or sign-extended back to the original width. This includes careful handling of potential overflows for signed additions.
6.  Narrowing `Multiply` Operations (`umul`, `smul`, `umulp`, `smulp`):
    *   Result Wider Than Sum of Operands: If the declared result width of a multiply operation is greater than the sum of the bit widths of its operands, the multiplication can be performed at the sum-of-operand widths, and the result then extended.
    *   Operands Wider Than Result: If the operands are wider than the required result width, they can be narrowed (sliced) before the multiplication, reducing the complexity of the multiplier.
    *   Extended Operands (Zero-extend/Sign-extend): When operands are the results of `zero_ext` or `sign_ext` operations, they can often be narrowed back to their original non-extended width, with the multiplication type adjusted (e.g., `umul` or `smul`) if necessary to maintain correctness.
    *   Trailing Zeros: Multiplies where one or both operands have known trailing zeros can be optimized by performing the multiplication on the non-zero bits and concatenating the result with the appropriate number of trailing zeros.
    *   Partial Product Optimization: For `umulp`/`smulp` (partial product multiplies), if the two partial products are immediately summed (e.g., `add(tuple_index(mulp_op, 0), tuple_index(mulp_op, 1))`), the overall operation can be narrowed, and the final sum then extended.
7.  Narrowing `ArrayIndex` Operations:
    *   Literal Index Values: If an index to an `array_index` is a known literal, its bit width can be narrowed to the minimum required to represent the actual array size, saving bits in the index computation.
    *   Known Zero Index: If an index is provably zero, it's replaced with a literal zero of appropriate width.
* Known Out-of-Bounds Index: Similar to
BitSliceSimplificationPass, if an index is provably out of bounds, it's clamped to the maximum valid index.* Convert
ArrayIndextoSelect: Forarray_indexoperations with a small, discrete set of possible index values (based on range analysis), the operation can be converted to aselectchain, making explicit checks for each possible index value. This can be beneficial for small arrays by removing complex indexing logic. -
Narrowing
DecodeOperations: * Leading Zeros in Index: If the index of adecodeoperation has leading zeros, the index can be sliced to remove them, and thedecodeperformed on the narrower index, with the result zero-extended if needed.* Zero Index: A
decodewith a provably zero index is replaced by a literal one, asdecode(0)always results in a one-hot encoding with the least significant bit set.
The NarrowingPass is typically run repeatedly within an optimization
pipeline until a fixed point is reached. This iterative process is crucial
for achieving maximal bit width reduction across the entire IR, as narrowing
one node can often enable further narrowing of its consumers or producers.
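As an illustration of the bit-stripping rule described above, the following Python model (a sketch, not XLS code; `strip_matched_bits` and `ugt` are hypothetical helpers invented for this example) checks that removing matched leading/trailing bits preserves an unsigned greater-than comparison:

```python
def strip_matched_bits(a: str, b: str) -> tuple[str, str]:
    """Drop leading and trailing bit positions where the equal-length,
    MSB-first bit strings a and b agree."""
    lead = 0
    while lead < len(a) and a[lead] == b[lead]:
        lead += 1
    trail = 0
    while trail < len(a) - lead and a[len(a) - 1 - trail] == b[len(b) - 1 - trail]:
        trail += 1
    return a[lead:len(a) - trail], b[lead:len(b) - trail]

def ugt(a: str, b: str) -> bool:
    """Unsigned > on bit strings; empty strings mean the operands were equal."""
    if not a:
        return False
    return int(a, 2) > int(b, 2)

# The narrowed comparison agrees with the full-width one for all 6-bit pairs.
for x in range(64):
    for y in range(64):
        sa, sb = strip_matched_bits(f"{x:06b}", f"{y:06b}")
        assert ugt(sa, sb) == (x > y)
```

The exhaustive loop mirrors why the transformation is safe: an unsigned comparison is decided at the first differing bit, which always survives the stripping.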
narrow(OptionalContext) - Narrowing
A pass which reduces the width of operations, eliminating redundant or unused bits.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NarrowingPass is a powerful optimization pass in XLS designed to reduce
the bit width of bits-typed operations and values within a function or
proc. This is a critical optimization for hardware generation, as a
reduction in bit widths directly translates to smaller, more efficient, and
often faster hardware (fewer wires, smaller functional units, and smaller
registers). The pass employs various advanced analysis techniques to
determine the minimum required bit width for each value without altering its
functional correctness.
The pass is configured with an AnalysisType enum, which dictates the
sophistication of the analysis used:
- `kTernary`: Utilizes a simpler ternary logic-based query engine, which determines for each bit whether it is always zero, always one, or unknown.
- `kRange`: Employs a more powerful range analysis to determine the possible numerical bounds (minimum and maximum values) that a bit vector can take. This provides more precise information for narrowing.
- `kRangeWithContext`: Extends range analysis to be context-sensitive: it considers predicates (conditions) of `select` operations and other conditional control flow to derive more precise ranges for values within specific execution paths.
- `kRangeWithOptionalContext`: Chooses between `kRange` and `kRangeWithContext` based on the `options.use_context_narrowing_analysis` flag, providing flexibility in the trade-off between analysis time and precision.
The pass operates by iteratively traversing the IR and applying transformations to nodes where bit widths can be safely reduced. Key optimizations include:
- Replacing Precise Values with Literals: If an analysis (ternary or range) can definitively determine that a node's value is precisely known (i.e., it always evaluates to a constant), that node is replaced with a `literal` node. This simplifies the graph and can enable further optimizations.
- Narrowing Comparison Operations:
    * Stripping Matched Leading/Trailing Bits: For comparison operations (`eq`, `ne`, `ult`, `ugt`, etc.), if leading or trailing bits of both operands are known to be identical, these matching bits can be logically removed, and the comparison is performed on the narrower remaining bits.

      ```
      // Original: UGt(0b0110_0XXX_0011, 0b0110_0YYY_0011)  // bit strings
      // Leading `0110` and trailing `0011` match for both operands.
      // Optimized: UGt(0bXXX, 0bYYY)
      ```

    * Signed Comparison Simplification: Signed comparisons with zero-extended operands (or operands whose MSBs are known to be equal) can often be converted to unsigned comparisons on narrower bit widths, as the sign information becomes redundant.
- Narrowing `Negate` Operations:
    * Trailing Zeros: If an operand to a `negate` operation has known trailing zeros, the `negate` can be performed on the more significant bits and the trailing zeros re-concatenated, effectively reducing the width of the `negate` operation.
    * Sign Extension Hoisting: For `negate(sign_ext(x))`, if range analysis shows that `x` cannot be the minimum representable signed value, this can be simplified to `sign_ext(negate(x))`, reducing the width of the `negate` operation. More generally, the pass finds the minimal bit width for the inner `negate` and then re-extends.
- Narrowing Shift Operations (`shll`, `shra`, `shrl`):
    * Zero Shift Amount: If a shift amount is provably zero, the shift operation is replaced directly by its value operand (a no-op).
    * Known Leading/Trailing Zeros/Sign Bits: If parts of the shifted value are known to be zero or sign bits, the value can be sliced to remove these redundant bits before the shift, and the result re-extended (zero-extend or sign-extend) if necessary. The shift amount itself can also be narrowed if it has leading zeros.
    * Out-of-Bounds Shifts: Shift operations that are guaranteed to shift all bits off the end of the operand (resulting in a zero value) are replaced by a literal zero.
- Narrowing `Add` and `Sub` Operations:
    * Common Leading Zeros/Sign Bits: For `add` and `sub` operations, if both operands share a common number of leading zeros (for unsigned results) or sign bits (for signed results), the operation can be performed at a narrower bit width. The result is then zero-extended or sign-extended back to the original width. This includes careful handling of potential overflows for signed additions.
- Narrowing Multiply Operations (`umul`, `smul`, `umulp`, `smulp`):
    * Result Wider Than Sum of Operands: If the declared result width of a multiply is greater than the sum of the bit widths of its operands, the multiplication can be performed at the sum-of-operand-widths width and the result then extended.
    * Operands Wider Than Result: If the operands are wider than the required result width, they can be narrowed (sliced) before the multiplication, reducing the complexity of the multiplier.
    * Extended Operands (Zero-extend/Sign-extend): When operands are the results of `zero_ext` or `sign_ext` operations, they can often be narrowed back to their original non-extended width, with the multiplication type adjusted (e.g., `umul` or `smul`) if necessary to maintain correctness.
    * Trailing Zeros: Multiplies where one or both operands have known trailing zeros can be optimized by performing the multiplication on the non-zero bits and concatenating the result with the appropriate number of trailing zeros.
    * Partial Product Optimization: For `umulp`/`smulp` (partial product multiplies), if the two partial products are immediately summed (e.g., `add(tuple_index(mulp_op, 0), tuple_index(mulp_op, 1))`), the overall operation can be narrowed and the final sum then extended.
- Narrowing `ArrayIndex` Operations:
    * Literal Index Values: If an index to an `array_index` is a known literal, its bit width can be narrowed to the minimum required to represent the actual array size, saving bits in the index computation.
    * Known Zero Index: If an index is provably zero, it's replaced with a literal zero of the appropriate width.
    * Known Out-of-Bounds Index: Similar to `BitSliceSimplificationPass`, if an index is provably out of bounds, it's clamped to the maximum valid index.
    * Convert `ArrayIndex` to `Select`: For `array_index` operations with a small, discrete set of possible index values (based on range analysis), the operation can be converted to a `select` chain that makes explicit checks for each possible index value. This can be beneficial for small arrays by removing complex indexing logic.
- Narrowing `Decode` Operations:
    * Leading Zeros in Index: If the index of a `decode` operation has leading zeros, the index can be sliced to remove them and the `decode` performed on the narrower index, with the result zero-extended if needed.
    * Zero Index: A `decode` with a provably zero index is replaced by a literal one, as `decode(0)` always results in a one-hot encoding with the least significant bit set.
The NarrowingPass is typically run repeatedly within an optimization
pipeline until a fixed point is reached. This iterative process is crucial
for achieving maximal bit width reduction across the entire IR, as narrowing
one node can often enable further narrowing of its consumers or producers.
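The add-narrowing rule above can be sanity-checked with a small Python model (a sketch with invented names, not the pass's implementation): if both operands of an N-bit add are known to have at least k leading zeros, performing the add at width N - k + 1 and zero-extending matches the full-width result.

```python
def narrowed_add(x: int, y: int, n: int, k: int) -> int:
    """Model of unsigned add narrowing: both n-bit operands have at least
    k >= 1 known leading zero bits, so the sum fits in n - k + 1 bits and
    the add can be performed at that narrower width."""
    assert k >= 1 and x < (1 << (n - k)) and y < (1 << (n - k))
    w = n - k + 1                      # narrow width: one extra bit for the carry
    narrow = (x + y) & ((1 << w) - 1)  # add performed at the narrow width
    return narrow                      # zero-extension back to n bits is a no-op

# Agrees with the full-width n-bit add for every pair of narrow operands.
n, k = 8, 3
for x in range(1 << (n - k)):
    for y in range(1 << (n - k)):
        assert narrowed_add(x, y, n, k) == (x + y) % (1 << n)
```

The extra carry bit (`w = n - k + 1` rather than `n - k`) is the "careful handling of potential overflows" the description alludes to.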
narrow(Range) - Narrowing
A pass which reduces the width of operations, eliminating redundant or unused bits.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NarrowingPass is a powerful optimization pass in XLS designed to reduce
the bit width of bits-typed operations and values within a function or
proc. This is a critical optimization for hardware generation, as a
reduction in bit widths directly translates to smaller, more efficient, and
often faster hardware (fewer wires, smaller functional units, and smaller
registers). The pass employs various advanced analysis techniques to
determine the minimum required bit width for each value without altering its
functional correctness.
The pass is configured with an AnalysisType enum, which dictates the
sophistication of the analysis used:
- `kTernary`: Utilizes a simpler ternary logic-based query engine, which determines for each bit whether it is always zero, always one, or unknown.
- `kRange`: Employs a more powerful range analysis to determine the possible numerical bounds (minimum and maximum values) that a bit vector can take. This provides more precise information for narrowing.
- `kRangeWithContext`: Extends range analysis to be context-sensitive: it considers predicates (conditions) of `select` operations and other conditional control flow to derive more precise ranges for values within specific execution paths.
- `kRangeWithOptionalContext`: Chooses between `kRange` and `kRangeWithContext` based on the `options.use_context_narrowing_analysis` flag, providing flexibility in the trade-off between analysis time and precision.
The pass operates by iteratively traversing the IR and applying transformations to nodes where bit widths can be safely reduced. Key optimizations include:
- Replacing Precise Values with Literals: If an analysis (ternary or range) can definitively determine that a node's value is precisely known (i.e., it always evaluates to a constant), that node is replaced with a `literal` node. This simplifies the graph and can enable further optimizations.
- Narrowing Comparison Operations:
    * Stripping Matched Leading/Trailing Bits: For comparison operations (`eq`, `ne`, `ult`, `ugt`, etc.), if leading or trailing bits of both operands are known to be identical, these matching bits can be logically removed, and the comparison is performed on the narrower remaining bits.

      ```
      // Original: UGt(0b0110_0XXX_0011, 0b0110_0YYY_0011)  // bit strings
      // Leading `0110` and trailing `0011` match for both operands.
      // Optimized: UGt(0bXXX, 0bYYY)
      ```

    * Signed Comparison Simplification: Signed comparisons with zero-extended operands (or operands whose MSBs are known to be equal) can often be converted to unsigned comparisons on narrower bit widths, as the sign information becomes redundant.
- Narrowing `Negate` Operations:
    * Trailing Zeros: If an operand to a `negate` operation has known trailing zeros, the `negate` can be performed on the more significant bits and the trailing zeros re-concatenated, effectively reducing the width of the `negate` operation.
    * Sign Extension Hoisting: For `negate(sign_ext(x))`, if range analysis shows that `x` cannot be the minimum representable signed value, this can be simplified to `sign_ext(negate(x))`, reducing the width of the `negate` operation. More generally, the pass finds the minimal bit width for the inner `negate` and then re-extends.
- Narrowing Shift Operations (`shll`, `shra`, `shrl`):
    * Zero Shift Amount: If a shift amount is provably zero, the shift operation is replaced directly by its value operand (a no-op).
    * Known Leading/Trailing Zeros/Sign Bits: If parts of the shifted value are known to be zero or sign bits, the value can be sliced to remove these redundant bits before the shift, and the result re-extended (zero-extend or sign-extend) if necessary. The shift amount itself can also be narrowed if it has leading zeros.
    * Out-of-Bounds Shifts: Shift operations that are guaranteed to shift all bits off the end of the operand (resulting in a zero value) are replaced by a literal zero.
- Narrowing `Add` and `Sub` Operations:
    * Common Leading Zeros/Sign Bits: For `add` and `sub` operations, if both operands share a common number of leading zeros (for unsigned results) or sign bits (for signed results), the operation can be performed at a narrower bit width. The result is then zero-extended or sign-extended back to the original width. This includes careful handling of potential overflows for signed additions.
- Narrowing Multiply Operations (`umul`, `smul`, `umulp`, `smulp`):
    * Result Wider Than Sum of Operands: If the declared result width of a multiply is greater than the sum of the bit widths of its operands, the multiplication can be performed at the sum-of-operand-widths width and the result then extended.
    * Operands Wider Than Result: If the operands are wider than the required result width, they can be narrowed (sliced) before the multiplication, reducing the complexity of the multiplier.
    * Extended Operands (Zero-extend/Sign-extend): When operands are the results of `zero_ext` or `sign_ext` operations, they can often be narrowed back to their original non-extended width, with the multiplication type adjusted (e.g., `umul` or `smul`) if necessary to maintain correctness.
    * Trailing Zeros: Multiplies where one or both operands have known trailing zeros can be optimized by performing the multiplication on the non-zero bits and concatenating the result with the appropriate number of trailing zeros.
    * Partial Product Optimization: For `umulp`/`smulp` (partial product multiplies), if the two partial products are immediately summed (e.g., `add(tuple_index(mulp_op, 0), tuple_index(mulp_op, 1))`), the overall operation can be narrowed and the final sum then extended.
- Narrowing `ArrayIndex` Operations:
    * Literal Index Values: If an index to an `array_index` is a known literal, its bit width can be narrowed to the minimum required to represent the actual array size, saving bits in the index computation.
    * Known Zero Index: If an index is provably zero, it's replaced with a literal zero of the appropriate width.
    * Known Out-of-Bounds Index: Similar to `BitSliceSimplificationPass`, if an index is provably out of bounds, it's clamped to the maximum valid index.
    * Convert `ArrayIndex` to `Select`: For `array_index` operations with a small, discrete set of possible index values (based on range analysis), the operation can be converted to a `select` chain that makes explicit checks for each possible index value. This can be beneficial for small arrays by removing complex indexing logic.
- Narrowing `Decode` Operations:
    * Leading Zeros in Index: If the index of a `decode` operation has leading zeros, the index can be sliced to remove them and the `decode` performed on the narrower index, with the result zero-extended if needed.
    * Zero Index: A `decode` with a provably zero index is replaced by a literal one, as `decode(0)` always results in a one-hot encoding with the least significant bit set.
The NarrowingPass is typically run repeatedly within an optimization
pipeline until a fixed point is reached. This iterative process is crucial
for achieving maximal bit width reduction across the entire IR, as narrowing
one node can often enable further narrowing of its consumers or producers.
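The trailing-zeros multiply rule can be modeled in a few lines of Python (a behavioral sketch with an invented helper name, not the pass's implementation): factor the trailing zeros out of one operand, multiply the remaining bits, and concatenate the zeros back on.

```python
def mul_trailing_zeros(x: int, y: int, n: int) -> int:
    """Model of an n-bit umul where x has known trailing zeros: multiply
    the non-zero upper bits of x by y, then shift the zeros back in."""
    mask = (1 << n) - 1
    if x == 0:
        return 0
    t = (x & -x).bit_length() - 1  # count of trailing zero bits in x
    narrow = (x >> t) * y          # multiply on the non-zero part only
    return (narrow << t) & mask    # re-concatenate t trailing zeros, mod 2^n

# Matches the plain n-bit product for all 6-bit operand pairs.
for x in range(64):
    for y in range(64):
        assert mul_trailing_zeros(x, y, 6) == (x * y) % 64
```

In hardware terms, the narrowed multiply uses a smaller multiplier array; the shift is free wiring.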
narrow(Ternary) - Narrowing
A pass which reduces the width of operations, eliminating redundant or unused bits.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NarrowingPass is a powerful optimization pass in XLS designed to reduce
the bit width of bits-typed operations and values within a function or
proc. This is a critical optimization for hardware generation, as a
reduction in bit widths directly translates to smaller, more efficient, and
often faster hardware (fewer wires, smaller functional units, and smaller
registers). The pass employs various advanced analysis techniques to
determine the minimum required bit width for each value without altering its
functional correctness.
The pass is configured with an AnalysisType enum, which dictates the
sophistication of the analysis used:
- `kTernary`: Utilizes a simpler ternary logic-based query engine, which determines for each bit whether it is always zero, always one, or unknown.
- `kRange`: Employs a more powerful range analysis to determine the possible numerical bounds (minimum and maximum values) that a bit vector can take. This provides more precise information for narrowing.
- `kRangeWithContext`: Extends range analysis to be context-sensitive: it considers predicates (conditions) of `select` operations and other conditional control flow to derive more precise ranges for values within specific execution paths.
- `kRangeWithOptionalContext`: Chooses between `kRange` and `kRangeWithContext` based on the `options.use_context_narrowing_analysis` flag, providing flexibility in the trade-off between analysis time and precision.
The pass operates by iteratively traversing the IR and applying transformations to nodes where bit widths can be safely reduced. Key optimizations include:
- Replacing Precise Values with Literals: If an analysis (ternary or range) can definitively determine that a node's value is precisely known (i.e., it always evaluates to a constant), that node is replaced with a `literal` node. This simplifies the graph and can enable further optimizations.
- Narrowing Comparison Operations:
    * Stripping Matched Leading/Trailing Bits: For comparison operations (`eq`, `ne`, `ult`, `ugt`, etc.), if leading or trailing bits of both operands are known to be identical, these matching bits can be logically removed, and the comparison is performed on the narrower remaining bits.

      ```
      // Original: UGt(0b0110_0XXX_0011, 0b0110_0YYY_0011)  // bit strings
      // Leading `0110` and trailing `0011` match for both operands.
      // Optimized: UGt(0bXXX, 0bYYY)
      ```

    * Signed Comparison Simplification: Signed comparisons with zero-extended operands (or operands whose MSBs are known to be equal) can often be converted to unsigned comparisons on narrower bit widths, as the sign information becomes redundant.
- Narrowing `Negate` Operations:
    * Trailing Zeros: If an operand to a `negate` operation has known trailing zeros, the `negate` can be performed on the more significant bits and the trailing zeros re-concatenated, effectively reducing the width of the `negate` operation.
    * Sign Extension Hoisting: For `negate(sign_ext(x))`, if range analysis shows that `x` cannot be the minimum representable signed value, this can be simplified to `sign_ext(negate(x))`, reducing the width of the `negate` operation. More generally, the pass finds the minimal bit width for the inner `negate` and then re-extends.
- Narrowing Shift Operations (`shll`, `shra`, `shrl`):
    * Zero Shift Amount: If a shift amount is provably zero, the shift operation is replaced directly by its value operand (a no-op).
    * Known Leading/Trailing Zeros/Sign Bits: If parts of the shifted value are known to be zero or sign bits, the value can be sliced to remove these redundant bits before the shift, and the result re-extended (zero-extend or sign-extend) if necessary. The shift amount itself can also be narrowed if it has leading zeros.
    * Out-of-Bounds Shifts: Shift operations that are guaranteed to shift all bits off the end of the operand (resulting in a zero value) are replaced by a literal zero.
- Narrowing `Add` and `Sub` Operations:
    * Common Leading Zeros/Sign Bits: For `add` and `sub` operations, if both operands share a common number of leading zeros (for unsigned results) or sign bits (for signed results), the operation can be performed at a narrower bit width. The result is then zero-extended or sign-extended back to the original width. This includes careful handling of potential overflows for signed additions.
- Narrowing Multiply Operations (`umul`, `smul`, `umulp`, `smulp`):
    * Result Wider Than Sum of Operands: If the declared result width of a multiply is greater than the sum of the bit widths of its operands, the multiplication can be performed at the sum-of-operand-widths width and the result then extended.
    * Operands Wider Than Result: If the operands are wider than the required result width, they can be narrowed (sliced) before the multiplication, reducing the complexity of the multiplier.
    * Extended Operands (Zero-extend/Sign-extend): When operands are the results of `zero_ext` or `sign_ext` operations, they can often be narrowed back to their original non-extended width, with the multiplication type adjusted (e.g., `umul` or `smul`) if necessary to maintain correctness.
    * Trailing Zeros: Multiplies where one or both operands have known trailing zeros can be optimized by performing the multiplication on the non-zero bits and concatenating the result with the appropriate number of trailing zeros.
    * Partial Product Optimization: For `umulp`/`smulp` (partial product multiplies), if the two partial products are immediately summed (e.g., `add(tuple_index(mulp_op, 0), tuple_index(mulp_op, 1))`), the overall operation can be narrowed and the final sum then extended.
- Narrowing `ArrayIndex` Operations:
    * Literal Index Values: If an index to an `array_index` is a known literal, its bit width can be narrowed to the minimum required to represent the actual array size, saving bits in the index computation.
    * Known Zero Index: If an index is provably zero, it's replaced with a literal zero of the appropriate width.
    * Known Out-of-Bounds Index: Similar to `BitSliceSimplificationPass`, if an index is provably out of bounds, it's clamped to the maximum valid index.
    * Convert `ArrayIndex` to `Select`: For `array_index` operations with a small, discrete set of possible index values (based on range analysis), the operation can be converted to a `select` chain that makes explicit checks for each possible index value. This can be beneficial for small arrays by removing complex indexing logic.
- Narrowing `Decode` Operations:
    * Leading Zeros in Index: If the index of a `decode` operation has leading zeros, the index can be sliced to remove them and the `decode` performed on the narrower index, with the result zero-extended if needed.
    * Zero Index: A `decode` with a provably zero index is replaced by a literal one, as `decode(0)` always results in a one-hot encoding with the least significant bit set.
The NarrowingPass is typically run repeatedly within an optimization
pipeline until a fixed point is reached. This iterative process is crucial
for achieving maximal bit width reduction across the entire IR, as narrowing
one node can often enable further narrowing of its consumers or producers.
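To make the `kTernary` analysis concrete, here is a minimal Python sketch (a hypothetical helper for illustration, not the XLS query engine) that computes, for each bit position, whether it is always 0, always 1, or unknown (`X`) across a set of possible values:

```python
def ternary_bits(values: set[int], n: int) -> str:
    """Return an MSB-first string over {'0', '1', 'X'} for an n-bit value:
    a bit is a known constant iff it is identical across all of `values`."""
    out = []
    for i in range(n - 1, -1, -1):  # walk bit positions MSB first
        bits = {(v >> i) & 1 for v in values}
        out.append("X" if len(bits) > 1 else str(bits.pop()))
    return "".join(out)

# 0b0100 and 0b0110 agree everywhere except bit 1.
assert ternary_bits({0b0100, 0b0110}, 4) == "01X0"
```

Known-zero leading bits in such a string are exactly what licenses the slicing transformations above: the narrowed operation only needs the `X` span and any known-one bits.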
next_value_opt - Next Value Optimization
Pass which tries to optimize next_value nodes within a Proc.
Note: What follows was generated by the Gemini LLM. Not human verified.
This pass performs several key optimizations to simplify the logic related to state updates and expose further opportunities for other passes.
Optimizations include:
- Removing Literal Predicates: If a `next_value` node has a predicate that evaluates to a constant `false`, the `next_value` operation is dead and thus removed. If the predicate is a constant `true`, the predicate is removed, making the `next_value` unconditional.

  Example (dead `next_value`):

  ```
  x: bits[32]
  next_value(x, value=my_value, predicate=literal(value=0, bits=1))
  ```

  This `next_value` would be entirely removed, as it will never update `x`.

- Splitting Select-based Values: When a `next_value` operation's `value` operand is a `select`, `priority_sel`, or `one_hot_sel` operation, this pass can split the single `next_value` into multiple `next_value` nodes. Each new `next_value` corresponds to one of the select's cases, and its predicate is the condition under which that specific case would be chosen. This can lead to more granular control over state updates and enable further optimizations on individual branches.

  Example (splitting a `select`):

  ```
  x: bits[2]
  select_val: bits[2] = select(x, cases=[literal(2), literal(1)], default=literal(3))
  next_value(x, value=select_val)
  ```

  This could be transformed into (simplified representation):

  ```
  next_value(x, value=literal(2), predicate=eq(x, literal(0)))
  next_value(x, value=literal(1), predicate=eq(x, literal(1)))
  next_value(x, value=literal(3), predicate=ugt(x, literal(1)))
  ```

  This splitting is controlled by the `max_split_depth` parameter to prevent excessive IR growth.

- Handling Non-Synthesizable State Elements: The pass identifies state elements that are marked as non-synthesizable and are performing an identity update. In such cases, it removes the read of the synthesizable version, allowing subsequent passes (like conditional specialization) to more effectively predicate the state read.
For best results, this pass should be run after any transformations that modernize old-style `next (...)` lines into explicit `next_value` nodes.
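The select-splitting transformation can be modeled behaviorally in Python (a sketch with invented names, not the pass itself): each case of the select becomes a predicated update, and applying the predicated updates reproduces the original select semantics.

```python
def select_update(state: int, cases: list[int], default: int) -> int:
    """Original form: next state is select(state, cases, default)."""
    return cases[state] if state < len(cases) else default

def split_updates(state: int, cases: list[int], default: int) -> int:
    """Split form: one predicated next_value per case, plus the default."""
    nexts = [(lambda s, i=i: s == i, v) for i, v in enumerate(cases)]
    nexts.append((lambda s, n=len(cases): s >= n, default))
    for predicate, value in nexts:
        if predicate(state):
            return value
    return state  # no predicate fired: state unchanged

# Matches the doc's example (cases=[2, 1], default=3) over a 2-bit state.
for s in range(4):
    assert split_updates(s, [2, 1], 3) == select_update(s, [2, 1], 3)
```

The point of the split in the real pass is that each predicated `next_value` can then be simplified independently (e.g., a case whose predicate is provably false is removed outright).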
non_synth_separation - Non-Synthesizable Separation
Separates out non-synthesizable nodes like assert/cover/trace from the main function into a cloned function. Every function effectively has two versions, one with synthesizable nodes and one without. The synthesizable version invokes the non-synthesizable version of its function. This ensures that non-synthesizable uses of values do not affect the optimization of the synthesizable parts of the function.
Note: What follows was generated by the Gemini LLM. Not human verified.
The NonSynthSeparationPass is a crucial optimization pass in XLS that
isolates non-synthesizable operations (such as assert, cover, and
trace) from the main, synthesizable logic of a function or proc. This
separation is essential because non-synthesizable operations can introduce
data dependencies that might prevent optimizations on the synthesizable parts
of the design. For instance, a value used in an assert could be seen as
"live" by the Dead Code Elimination (DCE) pass, even if that value is not
used in any synthesizable computation. This would prevent the logic
generating that value from being removed.
By separating these concerns, the pass allows the synthesizable parts of the design to be aggressively optimized without being constrained by the needs of verification and debugging constructs.
How it Works:
- Identification: The pass first traverses each function and proc in the package to determine whether it contains any non-synthesizable nodes (`assert`, `cover`, `trace`). If a function or proc is purely synthesizable, it is left untouched.
- Cloning: For each function or proc that contains non-synthesizable operations, a new, separate "non-synthesizable" version is created:
    * For Functions: The original function is cloned to create a new function (e.g., `f` is cloned to `non_synth_f`).
    * For Procs: The original proc is cloned, but as a `Function` rather than a `Proc`, because the non-synthesizable operations are stateless from a hardware perspective (they don't generate registers). Proc-specific nodes (like `StateRead`, `Next`, `Send`, `Receive`) are handled specially during this cloning:
        * `StateRead` operations become `Param`s in the new function.
        * `Next` operations are handled to update non-synthesizable state, but do not become part of the new function's logic.
        * `Send` and `Receive` operations are similarly abstracted away, with their data payloads passed as parameters to the new function.
- Separation:
    * Original (Synthesizable) Version: All non-synthesizable nodes are removed from the original function or proc. It is then modified to `invoke` the newly created non-synthesizable function, passing it all the necessary data (e.g., function parameters, proc state, received data) that the non-synthesizable operations depend on. This `invoke` itself is non-synthesizable but has a minimal impact on the synthesizable logic.
    * New (Non-Synthesizable) Version: The cloned function is stripped of all its synthesizable outputs (its return value is made a useless empty tuple). It retains only the logic necessary to support the non-synthesizable operations.
- `Gate` to `Select` Conversion: `Gate` operations, which are special power-optimization nodes, are typically not removed by DCE. To prevent their unnecessary duplication in the non-synthesizable function, they are converted to equivalent `select` operations, which can be more easily optimized away if they become redundant.
Benefits:
- Improved Optimization: Frees the synthesizable logic from dependencies related to verification and debugging, enabling more aggressive optimizations like DCE, constant folding, and strength reduction.
- Clear Separation of Concerns: Creates a clean architectural separation between synthesizable hardware logic and non-synthesizable verification/debugging logic, improving the clarity and maintainability of the IR.
- Correctness: Ensures that verification constructs like `assert`, `cover`, and `trace` are preserved and correctly reflect the behavior of the design, while not interfering with the synthesis of the core functional logic.
Example:
Consider a function that contains both a synthesizable computation and an `assert`:
// Original IR snippet
fn my_func(x: bits[32]) -> bits[32] {
tok: token = literal(value=token)
intermediate: bits[32] = add(x, literal(1))
is_zero: bits[1] = eq(intermediate, literal(0))
assert.1: token = assert(tok, is_zero, message="intermediate is zero!")
ret result: bits[32] = mul(intermediate, literal(2))
}
After NonSynthSeparationPass, the IR would be split into two functions:
// Synthesizable version (original function, modified)
fn my_func(x: bits[32]) -> bits[32] {
intermediate: bits[32] = add(x, literal(1))
is_zero: bits[1] = eq(intermediate, literal(0))
// Invoke the non-synthesizable part, passing necessary values
invoke_non_synth: () = invoke(is_zero, to_apply=non_synth_my_func)
ret result: bits[32] = mul(intermediate, literal(2))
}
// Non-synthesizable version (newly created function)
fn non_synth_my_func(is_zero: bits[1]) -> () {
tok: token = literal(value=token)
assert.1: token = assert(tok, is_zero, message="intermediate is zero!")
ret useless_return: () = tuple() // Return value is made useless
}
In this transformed IR, the main my_func is now free of any assert
operations, and optimizations like DCE can proceed on is_zero if result
does not depend on it. The non_synth_my_func encapsulates the verification
logic, ensuring it is preserved without hindering the optimization of the
synthesizable path.
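The gate-to-select conversion this pass performs rests on a simple equivalence: in XLS, `gate(cond, data)` yields `data` when the condition is set and zero otherwise, which is exactly a two-way `select` with a zero case. A minimal Python model of that equivalence (illustrative semantics only, not XLS APIs):

```python
def gate(cond: bool, data: int) -> int:
    """Model of an XLS `gate` op: passes data through when cond is set,
    otherwise produces zero (for power gating)."""
    return data if cond else 0

def select(selector: int, cases: list[int]) -> int:
    """Model of an XLS `select` op: picks cases[selector]."""
    return cases[selector]

# gate(cond, data) behaves like select(cond, cases=[0, data]), which is
# why the rewrite preserves semantics while being easier to optimize away.
for data in (0, 1, 0xABCD):
    for cond in (False, True):
        assert gate(cond, data) == select(int(cond), [0, data])
```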
one-leaf-inlining - leaf function inlining passes
Inlines one level of function calls.
Invoked Passes
post-inlining - Post-inlining passes
Passes performed after inlining
Invoked Passes
post-inlining-opt - post-inlining optimization passes
Passes performed after inlining
Invoked Passes
- fixedpoint_simp(2)
- cond_spec(noBdd)
- dce
- bdd_simp(2)
- dce
- basic_simp
- dce
- bdd_cse
- dce
- cond_spec(Bdd)
- dce
- fixedpoint_simp(2)
- narrow(OptionalContext)
- dce
- basic_simp
- dce
- arith_simp
- dce
- cse
- sparsify_select
- dce
- useless_assert_remove
- ram_rewrite
- useless_io_remove
- dce
- cond_spec(Bdd)
- channel_legalization
- token_dependency
- fixedpoint_simp(2)
- fixedpoint_proc_state_flattening
- proc_state_bits_shatter
- proc_state_tuple_flat
- ident_remove
- dataflow
- next_value_opt
- dce
- proc_state_narrow
- dce
- proc_state_opt
- dce
- proc_state_provenance_narrow
- dce
- proc_state_opt
- dce
- bdd_simp(3)
- dce
- basic_simp
- dce
- bdd_cse
- select_lifting
- dce
- lut_conversion
- dce
- cond_spec(Bdd)
- dce
- fixedpoint_simp(3)
- select_range_simp
- dce
- fixedpoint_simp(3)
- bdd_simp(3)
- dce
- basic_simp
- dce
- bdd_cse
- dce
- proc_state_bits_shatter
- proc_state_tuple_flat
- fixedpoint_simp(3)
- useless_assert_remove
- useless_io_remove
- next_value_opt
- proc_state_opt
- dce
- cond_spec(Bdd)
- dce
- select_merge
- dce
- fixedpoint_simp(3)
post-inlining-opt(>=1) - min-1 post-inlining optimization passes
Passes performed after inlining
Options Set
Min opt level: 1
Invoked Passes
pre-inlining - pre-inlining passes
Passes performed before each inlining.
Invoked Passes
proc_state_array_flat - Proc State Array Flattening
Pass which flattens array elements of the proc state into their constituent elements. Tuples are flattened in a different pass. Flattening improves optimizability because each state element can be considered and transformed in isolation. Flattening also gives the scheduler more flexibility; without flattening, each element in the aggregate must have the same lifetime.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ProcStateArrayFlatteningPass is an optimization pass designed to
enhance the optimizability and scheduling flexibility of Procs in XLS. Its
core function is to transform one-dimensional, array-typed state elements
within the proc state into a tuple of their individual constituent elements.
(Note: A distinct pass, ProcStateTupleFlatteningPass, is responsible for
handling the flattening of tuple-typed state elements.)
The rationale behind this flattening is multifaceted:
- Improved Optimizability: By breaking down arrays into their individual elements, subsequent optimization passes (such as narrowing, constant folding, or dead code elimination) can consider and transform each element in isolation. This often enables more aggressive and effective optimizations that might be inhibited when elements are tightly coupled within an array structure.
- Increased Scheduler Flexibility: When a state element is represented as an array, all elements within that array are typically treated as a single, atomic unit by the scheduler. This implies that they must all have the same lifetime and be updated or accessed simultaneously. By flattening the array into a tuple of individual elements, the scheduler gains greater flexibility. Each element can then potentially have an independent lifetime, enabling more efficient scheduling and resource allocation in the generated hardware.
The pass identifies array-typed state elements that are suitable for
flattening. Currently, the heuristic for flattening is to unconditionally
flatten "small" arrays, specifically those with a size of 2 or less. While a
kMaxArrayFlattenSize constant is defined, the current implementation
applies a stricter size limit.
The transformation process for an array-typed state element (e.g., `state: bits[8][2]`) involves:
- Determining Flattening: The `ShouldFlattenStateElement` function checks if a given state element is an array and meets the predefined size heuristic for flattening.
- Transforming the `StateElement` Type: The original array-typed `StateElement` is replaced by a new `StateElement` of a tuple type, where each element of the tuple corresponds to an element of the original array. For example, an `ArrayType` of `bits[8][2]` becomes a `TupleType` of `(bits[8], bits[8])`.
- Transforming the `initial_value`: The initial value of the `StateElement` is converted from an array literal to a tuple literal with the corresponding element values.
- Transforming `StateRead` Operations: Any `StateRead` operation that previously read the array-typed state element is modified. Instead of directly reading the array, it now reads the new tuple-typed state element. Its uses are then replaced by an `array` operation that reconstructs the original array from `tuple_index` operations on the new tuple-typed state read. This ensures that any downstream logic consuming the original array continues to see the expected array structure.
- Transforming `Next` Operations: Any `next_value` operation that computed the next state for the array-typed element is also modified. Its `value` operand (which would have been an array) is transformed into a tuple by creating `array_index` operations for each element of the original array and then wrapping these in a `tuple` operation.
Example:
Consider a Proc with a state element state that is a 2-element array of
bits[8]:
// Original IR snippet
package my_package
proc my_proc(state: bits[8][2], init={0, 1}) {
// ...
state_read.0: bits[8][2] = state_read(state)
idx0: bits[32] = literal(0)
elem0: bits[8] = array_index(state_read.0, indices=[idx0])
// ... further usage of elem0
// computes next state as an array
next_val: bits[8][2] = array(literal(1), literal(2))
next (state, next_val)
}
After ProcStateArrayFlatteningPass, the state element would be flattened
into a tuple (bits[8], bits[8]):
// Optimized IR snippet (simplified)
package my_package
// state is now a tuple
proc my_proc(state: (bits[8], bits[8]), init={(0, 1)}) {
// ...
state_read.0: (bits[8], bits[8]) = state_read(state) // Reads the tuple
elem0_from_tuple: bits[8] = tuple_index(state_read.0, index=0)
elem1_from_tuple: bits[8] = tuple_index(state_read.0, index=1)
// Any original uses of `state_read.0` would be replaced by an `array` of
// these tuple_indexed elements.
reconstructed_array: bits[8][2] = array(elem0_from_tuple, elem1_from_tuple)
// next_value now produces a tuple (assuming original `next_val` was an
// array of literals)
next_val_tuple: (bits[8], bits[8]) = tuple(literal(1), literal(2))
next (state, next_val_tuple)
}
This transformation enables more fine-grained control and optimization opportunities for individual elements of the original array, which is highly beneficial for hardware synthesis.
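The flattening steps above can be modeled in a few lines of Python (a sketch of the semantics, not the XLS implementation; the size-2 heuristic comes from the prose above):

```python
kFlattenSizeLimit = 2  # heuristic: unconditionally flatten only "small" arrays

def should_flatten(array_state: list) -> bool:
    """Mirror of the ShouldFlattenStateElement size heuristic."""
    return len(array_state) <= kFlattenSizeLimit

def flatten_state(array_state: list) -> tuple:
    """An array-typed state element becomes a tuple of its elements."""
    return tuple(array_state)

def reconstruct_array(tuple_state: tuple) -> list:
    """StateRead users see an `array` rebuilt from tuple_index ops."""
    return [tuple_state[i] for i in range(len(tuple_state))]

state = [0, 1]  # models `state: bits[8][2], init={0, 1}`
assert should_flatten(state)
flat = flatten_state(state)              # the new tuple-typed state
assert reconstruct_array(flat) == state  # downstream still sees the array
```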
proc_state_bits_shatter - Proc State Bits Shattering
Pass which transforms Bits-type elements of the proc state into tuples of components. Only flattens where it can show that doing so will enable dynamic state feedback opportunities later (assuming it's followed by a ProcStateTupleFlatteningPass).
Note: What follows was generated by the Gemini LLM. Not human verified.
The ProcStateBitsShatteringPass is an optimization pass that refactors
bits-typed elements within a Proc's state. It breaks down (or "shatters")
these wide bit vectors into a tuple of smaller bits components.
The primary motivation for this transformation is to expose opportunities for
subsequent passes (such as ProcStateTupleFlatteningPass) to enable more
granular and dynamic control over individual state bits, which can lead to
more efficient hardware implementations.
The pass intelligently identifies bits-typed state elements where splitting
would be beneficial. It only performs the transformation if it can
demonstrate that doing so will likely enable dynamic state feedback. This
often involves detecting specific patterns within the next_value
expressions that define the next state, particularly when those next_value
expressions involve concat operations whose operands are select or
priority_select operations.
Here's a breakdown of its operation:
- Identification of Splitting Candidates: The pass iterates through all bits-typed state elements within a Proc.
- Analysis of `next_value` Expressions: For each state element, it examines its associated `next_value` operations:
  * Simple pass-through `next_value`s (where the next value is just the current state read) do not offer new splitting opportunities.
  * If a `next_value` is defined by a `concat` operation, the pass analyzes the bit widths of the `concat`'s operands to identify potential "split points" within the overall bit vector. For example, for `concat(hi, mid, lo)`, split points would be after `lo` and after `mid`.
  * A key heuristic for determining if splitting is beneficial is when one or more of these `concat` operands are themselves `select` or `priority_select` operations (or small `select` operations, if enabled by `options.split_next_value_selects`). This pattern suggests that different parts of the state element are updated based on different conditions, and splitting them allows for more fine-grained conditional updates to individual components.
- Intersection of Split Points: If multiple `next_value` operations update the same state element, the pass calculates the intersection of their suggested split points. This ensures that any performed splitting is mutually beneficial for all updates to that state element.
- Transformation: If a beneficial set of split points (resulting in more than one component) is identified, the bits-typed state element is transformed into a `tuple` of smaller `bits` types:
  * The `StateRead` operation for the original state element is replaced with a `concat` of `tuple_index` operations, effectively reconstructing the original bit vector from the new tuple components when the full value is needed.
  * Each `next_value` operation for the original state element is replaced with a `tuple` of `bit_slice` operations, extracting the appropriate bit ranges for each component of the new tuple state.
  * The `initial_value` of the state element is also converted into a tuple literal matching the new structure.
Example:
Consider a 16-bit state element x where its next_value is a concat of
three smaller bit groups (x_hi, new_x_mid, x_lo), and new_x_mid is
derived from a select operation.
// Original state and next value
x: bits[16] = state_element(init=0)
// ...
x_lo: bits[6] = bit_slice(x, start=0, width=6)
x_mid: bits[1] = bit_slice(x, start=6, width=1)
x_hi: bits[9] = bit_slice(x, start=7, width=9)
// Conditional update
new_x_mid: bits[1] = select(or_reduce(x_lo), cases={x_mid, not(x_mid)})
next_value(x, value=concat(x_hi, new_x_mid, x_lo))
The ProcStateBitsShatteringPass would identify that x can be beneficially
split into three components (e.g., 6 bits, 1 bit, 9 bits) because
new_x_mid is a select operation, indicating a conditional update to a
sub-part of x. It would transform x into a tuple state element:
// Transformed state element and its updates (simplified)
x: (bits[6], bits[1], bits[9]) = state_element(init=(0,0,0))
// ...
x_read_tuple: (bits[6], bits[1], bits[9]) = state_read(x)
x_new_lo: bits[6] = tuple_index(x_read_tuple, 0)
x_new_mid: bits[1] = tuple_index(x_read_tuple, 1)
x_new_hi: bits[9] = tuple_index(x_read_tuple, 2)
// Original `StateRead` users now reconstruct the full bit vector
original_x_usage: bits[16] = concat(x_new_hi, x_new_mid, x_new_lo)
// `next_value` now operates on the tuple components; `concat_value` here
// stands for the original concat(x_hi, new_x_mid, x_lo)
next_value(x, value=tuple(bit_slice(concat_value, 0, 6),
                          bit_slice(concat_value, 6, 1),
                          bit_slice(concat_value, 7, 9)))
This transformation enables subsequent passes to optimize the updates to
x_lo, x_mid, and x_hi independently, particularly when conditional
logic can be localized to specific components, ultimately leading to more
efficient hardware.
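The split-point bookkeeping described above can be sketched in Python (a hypothetical model of the heuristic; operand widths are given LSB-first, whereas the real pass works on IR nodes):

```python
def split_points(operand_widths_lsb_to_msb: list[int]) -> set[int]:
    """Split points implied by one concat: the cumulative widths of its
    operands, excluding 0 and the total width."""
    points, total = set(), 0
    for w in operand_widths_lsb_to_msb[:-1]:
        total += w
        points.add(total)
    return points

def common_split_points(all_next_value_widths: list[list[int]]) -> set[int]:
    """Intersect split points across every next_value of the element, so
    splitting is mutually beneficial for all updates."""
    sets = [split_points(ws) for ws in all_next_value_widths]
    result = sets[0]
    for s in sets[1:]:
        result &= s
    return result

# For next_value = concat(x_hi, new_x_mid, x_lo) with widths 9, 1, 6
# (LSB-first: [6, 1, 9]), the split points fall after bit 6 and bit 7.
assert split_points([6, 1, 9]) == {6, 7}
# A second update that only splits at bit 6 leaves {6} as the common point.
assert common_split_points([[6, 1, 9], [6, 10]]) == {6}
```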
proc_state_narrow - Proc State Narrowing
Pass which tries to minimize the size and total number of elements of the proc state. The optimizations include removal of dead state elements and zero-width elements.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ProcStateNarrowingPass is an optimization pass designed to reduce the
bit width of bits-typed state elements within a Proc. This is a critical
optimization for hardware generation, as a smaller bit width for state
elements directly translates to smaller and more efficient registers in the
synthesized hardware. The pass achieves this by employing a
ProcStateRangeQueryEngine to analyze the possible range of values a state
element can take over multiple cycles, and then safely narrows its bit width
if a smaller range is identified.
The pass focuses on two primary types of bit width reduction:
- Removal of Known Leading Bits (Zero or One Extension): If range analysis can prove that the most significant bits (MSBs) of a state element are always constant (e.g., always zero or always one), these leading redundant bits can be "removed." The state element's type is then truncated to exclude these bits. When the state is read, these bits are re-concatenated as literals to reconstruct the full width for any operations that depend on the original wider value. When a `next_value` is computed, only the non-constant lower bits are extracted.
  Example: If a 32-bit state element `foo` is always updated such that its 29 MSBs are consistently zero, the pass can narrow its type to `bits[3]`.
    // Original state element
    state_element(foo, bits[32], init=0)
    // Example next_value (simplified, assuming MSBs are always 0)
    next_value(foo, zero_extend(add(literal(1, bits=3),
                                    bit_slice(state_read(foo), 0, 3)), 32))
  After narrowing, `foo` might become `bits[3]`:
    // Narrowed state element
    state_element(foo, bits[3], init=0)
    // Read of foo now re-extends it for compatibility
    extended_foo: bits[32] = concat(literal(0, bits=29), state_read(foo))
    // Next value is computed on the narrowed bits
    next_value(foo, bit_slice(new_value_for_foo, 0, 3))
- Signed Bit Width Narrowing: For state elements representing signed values, the pass utilizes interval analysis (`interval_ops::MinimumSignedBitCount`) to determine the smallest number of bits strictly necessary to represent the full range of signed values a state element can take. If the actual bit width is larger than this minimum, it is narrowed. To preserve signed semantics, the pass uses `sign_ext` to reconstruct the original width when the state element is read.
  Example: If a 32-bit state element `bar` is known to consistently hold signed values between -8 and 7, it can be narrowed to `bits[4]` (as 4 bits can represent signed values from -8 to 7).
    // Original state element
    state_element(bar, bits[32], init=0)
    // Example next_value (simplified, range stays within [-8, 7])
    next_value(bar, add(state_read(bar), literal(1)))
  After narrowing:
    // Narrowed state element
    state_element(bar, bits[4], init=0)
    // Read of bar now sign-extends it for compatibility
    extended_bar: bits[32] = sign_ext(state_read(bar), 32)
    // Next value is computed on the narrowed bits
    next_value(bar, bit_slice(new_value_for_bar, 0, 4))
This pass is highly integrated with ProcStateRangeQueryEngine, which
performs an inter-cycle analysis to precisely track the evolution of state
element values, enabling robust and correct narrowing decisions. The
transformations involve updating the StateElement definition, and modifying
StateRead and Next operations to slice/extend or concatenate as
appropriate, ensuring functional equivalence with the original wider
representation.
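The signed case can be illustrated with a small helper that computes the minimum two's-complement width for a value range (a Python sketch of what `interval_ops::MinimumSignedBitCount` provides; not the actual XLS code):

```python
def minimum_signed_bit_count(lo: int, hi: int) -> int:
    """Smallest two's-complement width that can hold every value in
    [lo, hi]. Width w covers the range [-2**(w-1), 2**(w-1) - 1]."""
    w = 1
    while not (-(1 << (w - 1)) <= lo and hi <= (1 << (w - 1)) - 1):
        w += 1
    return w

# The `bar` example above: values in [-8, 7] fit in 4 bits.
assert minimum_signed_bit_count(-8, 7) == 4
# One value outside that range forces a 5-bit representation.
assert minimum_signed_bit_count(-9, 7) == 5
```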
proc_state_opt - Proc State Optimization
Pass which tries to minimize the size and total number of elements of the proc state. The optimizations include removal of dead state elements and zero-width elements.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ProcStateOptimizationPass is an optimization pass in XLS that focuses
on minimizing the size and total number of state elements within a Proc.
This is a critical optimization for hardware generation, as fewer and smaller
state elements directly translate to reduced register count and memory usage,
leading to more compact, faster, and more efficient hardware designs.
The pass achieves its goals through several key transformations, often operating in an iterative fashion until no further improvements are possible:
- Removal of Zero-Width State Elements: State elements that have a bit width of zero (and are not token-typed) are inherently useless as they cannot store any information. This pass identifies and removes such elements. Their corresponding `StateRead` operations are replaced with their `initial_value` (which will be a zero-width literal tuple), and `Next` operations are replaced with empty tuples.
- Removal of Constant State Elements: If a state element's value is provably constant throughout the entire execution of the Proc (i.e., its `initial_value` is a constant, and all its `next_value` operations consistently assign that same constant value or are no-ops), then the state element is redundant. The pass replaces all reads of such a state element with its constant value and removes the element and its associated `Next` operations. This process is run to a fixed point, meaning it will iterate until no more constant state elements can be found.
- Removal of Unobservable State Elements: This is a more advanced optimization that identifies state elements whose values do not affect any observable output of the Proc (e.g., outputs to channels, assertions, traces). A state element `X` is considered observable if:
  * A side-effecting operation (e.g., `send`, `assert`, `cover`) directly or indirectly depends on `X`.
  * The next-state value of another observable state element depends on `X`.

  To achieve this, the pass performs:
  * State Dependency Analysis: It computes a dependency map showing which state elements each node in the Proc depends upon. This is a forward dataflow analysis.
  * Transitive Closure: It constructs an adjacency matrix for state element dependencies and then computes its transitive closure. This determines which state elements ultimately influence which others, and importantly, which ones influence observable outputs.
  * Removal: Any state element that does not transitively affect an observable output is deemed unobservable and removed. Reads of such elements are replaced with a zero-valued literal, and their `Next` operations are removed.
- Conversion of Constant Chains to State Machines: This optimization identifies sequences of state elements that effectively form a counter or a simple state machine driven by a constant input. For example, if `next[C[i+1]]` is semantically equivalent to `state_read[C[i]]` and `next[C[0]]` is a constant, this sequence can be converted into a single, compact state machine. Instead of using a one-hot encoding (which `ProcStateArrayFlatteningPass` might produce), this pass converts it into a binary-encoded state machine using `log2(chain_length)` bits, which is more area-efficient. The reads of the original state elements are then replaced by a `select` operation that decodes the state machine's value to produce the corresponding original state element value.
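The constant-state check described above reduces to: the initial value is a constant, and every `next_value` either re-assigns that same constant or passes the state through unchanged. A Python sketch of the decision (hypothetical representation; the real pass inspects IR nodes, and the `"passthrough"` sentinel here models `next(state, state_read)`):

```python
def is_constant_state(init_value, next_values: list) -> bool:
    """The element is constant iff every update keeps the init value:
    either a pass-through no-op or a re-assignment of the same constant."""
    return all(v == "passthrough" or v == init_value for v in next_values)

# `constant_state` from the example: init=1, always updated to 1.
assert is_constant_state(1, [1])
# `not_constant_state`: init=0, but its updates write a computed value.
assert not is_constant_state(0, ["computed"])
# Pass-through-only updates also leave the element constant.
assert is_constant_state(5, ["passthrough"])
```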
Benefits:
- Area Reduction: Directly reduces the number of registers and memory required for state, leading to smaller hardware footprints.
- Power Reduction: Fewer registers and less active logic contribute to lower power consumption in the synthesized design.
- IR Simplification: Removes redundant state and associated logic, making the IR cleaner and easier for subsequent compilation stages to process.
Example (Removing a constant state element):
// Original IR snippet
package my_package
chan out(bits[32], kind=streaming, ops=send_only, flow_control=ready_valid)
top proc my_proc() {
tkn: token = literal(value=token)
zero_val: bits[32] = literal(value=0)
one_val: bits[32] = literal(value=1)
// This state element `constant_state` is always `1`
constant_state: bits[32] = state_element(init=1)
next (constant_state, one_val) // Always updates to 1
// `not_constant_state` changes based on its LSB
not_constant_state: bits[32] = state_element(init=0)
not_constant_state_read: bits[32] = state_read(not_constant_state)
not_constant_lsb: bits[1] =
bit_slice(not_constant_state_read, start=0, width=1)
not_constant_next_lsb: bits[1] = not(not_constant_lsb)
next_not_constant_state: bits[32] =
concat(bit_slice(not_constant_state_read, start=1, width=31),
not_constant_next_lsb)
next (not_constant_state, next_not_constant_state)
// Usage: sum of constant_state and the LSB of not_constant_state
state_usage: bits[32] = add(state_read(constant_state),
bit_slice(state_read(not_constant_state),
start=0, width=1))
send_tok: token = send(tkn, state_usage, channel=out)
next (send_tok)
}
The ProcStateOptimizationPass would identify constant_state as provably
constant (always 1). It would then replace all uses of
state_read(constant_state) with literal(1) and remove the
constant_state element and its next_value:
// Optimized IR (simplified after ProcStateOptimizationPass and a subsequent
// DCE pass)
package my_package
chan out(bits[32], id=0, kind=streaming, ops=send_only,
flow_control=ready_valid)
top proc my_proc() {
tkn: token = literal(value=token)
zero_val: bits[32] = literal(value=0)
one_val: bits[32] = literal(value=1)
// `constant_state` has been removed by the pass
not_constant_state: bits[32] = state_element(init=0)
not_constant_state_read: bits[32] = state_read(not_constant_state)
not_constant_lsb: bits[1] = bit_slice(not_constant_state_read, start=0,
width=1)
not_constant_next_lsb: bits[1] = not(not_constant_lsb)
next_not_constant_state: bits[32] =
concat(bit_slice(not_constant_state_read, start=1, width=31),
not_constant_next_lsb)
next (not_constant_state, next_not_constant_state)
// `state_read(constant_state)` is replaced by `literal(1)`
state_usage: bits[32] = add(literal(1, bits=32),
bit_slice(state_read(not_constant_state),
start=0, width=1))
send_tok: token = send(tkn, state_usage, channel=out)
next (send_tok)
}
This effectively reduces the number of state elements and simplifies the logic by replacing constant state accesses with their known values.
proc_state_provenance_narrow - Proc State Provenance Narrowing
Pass which tries to minimize the size and total number of elements of the proc state. This pass works by examining the provenance of the bits making up the next value to determine which (if any) bits are never actually modified.
NB This is a separate pass from ProcStateNarrowing for simplicity of implementation. That pass mostly assumes we'll have a range-analysis which this does not need.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ProcStateProvenanceNarrowingPass is an optimization pass designed to
reduce the bit-width of state elements within a Proc. It achieves this by
performing a detailed "provenance" analysis on the bits that constitute the
next_value of each state element. The goal is to identify which specific
bits of a state element are never actually modified from their initial
value, regardless of the control flow or data inputs to the Proc. These
unchanging bits can then be removed from the state element, leading to a
narrower, more efficient register in hardware.
This pass is distinct from ProcStateNarrowingPass (which primarily relies
on range analysis) because ProcStateProvenanceNarrowingPass does not
require complex range information. Instead, it directly inspects the
bit-level origins and updates.
How it Works:
- Bit Provenance Analysis (`BitProvenanceAnalysis`): The pass utilizes `BitProvenanceAnalysis` to determine, for every bit of every node, its origin. This analysis can trace whether a bit is derived from a specific input bit, a constant, or a combination of sources. It provides a precise understanding of how each bit of a `next_value` is influenced by the current state and inputs.
- Ternary Query Engine (`LazyTernaryQueryEngine`): Alongside provenance, a `LazyTernaryQueryEngine` is used to provide ternary information about bits (known zero, known one, or unknown). This helps in identifying bits that are effectively constant or always take a certain value.
- `UnchangedBitsFunction`: This is the core logic for identifying bits that can be narrowed. For each state element, it examines all its `Next` operations. For a given `Next(state_read, next_value, predicate)`:
  * If `next_value` is the same as `state_read`, those bits are trivially unchanged.
  * Otherwise, it compares the bits of `next_value` with the `initial_bits` of the state element. Using both ternary and provenance analysis, it identifies bits in `next_value` that are provably identical to their corresponding `initial_bits` or are known to be constants that match the `initial_bits`.
  * The `unchanged_bits` mask is iteratively refined (using bitwise AND) across all `Next` operations for a given state element. A bit is deemed "unchanged" only if all `Next` operations either preserve its initial value or are predicated off (i.e., don't update it under conditions where it could change).
- Bit Segment Extraction (`ExtractBitSegments`): Once the final `unchanged_bits` mask is determined, this function breaks down the state element's bit range into contiguous `BitSegment`s. Each segment is marked as either a constant segment (if it corresponds to bits that are always unchanged and match the initial value) or a variable segment (for bits that can change).
- `NarrowTransform` (State Element Transformation): This custom `Proc::StateElementTransformer` is used to apply the narrowing transformation:
  * New State Element: The original state element is replaced by a new `StateElement` with a narrower bit-width, comprising only the variable segments identified in the previous step.
  * `TransformStateRead`: When the narrowed `StateElement` is read by a `StateRead` operation, it is reconstructed into its original, wider form. This is done by concatenating the new, narrower `StateRead` output with `Literal` nodes for all the constant segments. This ensures that downstream logic continues to receive the expected full-width value.
  * `TransformNextValue`: The `next_value` computation is also narrowed. Only the variable segments from the original `next_value` are extracted (using `bit_slice` operations) and concatenated to form the new, narrower `next_value` for the transformed `StateElement`.
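The mask refinement and segment extraction steps can be sketched in Python, assuming fully-known next values for simplicity (the real pass uses ternary and provenance analysis to reason about partially-known bits):

```python
def unchanged_mask(init_bits: int, next_values: list, width: int) -> int:
    """AND together per-update 'unchanged' masks. None models a
    pass-through next_value (all bits trivially unchanged); an int models
    a fully-known next value, whose unchanged bits are those equal to the
    corresponding init bits."""
    mask = (1 << width) - 1
    for nv in next_values:
        if nv is None:
            continue  # pass-through: changes nothing
        mask &= ~(nv ^ init_bits) & ((1 << width) - 1)
    return mask

def extract_segments(mask: int, width: int) -> list:
    """Break [0, width) into contiguous (start, length, is_constant)
    segments according to the unchanged-bits mask."""
    segments, start = [], 0
    for i in range(1, width + 1):
        if i == width or bool(mask >> i & 1) != bool(mask >> start & 1):
            segments.append((start, i - start, bool(mask >> start & 1)))
            start = i
    return segments

# 8-bit element with init=0b1111_0000; one update rewrites exactly the
# init value, so every bit is unchanged.
assert unchanged_mask(0b11110000, [0b11110000], 8) == 0b11111111
# An update flipping bit 0 leaves only bits 1..7 unchanged:
m = unchanged_mask(0b11110000, [0b11110001], 8)
assert m == 0b11111110
assert extract_segments(m, 8) == [(0, 1, False), (1, 7, True)]
```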
Benefits:
- Significant Area Reduction: Directly reduces the number of flip-flops (registers) required for state elements, leading to smaller hardware area.
- Power Efficiency: Fewer registers typically consume less dynamic and static power.
- Improved Optimizability: A narrower state element simplifies its uses and can expose further optimization opportunities for operations that consume it.
Example:
Consider a 128-bit state element foo in a Proc, where a portion of its bits
(e.g., bits 32-63) are known to always hold a specific constant value (e.g.,
0xFFFF_0000) and never change.
// Original IR snippet
package my_package
proc my_proc(foo: bits[128],
             init=0x0000_0000_0000_0000_FFFF_0000_0000_0000) {
// ...
// A next_value operation where bits 32-63 (0xFFFF_0000) are always
// preserved, but other bits might change based on inputs.
next_val_update: bits[128] = some_complex_logic(state_read(foo), input_val)
next (foo, next_val_update)
}
ProcStateProvenanceNarrowingPass would identify that bits 32-63 of foo
are unchanging. It would then transform foo into a narrower state element,
effectively removing those constant bits from its storage:
// Optimized IR snippet (simplified, after narrowing and cleanup)
package my_package
proc my_proc(foo: bits[96],
init=0x0000_0000_0000_0000_0000_0000) { // Foo is now 96 bits
// ...
foo_read_narrow: bits[96] = state_read(foo)
// Reconstruct original 128-bit value by concatenating narrowed state with
// constant bits
foo_reconstructed: bits[128] = concat(
    bit_slice(foo_read_narrow, 32, 64),  // High variable bits (orig. bits 64-127)
    literal(0xFFFF_0000, bits=32),       // Constant bits that were removed (32-63)
    bit_slice(foo_read_narrow, 0, 32)    // Low variable bits (orig. bits 0-31)
)
// ... downstream logic now uses `foo_reconstructed`
// Next value also narrowed
next_val_update_narrowed: bits[96] = some_complex_logic_narrowed(...)
next (foo, next_val_update_narrowed)
}
This reduces the physical register size for foo by 32 bits, leading to a
more compact and efficient hardware implementation while maintaining
functional equivalence.
proc_state_tuple_flat - Proc State Tuple Flattening
Pass which flattens tuple elements of the proc state into their constituent components. Array elements are flattened in a different pass. Flattening improves optimizability because each state element can be considered and transformed in isolation. Flattening also gives the scheduler more flexibility; without flattening, each element in the aggregate must have the same lifetime.
ram_rewrite - RAM Rewrite
Pass that rewrites RAMs of one type to a new type. Generally this will be some kind of lowering from more abstract to concrete RAMs.
Note: What follows was generated by the Gemini LLM. Not human verified.
The RamRewritePass facilitates the transformation of RAM (Random Access
Memory) interfaces within an XLS package. Its primary role is to lower more
abstract RAM models to concrete RAM types that are directly implementable by
target hardware. This is a crucial step in the hardware generation flow.
The pass is driven by RamRewrite configurations, which specify the original
RAM type (from_config), the target RAM type (to_config), and the mapping
between logical channel names (e.g., read_req) and their corresponding
physical channel names in the IR.
Key aspects of the RamRewritePass include:
- RAM Abstraction Levels:
  * `kAbstract`: A high-level, generic RAM model that provides a simplified interface.
  * `k1RW` (One Read/Write Port): A concrete RAM with a single port that can perform either a read or a write operation per cycle. Its request channel typically includes address, data, write enable, and read enable signals bundled together.
  * `k1R1W` (One Read, One Write Port): A concrete RAM with separate, independent read and write ports, enabling simultaneous read and write operations.
- Channel Rewiring and Repacking: The core function of the pass is to replace existing `send` and `receive` operations associated with the `from_config` RAM channels with new `send` and `receive` operations connected to the `to_config` RAM channels. This often necessitates "repacking" the data payloads to match the new channel types.
  * Repacking for `send` operations: The `RepackPayload` function transforms the structure of an old `send` operation's operand to match the expected input format of the new RAM channel. This can involve adding literal values (e.g., `write_enable` or `read_enable` bits for `k1RW` RAMs), dropping unused fields, or simply passing through data if the formats are compatible.

    Example (Abstract Read Request to 1RW Request): An abstract read request might be a tuple `(address, mask)`. When rewritten to a `k1RW` RAM, its request channel expects `(address, data, write_mask, read_mask, write_enable, read_enable)`. The pass will construct a new tuple, filling `data` and `write_mask` with zero literals, setting `write_enable` to `0b0` and `read_enable` to `0b1`.

      // Original abstract read request send
      send_abstract_read_req: token = send(tok, (addr, mask), channel=abstract_read_req)
      // Transformed 1RW request send (simplified)
      send_1rw_req: token = send(tok, (addr, data, wmask, mask, we, re), channel=1rw_req)

  * Repacking for `receive` operations: Similarly, the output of the new `receive` operation (which adheres to the `to_config` channel type) is repacked to conform to the expected output format of the original `from_config` channel. This ensures that downstream logic that consumes the received data continues to function correctly.
- Channel Creation and Removal: The pass dynamically creates new channels that correspond to the `to_config` RAM type and subsequently removes the original channels associated with the `from_config` RAM type. This ensures a clean transition to the new RAM interface.
- Proc-Scoped Channels: The pass properly handles both global and proc-scoped channels. If a `proc_name` is specified in the `RamRewrite` configuration, the channels are treated as local to that particular proc.
- Error Handling: The pass includes robust checks for potential issues such as type mismatches (e.g., inconsistencies in address width between `from_config` and `to_config`) and unsupported transformation scenarios, returning informative errors where applicable.
This pass is a critical element in the XLS compilation flow, enabling the compiler to generate optimized hardware for diverse RAM architectures while preserving the functional correctness of the original design.
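The repacking step above can be sketched in Python. The payload shapes and the helper name below (`AbstractReadReq`, `Ram1RWReq`, `repack_abstract_read_to_1rw`) are hypothetical illustrations of the tuple layouts described in the example, not the actual XLS `RepackPayload` API:

```python
from collections import namedtuple

# Hypothetical payload shapes; the field names follow the description
# above, not the actual XLS RamRewrite implementation.
AbstractReadReq = namedtuple("AbstractReadReq", ["addr", "mask"])
Ram1RWReq = namedtuple("Ram1RWReq", [
    "addr", "data", "write_mask", "read_mask", "write_enable", "read_enable"])

def repack_abstract_read_to_1rw(req):
    """Repack an abstract read request into a 1RW request tuple.

    Unused write-side fields are filled with zero literals; the enable
    bits mark the transaction as read-only.
    """
    return Ram1RWReq(
        addr=req.addr,
        data=0,              # zero literal: nothing is written
        write_mask=0,        # zero literal: no write lanes enabled
        read_mask=req.mask,  # the original read mask passes through
        write_enable=0,      # 0b0: not a write
        read_enable=1,       # 0b1: a read
    )

req = repack_abstract_read_to_1rw(AbstractReadReq(addr=5, mask=0xF))
print(req)
```

The symmetric direction (repacking the `receive` response) drops or zero-fills fields in the same way.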
reassociation - Reassociation
Reassociates associative operations to reduce delay by transforming chains of operations to a balanced tree of operations, and gathering together constants in the expression for folding.
Note: What follows was generated by the Gemini LLM. Not human verified.
The ReassociationPass is an optimization pass in XLS that reorders
associative operations such as addition (add), subtraction (sub), and
signed/unsigned multiplication (smul, umul). Its primary objectives are
to reduce the critical path delay of the synthesized hardware and to
facilitate further constant folding by grouping literal operands.
This pass is crucial for transforming deep, unbalanced chains of operations
(e.g., (((A + B) + C) + D)) into more balanced tree structures
(e.g., ((A + B) + (C + D))). It also effectively gathers together constant
operands within an associative expression, making them amenable to subsequent
constant folding optimizations.
Key Concepts:
- Associativity: The pass exploits the associative property of operations like addition and multiplication. This property states that for an associative operation `op`, `(A op B) op C` is mathematically equivalent to `A op (B op C)`. Reassociation leverages this to restructure the computation graph.
- Critical Path Delay Reduction: A deep, unbalanced chain of operations in the IR directly translates to a long critical path in the hardware, which limits the maximum clock frequency. By rebalancing these chains into a shallower tree, the pass effectively reduces the overall critical path delay.
- Constant Folding Facilitation: Grouping constant operands together within an associative expression allows the `ConstantFoldingPass` to combine them into a single literal. This simplifies the expression and reduces dynamic computation.
- Overflow Handling: The pass is designed to carefully consider potential overflow conditions, particularly for signed arithmetic and operations involving different bit widths. This ensures that the semantic correctness of the original computation is rigorously preserved during reassociation.
How it Works:
- `AssociativeElements` Class: This internal helper class is central to the pass's operation. For each node identified as part of an associative chain, it recursively collects and models:
  * `variables_`: Non-constant leaf nodes (e.g., parameters or results of other complex expressions).
  * `constants_`: Constant leaf nodes (i.e., `literal` operations).
  * `op_`: The primary associative operation (`Op::kAdd`, `Op::kSMul`, etc.) governing the combination of these elements.
  * `needs_negate`: A boolean flag indicating if a variable needs to be logically negated, typically arising from subtraction operations (e.g., `A - B` is conceptually treated as `A + (-B)`).
  * `overflows_`: A flag indicating whether the sub-expression represented by this `AssociativeElements` object can potentially overflow at its current bit-width. This information is crucial for safe combination.
  * `depth_`: Tracks the depth of the longest chain of associative operations leading to this particular node.
- `OneShotReassociationVisitor`: This visitor traverses the IR in a single pass to compute and collect the `AssociativeElements` for each relevant node. It includes logic to handle various operations:
  * Associative Operations (`Add`, `Sub`, `SMul`, `UMul`): For these operations, it recursively combines the `AssociativeElements` of its operands, managing negation for subtraction and tracking potential overflow conditions.
  * `Neg`: Handles negation by logically flipping the `needs_negate` flag of its operand's `AssociativeElements` representation.
  * Extensions (`ZeroExtend`, `SignExtend`) and `Concat` operations that effectively act as `ZeroExtend`: These operations can widen a value without fundamentally changing its underlying numerical value. The pass attempts to "see through" these by propagating the `AssociativeElements` of the inner operand if no overflow occurs and signedness compatibility is maintained.
- `ReassociationCache`: This caching mechanism stores the computed `AssociativeElements` for each node. This avoids redundant computations during iterative passes and helps ensure that analyses are kept up-to-date even after transformations have altered the graph structure.
- `ReassociateFunction` (Main Loop):
  * Fixed-Point Iteration: The pass operates in a fixed-point manner. It identifies nodes that are prime candidates for reassociation and repeatedly applies the transformation until no further beneficial changes can be made across the entire function.
  * Candidate Selection: Nodes are selected for reassociation if they meet specific criteria, including:
    * Not already a leaf node (i.e., they are part of a larger associative chain).
    * Not composed entirely of constants (these are handled by the separate `ConstantFoldingPass`).
    * Having users that require materialization (i.e., the reassociation is not trivially hidden by other optimizations).
    * Their current depth in the IR graph is greater than the minimum possible depth for a balanced tree of its elements, or they contain multiple constants that can be meaningfully grouped.
  * `ReassociateOneOperation`: This function is responsible for actually reconstructing the selected associative operation into a balanced tree:
    * Grouping Constants: It first groups all constant operands together to be combined by `ConstantFoldingPass`.
    * Balancing Tree: It then constructs a balanced binary tree using the variable elements, minimizing the depth of the overall computation. The tree is typically biased to the left to ensure a consistent output structure, which can further aid subsequent Common Subexpression Elimination (CSE).
    * Negation and Extension Handling: It correctly applies necessary negations and extensions (sign-extend/zero-extend) during the reconstruction process to strictly preserve semantic equivalence.
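The balanced-tree construction at the heart of the rebalancing step can be sketched in Python. This is a simplified illustration of the idea (combine operands pairwise with a left-biased split), not the XLS implementation:

```python
def balanced_tree(operands, op):
    """Combine a list of operands into a balanced binary tree.

    `op` is a two-argument associative combiner. The split is biased to
    the left, mirroring the consistent output structure described above.
    """
    assert operands
    if len(operands) == 1:
        return operands[0]
    mid = (len(operands) + 1) // 2  # left-biased split point
    return op(balanced_tree(operands[:mid], op),
              balanced_tree(operands[mid:], op))

def depth(expr):
    """Depth of a nested ('add', lhs, rhs) expression tree; leaves are 0."""
    if not isinstance(expr, tuple):
        return 0
    return 1 + max(depth(expr[1]), depth(expr[2]))

add = lambda l, r: ("add", l, r)

# A left-leaning chain ((a + b) + c) + d has depth 3 ...
chain = add(add(add("a", "b"), "c"), "d")
# ... while the rebalanced tree (a + b) + (c + d) has depth 2.
tree = balanced_tree(["a", "b", "c", "d"], add)
print(depth(chain), depth(tree))  # 3 2
```

For `n` operands, the chain has depth `n - 1` while the balanced tree has depth `ceil(log2(n))`, which is where the critical-path reduction comes from.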
Benefits:
- Reduced Critical Path: By transforming deep chains of operations into balanced trees, the pass significantly reduces the critical path delay in the generated hardware, allowing for higher clock frequencies.
- Improved Constant Folding: Grouping constant operands together within associative expressions enables `ConstantFoldingPass` to reduce them to single literals, thereby simplifying the IR and reducing dynamic computation.
- Enhanced IR Structure: Creates a more canonical and optimized structure for associative expressions, which can benefit subsequent optimization passes by providing a clearer and more efficient representation of the computation.
Example (Rebalancing additions): Consider a deep, left-leaning chain of addition operations:
// Original IR snippet (deep left-leaning chain)
fn deep_add(a: bits[32], b: bits[32], c: bits[32], d: bits[32]) -> bits[32] {
add.1: bits[32] = add(a, b)
add.2: bits[32] = add(add.1, c)
ret add.3: bits[32] = add(add.2, d)
}
The ReassociationPass would transform this into a balanced tree structure:
// Optimized IR (after ReassociationPass and a subsequent DCE pass)
fn deep_add(a: bits[32], b: bits[32], c: bits[32], d: bits[32]) -> bits[32] {
add.left: bits[32] = add(a, b)
add.right: bits[32] = add(c, d)
ret add.final: bits[32] = add(add.left, add.right)
}
This transformation reduces the maximum depth of the additions from 3 to 2, directly improving potential timing performance.
recv_default - Receive default value simplification
Optimization which removes useless selects between the data value of a conditional or non-blocking receive and the default value of the receive (all zeros).
Note: What follows was generated by the Gemini LLM. Not human verified.
The ReceiveDefaultValueSimplificationPass is an optimization pass tailored
for Procs in XLS. Its purpose is to identify and eliminate redundant select
or priority_select operations that are used to explicitly handle the
default zero value produced by conditional (receive_if) or non-blocking
(receive_non_blocking) receive operations when no data is available.
In XLS, when a conditional or non-blocking receive operation does not
successfully yield data (either its predicate is false, or it's non-blocking
and no data is present on the channel), the data output of the receive
operation defaults to an all-zeros value of the appropriate type. It's a
common pattern to see IR where this default zero behavior is explicitly
reinforced by a select or priority_select operation.
The pass targets two primary patterns:
- Conditional Receive (`receive_if`) Pattern: If a `receive_if(token, predicate, channel_name)` operation has its data output (`tuple_index(receive_op, 1)`) fed into a `select` operation like `select(predicate, cases=[literal(0), receive_data_output])`, this `select` is redundant. When `predicate` is false, `receive_if` already produces zero data, and the `select` chooses `literal(0)`. When `predicate` is true, `receive_if` produces the actual `receive_data_output`, and the `select` chooses `receive_data_output`. In both cases, the `select` simply passes through the effective `receive_data_output`.
- Non-Blocking Receive (`receive_non_blocking`) Pattern: A `receive_non_blocking(token, channel_name)` operation typically returns a tuple `(token, data_output, valid_bit)`. If the `data_output` (`tuple_index(receive_op, 1)`) is subsequently fed into a `select` like `select(valid_bit, cases=[literal(0), receive_data_output])`, this `select` is also redundant. When `valid_bit` is false, `receive_data_output` is implicitly zero, and the `select` chooses `literal(0)`. When `valid_bit` is true, `receive_data_output` is the actual received data, and the `select` chooses `receive_data_output`. Again, the `select` is effectively a passthrough.
The ReceiveDefaultValueSimplificationPass identifies these specific
redundant patterns and replaces the select or priority_select node
directly with the actual data output from the receive operation, thereby
eliminating unnecessary logic and simplifying the IR.
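The redundancy argument can be checked with a small Python model. `receive_if` and `sel` below are simplified models of the IR semantics described above (conditional receive defaulting to zero, two-way select), not XLS APIs:

```python
def receive_if(pred, data):
    """Model of a conditional receive: yields the channel data when the
    predicate holds, otherwise the all-zeros default value."""
    return data if pred else 0

def sel(selector, cases):
    """Model of a two-way select: cases[0] when the selector is false."""
    return cases[1] if selector else cases[0]

# The explicit select around the receive's data output is a no-op:
# when pred is false the receive already produced 0, and when pred is
# true the select passes the data through unchanged.
for pred in (False, True):
    recv_data = receive_if(pred, data=42)
    assert sel(pred, [0, recv_data]) == recv_data
print("select is redundant for both predicate values")
```

The same argument applies verbatim with `valid_bit` in place of `pred` for the non-blocking pattern.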
Example (Conditional Receive with redundant select):
// Original IR snippet
chan in(bits[32], kind=streaming, ops=receive_only, flow_control=ready_valid)
proc p(pred: bits[1], s: bits[32], init={1, 42}) {
tkn: token = literal(value=token)
// data is 0 if pred is false
receive.1: (token, bits[32]) = receive(tkn, predicate=pred, channel=in)
receive_data: bits[32] = tuple_index(receive.1, index=1)
// This select is redundant: it effectively always yields 'receive_data'
select.2: bits[32] = sel(pred, cases=[literal(0), receive_data])
next (pred, select.2)
}
After ReceiveDefaultValueSimplificationPass:
// Optimized IR snippet
chan in(bits[32], kind=streaming, ops=receive_only, flow_control=ready_valid)
proc p(pred: bits[1], s: bits[32], init={1, 42}) {
tkn: token = literal(value=token)
receive.1: (token, bits[32]) = receive(tkn, predicate=pred, channel=in)
receive_data: bits[32] = tuple_index(receive.1, index=1)
// select.2 has been removed; receive_data is used directly
next (pred, receive_data)
}
This simplification reduces the number of nodes in the IR, contributing to smaller and potentially more efficient hardware implementations.
resource_sharing - Resource Sharing
Note: What follows was generated by the Gemini LLM. Not human verified.
The ResourceSharingPass is a critical optimization pass in XLS that aims to
reduce hardware area by sharing common computational resources (e.g., adders,
multipliers) among different parts of the design. This is achieved by
identifying mutually exclusive operations—operations that are guaranteed not
to execute at the same time—and then transforming the IR to allow them to
share the same physical hardware resource through multiplexing.
The pass is configured with a ProfitabilityGuard enum, which determines the
heuristic used to select which sharing opportunities to exploit:
- `kInDegree`: Prioritizes sharing based on the in-degree of nodes in a conceptual "folding graph." This heuristic favors nodes that are potential "destinations" for folding, especially those with more distinct "source" operations that can be multiplexed into them. This approach is often effective for asymmetric folding scenarios where one complex operation can subsume several simpler ones.
- `kCliques`: Selects sharing opportunities based on identifying cliques (groups of mutually exclusive operations) in the folding graph. This approach is particularly suitable for symmetric folding scenarios where multiple operations of similar complexity can all share a single resource.
- `kRandom`: Randomly selects a subset of potential folding actions. This is typically used for testing, fuzzing, or exploration purposes rather than for deterministic optimization.
- `kAlways`: Attempts to perform all possible resource sharing that is identified as legal. This is generally used for aggressive optimization benchmarking or during specific testing phases.
The core mechanism of ResourceSharingPass involves several stages:
- Mutual Exclusion Analysis:
  * `ComputeMutualExclusionAnalysis`: This is a central function that identifies pairs or groups of operations that are mutually exclusive. It builds a "mutual exclusivity relation" between nodes, proving that two operations are guaranteed not to be active simultaneously.
  * It leverages `NodeForwardDependencyAnalysis` to trace dataflow dependencies and `BddQueryEngine` (Binary Decision Diagrams) for precise logical analysis. The `BddQueryEngine` is critical for proving that a particular operation's value does not influence another under specific conditions (e.g., when a particular arm of a `select` operation is chosen).
  * The analysis primarily focuses on `PrioritySelect` operations, as these explicitly define conditional execution paths, which naturally create clear mutual exclusivity between their cases. The analysis is conservative and avoids making assumptions through `Invoke` nodes before function inlining to ensure correctness.
- Folding Graph Construction: Based on the results of the mutual exclusivity analysis, a `FoldingGraph` is constructed. This graph represents potential "folding actions," where a "source" operation can be "folded" into a "destination" operation, implying that they can share the same underlying hardware resource. This sharing is controlled by a `select` operation whose condition is derived from the mutual exclusivity proof.
- Profitability Estimation:
  * The pass utilizes `AreaEstimator` and `DelayEstimator` tools to assess the profitability of each potential folding action. This involves a cost-benefit analysis.
  * `EstimateAreaForSelectingASingleInput`: Calculates the area overhead associated with adding multiplexers (`select` logic) required to steer the inputs to a shared resource.
  * `EstimateAreaForNegatingNode`: Estimates the area cost if an operand needs to be negated (e.g., when folding a `sub` operation into an `add` operation that operates on the negated operand).
  * The `area_saved` for a folding action is calculated as the estimated area of the "from" node minus the area overhead of the multiplexers and any additional required logic.
  * `DelayAnalysis` is used to consider the impact of sharing on the critical path delay. The `kMaxDelaySpread` constant defines an acceptable limit for the increase in delay due to multiplexing.
- Selection of Folding Actions:
  * `SelectFoldingActions`: This function, guided by the chosen `ProfitabilityGuard` heuristic, selects a subset of the legal and profitable folding actions to perform.
  * It prioritizes actions that offer significant area savings while keeping delay impacts within acceptable bounds.
  * The selected actions are sorted (e.g., by descending area savings) and then "legalized" (`LegalizeSequenceOfFolding`) to prevent overlapping or conflicting transformations from being applied.
- IR Transformation: Once a final set of `NaryFoldingAction`s is selected, the IR is transformed to implement the sharing. This involves:
  * Creating new `select` or `priority_select` operations (or chains of them) to multiplex the inputs of the shared resource based on the conditions derived from the original mutual exclusivity.
  * Replacing the original "from" operations with the output of the newly created multiplexing structure, which then feeds into the "to" (shared) operation.
  * Removing the now redundant "from" operations.
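The input-multiplexing transformation can be checked with a small Python model. `priority_sel` below is a simplified one-bit model of the IR operation, and the two functions mirror the before/after shapes of the multiplier example that follows:

```python
def priority_sel(selector, cases, default):
    """Model of a one-bit priority select: cases[0] when the selector
    bit is set, otherwise the default value."""
    return cases[0] if selector else default

def original(selector, a, b, c, d):
    # Two multipliers, only one of which is ever selected per cycle.
    mul0 = a * b
    mul1 = c * d
    return priority_sel(selector, cases=[mul0], default=mul1)

def shared(selector, a, b, c, d):
    # Multiplex the operands instead, so a single multiplier suffices.
    lhs = priority_sel(selector, cases=[a], default=c)
    rhs = priority_sel(selector, cases=[b], default=d)
    return lhs * rhs

for sel_bit in (0, 1):
    assert original(sel_bit, 3, 5, 7, 11) == shared(sel_bit, 3, 5, 7, 11)
print("shared multiplier matches the original for both selector values")
```

The trade is two wide multipliers plus one result mux for one multiplier plus two operand muxes, which is profitable whenever the multiplier's area dominates the mux overhead.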
Example:
Consider two umul operations, mul0(a, b) and mul1(c, d), which are
mutually exclusive (e.g., they are in different arms of a priority_select).
The ResourceSharingPass might identify this and transform the IR to use a
single shared umul hardware resource:
// Original IR snippet (simplified)
selector: bits[1] = ...
mul0: bits[32] = umul(a, b, width=32)
mul1: bits[32] = umul(c, d, width=32)
ret result: bits[32] = priority_sel(selector, cases=[mul0], default=mul1)
After ResourceSharingPass (conceptual representation, as actual IR is more
complex with explicit multiplexing):
// Optimized IR snippet (simplified)
selector: bits[1] = ...
// Multiplexed inputs to the shared multiplier
input_a_mux: bits[32] = priority_sel(selector, cases=[a], default=c)
input_b_mux: bits[32] = priority_sel(selector, cases=[b], default=d)
// Shared multiplier operation
umul_shared: bits[32] = umul(input_a_mux, input_b_mux, width=32)
ret result: bits[32] = umul_shared // The shared multiplier is now the result
This pass is vital for achieving high-quality hardware designs by efficiently utilizing available hardware resources, particularly in designs with significant conditional logic.
scheduling-opt - scheduling opt passes
Passes performed immediately before starting scheduling.
These optimizations are used to fix up any changes the mutual-exclusion and proc-state-legalization passes make before the initial scheduling.
TODO(allight): We might wish to move these into opt only. sched running optimization passes is an artifact of how these tools evolved which no longer makes too much sense. To do this will require making a decision on how to handle 'mutual-exclusion opt' however.
Options Set
Min opt level: 1
Invoked Passes
select_lifting - Select Lifting
Pass which replaces the pattern `v = sel(c, array[i], array[j], ...)`, where all cases of the select reference the same array, with the code `z = sel(c, i, j, ...); v = array[z]`.
Note: What follows was generated by the Gemini LLM. Not human verified.
The SelectLiftingPass is an optimization pass in XLS that restructures
select and priority_select operations to improve hardware efficiency.
It identifies scenarios where a select's cases are all derived from a
common "shared" operation (such as an array_index or a binary arithmetic
operation) and "lifts" that shared operation out of the select. This
transformation effectively moves the multiplexing (the select) from the
output of the shared operation to its input, which often leads to
significant hardware savings and can sometimes improve performance.
This pass is particularly effective in two main scenarios:
- Lifting `ArrayIndex` Operations: When all cases of a `select` are `array_index` operations on the same array, the pass can lift the `array_index` out of the `select`. It creates a new `select` that chooses between the indices of the original `array_index` operations. The result of this new `select` (the chosen index) is then used in a single, shared `array_index` operation. This replaces multiple array reads followed by a wide data multiplexer with a smaller index multiplexer followed by a single array read, which is generally much more area-efficient.

  ```
  // Original IR snippet
  v = sel(c, array[i], array[j])
  // Optimized IR (after lifting)
  z = sel(c, i, j)
  v = array[z]
  ```
- Lifting Binary Operations: When all cases of a `select` are binary operations of the same kind (e.g., all `add`s, `sub`s, or `mul`s) and share one common operand, the binary operation can be lifted out of the `select`. A new `select` is created to choose between the "other" (non-shared) operands. The result of this new `select` is then used in a single, shared binary operation with the common operand.

  ```
  // Original IR snippet
  v = sel(c, x + a, x + b)
  // Optimized IR (after lifting)
  z = sel(c, a, b)
  v = x + z
  ```
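The equivalence behind the array-index case can be demonstrated with a small Python model. `sel` below is a simplified model of a select with an integer selector, not an XLS API:

```python
def sel(c, cases):
    """Model of a select with an integer selector."""
    return cases[c]

def lifted_array_index(c, array, indices):
    """Lifted form: mux the (narrow) indices, then do one array read."""
    return array[sel(c, indices)]

array = [10, 20, 30, 40]
i, j = 1, 3
for c in (0, 1):
    # Original form: read the array twice, then mux the wide data values.
    original = sel(c, [array[i], array[j]])
    assert lifted_array_index(c, array, [i, j]) == original
print("lifted select matches the original form")
```

The hardware win comes from the mux operating on index-width values instead of data-width values, and from needing only one array read port.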
Profitability and Applicability Guards:
The pass includes several "guards" to ensure that the transformation is both applicable and profitable:
- `CanLiftSelect`: This function checks if a `select` is a candidate for lifting. For `ArrayIndex` lifting, it verifies that all cases are indeed `array_index` operations on the same array and that their index types are compatible. For binary operation lifting, it identifies a common `shared_node` across all cases and ensures the operation type is consistent and supported (e.g., `add`, `sub`, `and`, `or`, `mul`, shifts). It also handles "identity" cases, where one of the `select`'s cases is the shared operand itself.
- `ShouldLiftSelect`: This function determines if a liftable `select` should actually be transformed, based on profitability heuristics:
  * Area Estimation: For binary operations, it performs a simple cost-benefit analysis based on the bit-widths of the original and transformed operations. The transformation is generally considered profitable if the total bit-width of the new, narrower `select` and the shared operation is less than or equal to the total bit-width of the original `select` and all the individual binary operations (if they were single-use).
  * Delay Estimation: If a `delay_model` is provided, the pass uses `CriticalPathDelayAnalysis` to estimate the impact of the transformation on the critical path. Lifting is avoided if it is predicted to increase the latency. This is particularly important for scenarios like converting multiple constant-shifts into a single variable-shift, which is often more expensive in hardware.
  * Heuristics for High-Latency Operations: For high-latency operations like multiplication, lifting is more cautiously applied, especially if it involves identity cases, to avoid potentially degrading performance.
How it Works:
- Worklist-based Iteration: The pass maintains a worklist of `select` and `priority_select` nodes that are candidates for optimization. It iterates through this worklist, applying transformations until no more profitable lifting opportunities can be found.
- Applicability and Profitability Checks: For each `select` in the worklist, it calls `CanLiftSelect` and then `ShouldLiftSelect` to determine if a valid and beneficial transformation exists.
- IR Transformation: If a transformation is approved:
  * `LiftSelectForArrayIndex`: Rewires the IR for `array_index` lifting by creating a new `select` for the indices and a new shared `array_index`.
  * `LiftSelectForBinaryOperation`: Rewires the IR for binary operation lifting by creating a new `select` for the non-shared operands and a new shared binary operation. It correctly handles identity cases by inserting the appropriate identity literal (e.g., `0` for `add`, `-1` for `and`) into the new `select`.
- Cleanup: The original `select` node, now redundant, is marked for deletion, which is typically handled by a subsequent `DeadCodeEliminationPass`. Any new `select` nodes created during the transformation are added to the worklist, allowing for potential further optimizations.
By systematically applying these transformations, SelectLiftingPass plays a
crucial role in optimizing multiplexer-heavy designs, leading to more
efficient use of hardware resources and improved overall performance.
select_merge - Select Merging
Note: What follows was generated by the Gemini LLM. Not human verified.
The SelectMergingPass is an optimization pass in XLS that focuses on
merging consecutive select-like operations (PrioritySelect and
OneHotSelect) into a single, larger select operation. This transformation
can be highly beneficial for hardware synthesis, as it often allows the
synthesis tool to infer more efficient and compact multiplexer structures
from a single, unified select rather than from a cascade of smaller ones.
The core principle of this pass is to identify patterns where the output of
one select operation is used as an input (a case or a default value) to
another select operation. When such a pattern is found, and certain
conditions are met (primarily that the inner select has only one user),
the pass merges the two selects into a single, equivalent select.
How it Works:
The pass operates by traversing the IR and applying merging logic to each
PrioritySelect and OneHotSelect node it encounters. The transformation
logic is tailored to the specific type of select operation:
- Merging `OneHotSelect` Operations:
  * Pattern: `one_hot_sel(selector_A, cases=[..., one_hot_sel(selector_B, cases=[...]), ...])`
  * Transformation: The pass constructs a new, larger `one_hot_sel`.
    * New Selector: The new selector is a `concat` of several components:
      * The original selector bits of `selector_A` that did not correspond to the inner `one_hot_sel`.
      * A new set of selector bits created by performing a bitwise `AND` of `selector_B` with the bit from `selector_A` that originally selected the inner `one_hot_sel`. This effectively qualifies the inner selector with the outer selection condition.
    * New Cases: The new set of cases is a flattened list comprising the original cases of both the outer and inner `one_hot_sel`s, in the correct order.
- Merging `PrioritySelect` Operations:
  * Pattern: `priority_sel(selector_A, cases=[..., priority_sel(selector_B, ...), ...], default=...)`
  * Transformation: Similar to `OneHotSelect`, a new, larger `priority_sel` is constructed.
    * New Selector: The new selector is also a `concat`, but the logic for combining the selectors is more complex to correctly preserve the priority encoding. It involves `AND`ing the inner selector bits with the outer selector bits and also considering the conditions under which the inner `priority_sel` would fall through to its default value.
    * New Cases and Default: The new cases are a flattened list of the original cases. The new default value is typically the default value of the inner `priority_sel` if it was part of the default path of the outer `priority_sel`.
    * Selector Swapping for Single-Bit `PrioritySelect`s: A special heuristic is applied for single-bit `priority_sel`s. If the default value is the one that is a `select`, the pass can effectively negate the selector and swap the case and default values to expose a more direct merging opportunity.
Key Conditions for Merging:
- Single Use: The inner `select` operation must have only one user (the outer `select`). This is a critical constraint that prevents the pass from increasing the amount of logic by duplicating the inner `select`'s computation. If the inner `select` were used elsewhere, merging it would require either duplicating it or introducing complex logic, which would likely be counterproductive.
- Matching `select` Types: The pass is designed to merge `select`s of the same kind (e.g., `OneHotSelect` into `OneHotSelect`, `PrioritySelect` into `PrioritySelect`).
Benefits:
- Improved Hardware Synthesis: Synthesis tools are often better at optimizing a single, larger multiplexer structure than a cascade of smaller ones. Merging `select`s can lead to more efficient hardware with reduced area and potentially better timing.
- IR Simplification: Reduces the depth of the IR graph by collapsing chained `select` operations, making the IR easier for subsequent passes to analyze.
Example (OneHotSelect merging):
Consider a scenario where one one_hot_sel is used as a case in another:
// Original IR snippet
fn f(p0: bits[2], p1: bits[2], x: bits[32], y: bits[32],
z: bits[32]) -> bits[32] {
one_hot_sel.1: bits[32] = one_hot_sel(p0, cases=[x, y])
ret one_hot_sel.2: bits[32] = one_hot_sel(p1, cases=[one_hot_sel.1, z])
}
If one_hot_sel.1 has only one user (one_hot_sel.2), the
SelectMergingPass can merge them. Let's say p1 selects one_hot_sel.1
with its 0-th bit.
// Optimized IR (simplified)
fn f(p0: bits[2], p1: bits[2], x: bits[32], y: bits[32],
z: bits[32]) -> bits[32] {
// New selector combines p0 and p1
new_selector: bits[3] = concat(
bit_slice(p1, start=1, width=1), // Original bit 1 of p1
and(bit_slice(p1, start=0, width=1), bit_slice(p0, start=1, width=1)),
// p1[0] AND p0[1]
and(bit_slice(p1, start=0, width=1), bit_slice(p0, start=0, width=1))
// p1[0] AND p0[0]
)
ret one_hot_sel.merged: bits[32] = one_hot_sel(new_selector,
cases=[x, y, z])
}
This transformation creates a single one_hot_sel that can be more
efficiently synthesized, as it avoids the intermediate multiplexer
represented by one_hot_sel.1.
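The selector arithmetic in this example can be checked with a small Python model. `one_hot_sel` below is a simplified model (OR together every case whose selector bit is set; with a one-hot selector this picks exactly one case), and the merged selector follows the bit layout shown in the optimized IR above:

```python
def one_hot_sel(selector, cases):
    """Model of one_hot_sel: OR of the cases whose selector bit is set."""
    result = 0
    for i, case in enumerate(cases):
        if (selector >> i) & 1:
            result |= case
    return result

def original(p0, p1, x, y, z):
    inner = one_hot_sel(p0, [x, y])
    return one_hot_sel(p1, [inner, z])

def merged(p0, p1, x, y, z):
    # Qualify the inner selector bits (p0) with the outer bit (p1[0])
    # that selected the inner one_hot_sel; p1[1] keeps selecting z.
    bit = lambda v, i: (v >> i) & 1
    new_selector = ((bit(p1, 0) & bit(p0, 0))
                    | ((bit(p1, 0) & bit(p0, 1)) << 1)
                    | (bit(p1, 1) << 2))
    return one_hot_sel(new_selector, [x, y, z])

# Check equivalence over all one-hot selector combinations.
for p0 in (0b01, 0b10):
    for p1 in (0b01, 0b10):
        assert original(p0, p1, 11, 22, 33) == merged(p0, p1, 11, 22, 33)
print("merged one_hot_sel matches the cascade")
```

The model only exercises one-hot selector values, which is the precondition under which `one_hot_sel` semantics (and therefore the merge) are defined.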
select_range_simp - Select Range Simplification
Pass which simplifies selects and one-hot-selects. Example optimizations include removing dead arms and eliminating selects with constant selectors. Uses range analysis to determine possible values.
select_simp - Select Simplification
Pass which simplifies selects and one-hot-selects. Example optimizations include removing dead arms and eliminating selects with constant selectors. Uses ternary analysis to determine possible values.
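The constant-selector case mentioned above can be sketched in Python. This is a hypothetical illustration of the simplification rule (a select whose selector is a known constant collapses to the single reachable case), not the XLS pass's API:

```python
def simplify_constant_selector(selector, cases, default=None):
    """Sketch: with a constant selector, a select collapses to the one
    reachable case, or to the default when the selector is out of range."""
    if selector < len(cases):
        return cases[selector]
    return default

# sel(2, cases=[a, b, c]) with a constant selector is just c; the other
# arms are dead and can be removed.
assert simplify_constant_selector(2, ["a", "b", "c"]) == "c"
# An out-of-range constant selector yields the default value.
assert simplify_constant_selector(5, ["a", "b", "c"], default="d") == "d"
print("constant-selector selects collapse to a single case")
```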
simp - Simplification
Standard simplification pipeline.
This is run a large number of times and avoids many time-consuming analyses.
Invoked Passes
- ident_remove
- const_fold
- dce
- canon
- dce
- basic_simp
- dce
- arith_simp
- dce
- comparison_simp
- dce
- table_switch
- dce
- recv_default
- dce
- select_simp
- dce
- dataflow
- dce
- reassociation
- dce
- const_fold
- dce
- narrow(Ternary)
- dce
- bitslice_simp
- dce
- concat_simp
- dce
- array_untuple
- dce
- dataflow
- dce
- strength_red
- dce
- array_simp
- dce
- cse
- dce
- basic_simp
- dce
- arith_simp
- dce
- narrow(Ternary)
- dce
- bool_simp
- dce
- token_simp
- dce
simp(2) - Max-2 Simplification
Standard simplification pipeline.
Opt level is capped at 2
Options Set
Cap opt level: 2
Invoked Passes
simp(3) - Max-3 Simplification
Standard simplification pipeline.
Opt level is capped at 3
Options Set
Cap opt level: 3
Invoked Passes
simp(>=1,<=2) - Min-1 Max-2 Simplification
Standard simplification pipeline.
Opt level is capped at 2 and skipped if less than 1
Options Set
Min opt level: 1
Cap opt level: 2
Invoked Passes
simplify-and-inline - Iteratively inline and simplify
Inlining segment of the pipeline
This is performed to fixedpoint and each run a single layer of the function hierarchy is inlined away.
Options Set
Run to a fixedpoint.
Invoked Passes
sparsify_select - Sparsify Select
The SparsifySelectPass is a type of range analysis-informed dead code elimination that removes cases from selects when range analysis proves that they can never occur. It does this by splitting a select into many selects, each of which covers a single interval from the selector interval set.
Note: What follows was generated by the Gemini LLM. Not human verified.
The SparsifySelectPass is an optimization pass that utilizes range analysis
to perform a specialized form of dead code elimination on select
operations. Its core function is to identify and remove or simplify cases
within a select instruction that are provably unreachable based on the
possible range of values its selector can take. This leads to reduced
hardware complexity and improved efficiency.
Here's a detailed breakdown of its operation:
- Range Analysis: The pass begins by employing a `PartialInfoQueryEngine` to perform a precise range analysis on the `selector` operand of each `select` operation. This analysis determines the `IntervalSet` – the complete set of possible values – that the `selector` can attain during execution.
- Identification of Dead Cases: If the `IntervalSet` of the `selector` reveals that it can only take on a subset of all possible values (i.e., the size of the `IntervalSet` is less than the total number of cases in the `select`), then certain cases of the `select` are deemed unreachable or "dead code."
- Transformation into a Chain of Selects: The primary transformation involves replacing the original `select` operation with a new, potentially nested, structure composed of multiple `select` operations. Each new `select` in this chain is meticulously designed to cover a specific `Interval` from the `selector_intervals`.
  * The helper function `IntervalsSortedBySize` sorts the reachable intervals by their size (number of points covered), typically processing smaller intervals first. This heuristic aims to produce a more compact or efficient `select` decision tree.
  * For each `Interval`, a new `select` is constructed. This `select` first determines if the value of the original `selector` falls within the bounds of the current `Interval` using comparison operations.
  * If the `selector` is found to be within the `Interval`, a nested `select` (or a direct reference to the corresponding case value if the interval is a single precise value) is used to choose among the relevant cases of the original `select` that correspond to values within this `Interval`.
  * If the `selector` is not within the current `Interval`, control flows to the "next" `select` in the chain. If it's the last interval being considered, it defaults to a zero value (or the original default value of the `select`, if one exists).

  The net result is a decision tree that efficiently evaluates only the cases that are genuinely reachable given the `selector`'s known range.
Example:
Consider a `select` operation where the selector `add.3` is a `bits[4]` type. Although `bits[4]` can represent 16 values (0-15), range analysis might prove that `add.3` can only ever take on values within the interval [8, 11]. The original `select` has 16 cases, one for each possible value from 0 to 15.
// Original IR snippet
fn func(x: bits[2]) -> bits[4] {
zero_ext.1: bits[4] = zero_ext(x, new_bit_count=4)
literal.2: bits[4] = literal(value=8)
// This is the selector for sel.20
add.3: bits[4] = add(zero_ext.1, literal.2)
// ... many literals for the 16 cases (literal.4 through literal.19)
ret sel.20: bits[4] = sel(add.3, cases=[
literal.4, literal.5, literal.6, literal.7, literal.8, literal.9,
literal.10, literal.11, literal.12, literal.13, literal.14,
literal.15, literal.16, literal.17, literal.18, literal.19
])
}
The SparsifySelectPass will recognize that only selector values 8, 9, 10,
and 11 are possible. It will then rewrite sel.20 into a nested structure
that effectively covers only these reachable cases:
// Optimized IR snippet (simplified conceptual representation)
fn func(x: bits[2]) -> bits[4] {
// ... zero_ext.1, literal.2, add.3 remain as before
selector_val: bits[4] = add.3 // The original selector value
ret main_select: bits[4] = sel(
// Check if selector_val is in [8, 11]
and(uge(selector_val, literal(8)), ule(selector_val, literal(11))),
cases=[
// If in range [8, 11], dispatch to a select that handles these
// specific values
sel(sub(selector_val, literal(8)), cases=[
literal(8), // for selector_val == 8
literal(9), // for selector_val == 9
literal(10), // for selector_val == 10
literal(11) // for selector_val == 11
]),
// If not in range, default to 0 (or the original select's default
// if it had one)
literal(0)
]
)
}
This transformation significantly reduces the logic required in the hardware
for the select operation by eliminating the evaluation of impossible
conditions, thereby contributing to smaller and faster hardware designs.
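The semantics of this rewrite can be checked outside the IR. The sketch below is illustrative Python (not XLS code, and not the pass's implementation): it models the 16-case select from the example, with a selector proven to lie in [8, 11], and verifies that the interval-restricted form agrees with the full select over the reachable range:

```python
def full_select(selector, cases, default=0):
    """The original select: one case per possible selector value."""
    return cases[selector] if selector < len(cases) else default

def sparsified_select(selector, cases, lo, hi, default=0):
    """Interval-restricted form produced by the pass (conceptually):
    only the cases inside the proven interval [lo, hi] are kept."""
    if lo <= selector <= hi:
        sub_cases = cases[lo:hi + 1]     # only the reachable cases
        return sub_cases[selector - lo]  # sub(selector, lo) indexes the inner select
    return default                       # provably unreachable: default to zero

# 16 cases as in the example; range analysis proved the selector lies in [8, 11].
cases = list(range(16))
for x in range(4):        # x: bits[2], so selector = x + 8 is always in [8, 11]
    selector = x + 8
    assert full_select(selector, cases) == sparsified_select(selector, cases, 8, 11)
```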
strength_red - Strength Reduction
Replaces operations with equivalent cheaper operations. For example, multiply by a power-of-two constant may be replaced with a shift left.
Note: What follows was generated by the Gemini LLM. Not human verified.
The StrengthReductionPass is a crucial optimization pass in XLS that aims
to improve hardware efficiency by replacing computationally expensive
operations with equivalent but cheaper alternatives. This transformation is
fundamental for reducing the area, delay, and power consumption of the
generated hardware. The pass intelligently identifies patterns where a
complex operation can be implemented using simpler primitives, often
leveraging constant operands or known bit patterns.
The pass operates by traversing the IR and applying a variety of strength
reduction techniques. It utilizes a UnionQueryEngine (combining
StatelessQueryEngine and LazyTernaryQueryEngine) to analyze the
properties of nodes, such as whether a value is fully known (constant), its
leading/trailing zeros, or known sign bits.
Key strength reductions performed by this pass include:
- Replacement of Known Values with Literals: If the `QueryEngine` can prove that a node (especially a non-literal operation) always evaluates to a constant value, it is replaced with a `literal` node. This is a general simplification that can occur across many operation types, reducing dynamic computation to a static value.
- Sinking Operations into `Select`s: If an operation has one operand that is a `select` (or `priority_select`) with all its cases being known constants, and the operation's other operands are also known constants, the operation can be "sunk" into the `select`. The operation is duplicated for each case of the `select`, performed on the constant values, and a new `select` is formed with the results. This often enables further constant folding or narrowing of the operations.

  // Original IR
  v1 := Select(selector, [const1, const2])
  v2 := Add(v1, const3)
  // Optimized IR (conceptually)
  v1_c1 := Add(const1, const3)
  v1_c2 := Add(const2, const3)
  v2 := Select(selector, [v1_c1, v1_c2])

- Additions (`add`):
  * Addition with Zero: `add(x, 0)` is replaced directly by `x`.
  * Addition to OR (No Carry): If `add(A, B)` can produce no carries between any bit positions (i.e., for every bit position `i`, ternary analysis shows that `A[i]` and `B[i]` cannot both be `1` simultaneously), the `add` can be replaced by a bitwise `or` operation, which is typically cheaper.
  * Splitting Adders (No Carry Propagation): At higher optimization levels (`SplitsEnabled`), if the `QueryEngine` can identify a bit position in `add(A, B)` across which a carry provably cannot propagate (e.g., if the sum bit `A[i] | B[i]` is always 0), the adder can be split into two smaller adders: the lower-order and higher-order bits are added independently and the results concatenated. This can reduce the critical path delay.
- Subtractions (`sub`):
  * Subtraction by Zero: `sub(x, 0)` is replaced directly by `x`.
  * Subtracting from All-Ones: `sub(all_ones, x)` is replaced by `not(x)`.
  * Splitting Subtractors (No Borrow Propagation): Similar to adders, at higher optimization levels (`SplitsEnabled`), if a bit position in `sub(A, B)` can be identified across which a borrow provably cannot propagate (e.g., if `A[i]` is always 1 and `B[i]` is always 0), the subtractor can be split into two smaller subtractors, with the results concatenated.
- Bitwise AND with Mask (`and`): `and(x, mask)` where `mask` is a constant with a single contiguous run of set bits (e.g., `0b011100`) can be strength-reduced into a `concat` of zeros, a `bit_slice` of `x` (extracting the bits corresponding to the set bits in the mask), and more zeros. This avoids a full bitwise AND for simple masking operations.
- `Select` to `SignExt` Transformation: A `select` operation that effectively acts as a sign extension (e.g., if the selector is 0 it picks 0, if 1 it picks all ones, extended to a wider width) can be replaced by a `sign_ext` operation. This is common when a 1-bit predicate controls the sign extension of a value.
- `SignExt` to `ZeroExt` Transformation: If the MSB of `x` in `sign_ext(x, ...)` is known to be `0` (as determined by the `QueryEngine`), the `sign_ext` can be replaced with a `zero_ext`, which is generally simpler to implement in hardware.
- `Gate` Operations:
  * If the `condition` of a `gate(cond, data)` is a known `0`, the `gate` is replaced with a `literal` `0`.
  * If the `condition` is a known `1`, the `gate` is replaced with its `data` operand.
  * If the `data` operand is a known `0`, the `gate` is replaced with a `literal` `0` (regardless of the condition).
- Single-bit `Add` and `Ne` to `Xor`: For 1-bit operands, `add(x, y)` and `ne(x, y)` are equivalent to `xor(x, y)` and are transformed accordingly, using the typically cheaper `xor` operation.
- Comparisons against Powers of Two (`uge`, `ult`): Comparisons like `x >= 2^K` or `x < 2^K`, where `2^K` is a power-of-two constant, can be simplified to a comparison of a `bit_slice` of `x` (extracting the relevant leading bits) against `0`.
- `Eq(x, 0)` for `bits[2]` to `Not(Or(bit_slice(x, 0, 1), bit_slice(x, 1, 1)))`: This specifically optimizes equality checks against zero for 2-bit values into a more fundamental Boolean logic expression.
- Arithmetic Operations with a Single Unknown Bit: For expensive arithmetic operations (`smul`, `umul`, `sdiv`, `udiv`, `smod`, `umod`) where one operand is a constant and the other has only one unknown bit (all other bits are known constants), the operation is transformed into a `select` operation. The `select`'s selector is the single unknown bit, and its cases are the two possible results of the arithmetic operation when that unknown bit is 0 or 1. This replaces a complex arithmetic unit with a simple multiplexer between two pre-computed literal results.
The pass is run repeatedly until a fixed point is reached, allowing these local transformations to propagate and enable further simplifications across the entire IR.
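Several of these identities are easy to validate numerically. The sketch below is illustrative Python (not the pass's implementation) that checks three of them on 8-bit values: add-to-OR when no bit position can carry, subtract-from-all-ones as bitwise NOT, and a constant multiply with a single unknown bit rewritten as a two-way select. The bit position `K`, the known bits, and the constant are arbitrary values chosen for the demonstration:

```python
MASK = 0xFF  # model bits[8] arithmetic

# 1. If A and B share no set bits, no carry can occur: add(A, B) == or(A, B).
for a in range(256):
    for b in range(256):
        if a & b == 0:
            assert (a + b) & MASK == a | b

# 2. Subtracting from all-ones is bitwise NOT: sub(all_ones, x) == not(x).
for x in range(256):
    assert (0xFF - x) & MASK == ~x & MASK

# 3. umul with a single unknown bit K: precompute both outcomes and select.
K = 3                    # hypothetical position of the single unknown bit
known = 0b10100001       # all other bits are known constants
const = 13               # the constant operand
case0 = ((known & ~(1 << K)) * const) & MASK  # result if the unknown bit is 0
case1 = ((known | (1 << K)) * const) & MASK   # result if the unknown bit is 1
for bit in (0, 1):
    x = (known & ~(1 << K)) | (bit << K)
    assert (x * const) & MASK == (case0, case1)[bit]
```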
table_switch - Table switch conversion
TableSwitchPass converts chains of Select nodes into ArrayIndex ops. These chains have the form:
  sel.(N)(eq.X, literal.A, literal.B)
  sel.(N+1)(eq.Y, sel.(N), literal.C)
  sel.(N+2)(eq.Z, sel.(N+1), literal.D)
And so on. In these chains, eq.X, eq.Y, and eq.Z must all be comparisons of the same value against different literals.
Current limitations:
- Either the start or end index in the chain must be 0.
- The increment between indices must be positive or negative 1.
- There can be no "gaps" between indices.
- The Select ops have to be binary (i.e., selecting between only two cases).
Note: What follows was generated by the Gemini LLM. Not human verified.
The TableSwitchPass is an optimization pass in XLS that transforms specific
chains of select (or priority_select) nodes into more efficient
array_index operations. This is a crucial optimization for hardware
synthesis, as it converts a series of conditional branches into a direct
table lookup mechanism, which can be implemented more compactly and with
lower delay in hardware, especially when the conditions are based on
comparing a single value against a set of constants.
The pass specifically targets select chains that conform to the following
pattern:
- Chained `Select` or `PrioritySelect` Operations: The pass identifies sequences of `select` or `priority_select` nodes where:
  * The selector of each `select` is a comparison (e.g., `eq` or `ne`) of a common "index" value against a distinct literal "key."
  * The result of one `select` operation feeds into a case (or the default value) of the next `select` operation in the chain, creating a cascaded lookup structure.
  * The "value" chosen by each matching comparison in the chain is a literal.
The pass then conceptually constructs a lookup table (represented as an array
of literal values) where the common "index" value from the select chain is
used to directly access an element in this array, and the elements of the
array are the "value" literals from the original select chain.
How it Works:
- `MatchLinks`: This is the core pattern-matching function. It recursively traverses the `select` chain, extracting essential information about each "link" (each `select` node) in the chain. This information includes the common `index` node, the literal `key` it's compared against, the `value` literal chosen if the comparison is true, and the `next` node in the chain if the comparison is false. It also handles `PrioritySelect` operations where the selector is a `concat` of single-bit selectors, identifying individual bits of the selector as corresponding to specific cases.
- `LinksToTable`: Once a complete chain of `Link`s is identified, this function attempts to convert it into a `Value` that represents a literal array. It maps the `key`s from the `Link`s to corresponding indices in this array, with the associated `value` literals becoming the array elements. The function handles situations where the index space is not fully covered by explicit `Link`s, filling in the gaps with the "else" value (the default or fall-through value from the original chain).
- Replacement: If `LinksToTable` successfully generates a valid literal array, the entire `select` chain (or the part of a `priority_select` chain that can be converted) is replaced by an `array_index` operation. The index for this `array_index` is the common `index` node from the original `select` chain, and the array being indexed is the newly created literal array.
Current Limitations (as described in the code):
- Either the start or end index in the chain must be 0.
- The increment between indices must be positive or negative 1.
- There can be no "gaps" between indices.
- The `Select` ops must be binary (i.e., selecting between only two cases).
Example (simplification of a binary tree of sel operations performing a lookup):
// Original IR snippet
fn main(index: bits[32]) -> bits[32] {
literal.0: bits[32] = literal(value=0)
literal.1: bits[32] = literal(value=1)
literal.2: bits[32] = literal(value=2)
literal.3: bits[32] = literal(value=3)
literal.4: bits[32] = literal(value=4)
eq.10: bits[1] = eq(index, literal.0)
eq.11: bits[1] = eq(index, literal.1)
eq.12: bits[1] = eq(index, literal.2)
eq.13: bits[1] = eq(index, literal.3)
sel.20: bits[32] = sel(eq.10, cases=[literal.0, literal.1])
sel.21: bits[32] = sel(eq.11, cases=[sel.20, literal.2])
sel.22: bits[32] = sel(eq.12, cases=[sel.21, literal.3])
ret sel.23: bits[32] = sel(eq.13, cases=[sel.22, literal.4])
}
The TableSwitchPass would recognize this pattern and convert it into
(conceptually):
// Optimized IR (simplified)
fn main(index: bits[32]) -> bits[32] {
// ...
// Constructed array of values from the selects; 0 is assumed to be the
// default/else value
array_literal: bits[32][5] = literal(value={1, 2, 3, 4, 0})
ret array_index.new: bits[32] = array_index(array_literal, indices=[index])
}
This transformation significantly simplifies the hardware required for such lookup-like structures by replacing complex multiplexing logic with a direct memory access or a simpler decoder for the array index, leading to improved hardware QoR.
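The effect of the rewrite can be modeled directly: the cascaded chain of equality-driven selects computes the same function as indexing the precomputed table. The sketch below is illustrative Python mirroring the example above, where index values 0-3 map to 1-4 and everything else falls through to 0; it assumes the clamping out-of-bounds semantics of `array_index` (an out-of-range index reads the last element):

```python
# The cascaded selects from the original IR, written as Python conditionals.
def select_chain(index):
    result = 0                             # literal.0: the fall-through value
    result = 1 if index == 0 else result   # sel.20: eq(index, 0)
    result = 2 if index == 1 else result   # sel.21: eq(index, 1)
    result = 3 if index == 2 else result   # sel.22: eq(index, 2)
    result = 4 if index == 3 else result   # sel.23: eq(index, 3)
    return result

# The table the pass constructs from the chain's keys and values.
TABLE = [1, 2, 3, 4, 0]

def table_switch(index):
    # Model array_index clamping an out-of-bounds index to the last element,
    # which is why the "else" value is placed at the end of the table.
    return TABLE[min(index, len(TABLE) - 1)]

for i in range(10):
    assert select_chain(i) == table_switch(i)
```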
token_dependency - Convert data dependencies between effectful operations into token dependencies
Pass which turns data dependencies between certain effectful operations into token dependencies. In particular, transitive data dependencies between receives and other effectful ops are turned into token dependencies whenever no such token dependency already exists.
Note: What follows was generated by the Gemini LLM. Not human verified.
For example, consider a scenario where a send operation uses data produced
by a receive operation, but, due to other logic between the send and the
receive, the data produced by the receive is not observable by the send. Other
passes would then be free to remove the send's data dependency (since the
data from the receive is irrelevant). This would leave the send and receive
disconnected, so, without an explicit token dependency, the scheduler might
reorder them, leading to incorrect behavior.
chan test_channel(
bits[32], id=0, kind=streaming, ops=send_receive,
flow_control=ready_valid)
top proc main(__state: (), init={()}) {
__token: token = literal(value=token, id=1000)
receive.1: (token, bits[32]) = receive(__token, channel=test_channel)
tuple_index.2: token = tuple_index(receive.1, index=0)
tuple_index.3: bits[32] = tuple_index(receive.1, index=1)
send.4: token = send(__token, tuple_index.3, channel=test_channel)
after_all.5: token = after_all(send.4, tuple_index.2)
tuple.6: () = tuple()
next (tuple.6)
}
In this example, send.4 uses tuple_index.3 which comes from receive.1.
The pass will identify this data dependency and add tuple_index.2 (the
token from receive.1) as an operand to an after_all operation that also
consumes the token from send.4, thus enforcing the order:
after_all.5: token = after_all(send.4, after_all_new.X)
// where after_all_new.X is effectively the token from receive.1
This ensures that receive.1 completes before send.4 can proceed.
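At its core, the analysis is a reachability question over two graphs: if a receive reaches an effectful node through data edges but not through token edges, a token edge must be added. The sketch below is a hypothetical Python model using plain adjacency sets (not XLS data structures); the node names mirror the example above:

```python
def reachable(edges, start):
    """All nodes reachable from start via the given adjacency map."""
    seen, stack = set(), [start]
    while stack:
        n = stack.pop()
        for m in edges.get(n, ()):
            if m not in seen:
                seen.add(m)
                stack.append(m)
    return seen

def missing_token_deps(data_edges, token_edges, receives, effectful):
    """(receive, op) pairs connected by data but not by tokens."""
    out = []
    for r in receives:
        data_reach = reachable(data_edges, r)
        token_reach = reachable(token_edges, r)
        for op in effectful:
            if op in data_reach and op not in token_reach:
                out.append((r, op))
    return out

# Mirrors the example: send.4 consumes receive.1's data but not its token.
data_edges = {"receive.1": {"tuple_index.3"}, "tuple_index.3": {"send.4"}}
token_edges = {"receive.1": {"tuple_index.2"}, "tuple_index.2": {"after_all.5"},
               "send.4": {"after_all.5"}}
assert missing_token_deps(data_edges, token_edges,
                          ["receive.1"], ["send.4"]) == [("receive.1", "send.4")]
```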
token_simp - Simplify token networks
Pass that simplifies token networks. For example, if an AfterAll node has operands where one operand is an ancestor of another in the token graph, then the ancestor can be omitted. Similarly, duplicate operands can be removed and AfterAlls with one operand can be replaced with their operand.
Note: What follows was generated by the Gemini LLM. Not human verified.
The TokenSimplificationPass is a crucial optimization pass in XLS that
focuses on simplifying the token network within a function or proc. Token
networks are essential for correctly enforcing ordering constraints between
side-effecting operations (such as send and receive) in a dataflow graph.
This pass aims to reduce the complexity and redundancy in these networks,
leading to more compact and efficient control logic in the generated
hardware.
The pass performs several types of simplifications on token-typed
operations, primarily after_all and min_delay:
- Simplifying `after_all` Operations:
  * Single Operand `after_all`: An `after_all` operation with only one operand (e.g., `after_all(tok)`) is redundant and is replaced directly by its single operand `tok`.

    ```
    // Original IR
    after_all.1: token = after_all(tok)
    next (after_all.1, ...)
    // Optimized IR
    next (tok, ...)
    ```

  * Duplicate Operands in `after_all`: If an `after_all` operation has multiple identical operands (e.g., `after_all(tok, tok, tok)`), the duplicates are removed, leaving only one instance of each unique token.

    ```
    // Original IR
    after_all.1: token = after_all(tok, tok, tok)
    next (after_all.1, ...)
    // Optimized IR
    next (tok, ...)
    ```

  * Nested `after_all` Operations: If an `after_all` operation has another `after_all` as an operand, and all users of the inner `after_all` are also `after_all` operations, they can be collapsed. This effectively flattens the token dependency chain.

    ```
    // Original IR
    after_all.1: token = after_all(tok_a, tok_b)
    after_all.2: token = after_all(after_all.1, tok_c)
    next (after_all.2, ...)
    // Optimized IR
    after_all.new: token = after_all(tok_a, tok_b, tok_c)
    next (after_all.new, ...)
    ```

  * Replacing Overlapping `after_all` Operands: If an `after_all` operation has multiple operands where one operand is an ancestor (directly or indirectly) of another in the token graph, the ancestor token can be removed from the `after_all`'s operand list, because the dependency is already implied by the descendant token. The pass is conservative and avoids analyzing through `invoke` nodes to ensure correctness before function inlining.

    ```
    // Original IR
    send.2: token = send(tok, literal(10), channel=test_channel)
    send.3: token = send(send.2, literal(10), channel=test_channel)  // send.3 depends on send.2
    after_all.4: token = after_all(tok, send.2, send.3)
    next (after_all.4, ...)
    // Optimized IR (simplified to the latest token in the chain)
    // send.2 and tok are removed from after_all.4 because send.3 implies them
    next (send.3, ...)
    ```

- Simplifying `min_delay` Operations:
  * Zero Delay `min_delay`: A `min_delay` operation with a delay of 0 (e.g., `min_delay(tok, delay=0)`) is a no-operation and is replaced by its operand `tok`. Similarly, `min_delay(after_all(), delay=X)` is replaced by `after_all()`, as there are no actual events to delay.

    ```
    // Original IR
    min_delay.1: token = min_delay(tok, delay=0)
    next (min_delay.1, ...)
    // Optimized IR
    next (tok, ...)
    ```

  * Nested `min_delay` Operations: Consecutive `min_delay` operations can be collapsed into a single `min_delay` with the sum of their delays (e.g., `min_delay(min_delay(tok, delay=1), delay=2)` becomes `min_delay(tok, delay=3)`).

    ```
    // Original IR
    min_delay.1: token = min_delay(tok, delay=1)
    min_delay.2: token = min_delay(min_delay.1, delay=2)
    next (min_delay.2, ...)
    // Optimized IR
    min_delay.new: token = min_delay(tok, delay=3)
    next (min_delay.new, ...)
    ```

  * Pushing Down `min_delay` through `after_all`: If all operands of an `after_all` are `min_delay` operations, the smallest common delay can be factored out and applied to the entire `after_all` operation. This can reduce the total number of `min_delay` operations and simplify scheduling.

    ```
    // Original IR
    min_delay.1: token = min_delay(tok_a, delay=1)
    min_delay.2: token = min_delay(tok_b, delay=2)
    after_all.3: token = after_all(min_delay.1, min_delay.2)
    next (after_all.3, ...)
    // Optimized IR (factoring out the common delay of 1)
    min_delay.new: token = min_delay(after_all(tok_a, min_delay(tok_b, delay=1)), delay=1)
    next (min_delay.new, ...)
    ```
This pass iteratively applies these simplifications to the token network until a fixed point is reached. By reducing redundant token dependencies and optimizing delay annotations, the pass contributes to more efficient scheduling and resource utilization in the generated hardware.
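Two of these rewrites are simple enough to sketch as tree rewrites. The illustrative Python below (not the pass's actual data structures) represents token expressions as nested tuples and applies the nested-`min_delay` collapse plus the duplicate-operand and single-operand `after_all` simplifications:

```python
def simplify(node):
    """Recursively simplify a token expression.
    Nodes are ('min_delay', operand, delay) or ('after_all', op1, op2, ...);
    plain strings are leaf tokens."""
    if isinstance(node, str):
        return node
    if node[0] == 'min_delay':
        op, delay = simplify(node[1]), node[2]
        if delay == 0:
            return op                                    # min_delay(tok, 0) -> tok
        if isinstance(op, tuple) and op[0] == 'min_delay':
            return ('min_delay', op[1], op[2] + delay)   # sum nested delays
        return ('min_delay', op, delay)
    if node[0] == 'after_all':
        ops = []
        for op in node[1:]:
            s = simplify(op)
            if s not in ops:                             # drop duplicate operands
                ops.append(s)
        if len(ops) == 1:
            return ops[0]                                # single operand -> operand
        return ('after_all', *ops)
    raise ValueError(node)

assert simplify(('min_delay', ('min_delay', 'tok', 1), 2)) == ('min_delay', 'tok', 3)
assert simplify(('after_all', 'tok', 'tok', 'tok')) == 'tok'
```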
useless_assert_remove - Remove useless (always true) asserts
Pass which removes asserts that have a literal 1 as the condition, meaning they are never triggered. Rewires tokens to ensure nothing breaks.
useless_io_remove - Remove useless send/receive
Pass which removes sends/receives that have literal false as their condition. Also removes the condition from sends/receives that have literal true as their condition.
Note: What follows was generated by the Gemini LLM. Not human verified.
This pass optimizes the IR by eliminating or simplifying I/O operations (send and receive) based on their predicate (condition) values. This helps reduce dead code and simplify the control flow of the hardware.
Specifically, it handles two main cases:
- Removal of Useless Conditional I/O: If a `send` or `receive` operation has a predicate that is a literal `false`, and there are other active I/O operations on the same channel, the useless operation is removed. If it's the only remaining I/O operation on that channel, its data input is replaced with a literal zero to avoid leaving the channel unused.

  Example (SendIf with a literal false predicate):

  send.1: token = send(tkn, data, predicate=literal(value=0))  // this send will never execute
  // ... other operations, potentially another send on the same channel

  This `send.1` would be replaced by `tkn` (its token operand), effectively removing the send operation.

- Simplification of Unconditional I/O: If a `send` or `receive` operation has a predicate that is a literal `true`, the predicate itself is removed. The operation then becomes an unconditional `send` or `receive`.

  Example (ReceiveIf with a literal true predicate):

  receive.1: (token, bits[32]) = receive(tkn, predicate=literal(value=1), channel=my_channel)

  This `receive.1` would be transformed into:

  receive.1: (token, bits[32]) = receive(tkn, channel=my_channel)
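The two rewrites amount to a case analysis on the predicate. A minimal sketch in hypothetical Python follows; `Send` here is a stand-in record for illustration, not an XLS class:

```python
from dataclasses import dataclass, replace
from typing import Optional

@dataclass(frozen=True)
class Send:
    token: str
    data: str
    predicate: Optional[bool]  # None means unconditional

def simplify_send(op: Send):
    """Returns the replacement: the bare token (send removed) or a Send."""
    if op.predicate is False:
        return op.token                      # never fires: drop it, keep the token
    if op.predicate is True:
        return replace(op, predicate=None)   # always fires: strip the predicate
    return op                                # genuinely conditional: unchanged

assert simplify_send(Send("tkn", "data", False)) == "tkn"
assert simplify_send(Send("tkn", "data", True)) == Send("tkn", "data", None)
```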