XLS Fuzzer

To execute the XLS fuzz driver, run a command like the following:

bazel run -c opt \
  //xls/fuzzer:run_fuzz_multiprocess \
  -- --crash_path=/tmp/crashers-$(date +'%Y-%m-%d') --seed=0 --duration=8h

Note

The --seed=0 flag makes the fuzzer run from a deterministic seed, so the same sequence of examples is tested on each invocation. To run non-deterministically, omit the --seed flag.

The XLS fuzzer generates a sequence of random DSLX functions and, for each function, a set of random inputs, often with interesting bit patterns.

Given that stimulus, the fuzz driver performs the following actions, some of which can be enabled or disabled via flags (run with --help for more details):

Runs the DSLX program through the DSLX interpreter with the batch of arguments
Converts the DSLX program to IR
Optimizes the converted IR
Interprets the pre-optimized and optimized IR with the batch of arguments
Generates the Verilog from the IR with randomly selected codegen options (with --codegen)
Simulates the generated Verilog using the batch of arguments (with --simulate)
Performs a multi-way comparison of the DSLX interpreter results, the pre-optimized IR interpreter results, post-optimized IR interpreter results, and the simulator results
If an issue is observed, the fuzz driver attempts to minimize the IR that causes an issue to occur.

The above actions are coordinated and run by the SampleRunner class. Many actions are performed by invoking a separate binary which isolates any crashes.

When miscompares in results occur or the generated function crashes part of XLS, all artifacts generated by the fuzzer for that sample are written into a uniquely-named subdirectory under the --crash_path given in the command line. The fuzzer also writes a crasher file which is a single file for reproducing the issue. See below for instructions on debugging a failing sample.

Crashers Directory

The crashers directory includes a subdirectory created for each failing sample. To avoid collisions the subdirectory is named using a hash of the DSLX code. Each crasher subdirectory has the following contents:

$ ls /tmp/crashers-2024-09-19/433f5244
args.txt                        run.sh
codegen_main.stderr             sample.block.ir
crasher_2024-09-19_433f.x       sample.ir
eval_ir_main.stderr             sample.ir.results
exception.txt                   sample.opt.ir
ir_converter_main.stderr        sample.opt.ir.results
minimized.ir                    sample.v
module_sig.textproto            sample.x
options.pbtxt                   sample.x.results
opt_main.stderr                 simulate_module_main.stderr
revision.txt

The directory includes the problematic DSLX sample (sample.x) and the input arguments (args.txt) as well as all artifacts generated and stderr output emitted by the various utilities invoked to test the sample. Notable files include:

options.pbtxt : Options used to run the sample (text protobuffer).
sample.ir : Unoptimized IR generated from the DSLX sample.
sample.opt.ir : IR after optimizations.
sample.v : Generated Verilog (or sample.sv with --use_system_verilog=true on fuzz runner)
*.results : The results (numeric values) produced by interpreting or simulating the respective input (DSLX, IR, or Verilog).
exception.txt : The exception raised when running the sample. Typically this indicates either a result miscomparison or a tool returning a non-zero status (for example, the IR optimizer crashed).
crasher_*.x: A single file reproducer which includes the DSLX code, arguments, and options. See below for details.
run.sh: a script to re-run this example.

Typically the exact nature of the failure can be identified by reading the file exception.txt and possibly the stderr outputs of the various tools.

The fuzzer can optionally produce a minimized IR reproduction of the problem. This will be written to minimized.ir. See below for details.

Single-file reproducers

When the fuzzer encounters an issue it will create a single-file reproducer:

--- Worker 14 observed an exception, noting
--- Worker 14 noted crasher #1 for sampleno 42 at /tmp/crashers/095fb405

Copying that file to the directory //xls/fuzzer/crashers will automatically create a bazel test target for it and add it to the regression suite. Tests can also be added as known failures in //xls/fuzzer/build_defs.bzl as they're being triaged / investigated like so:

generate_crasher_regression_tests(
    srcs = glob(["crashers/*"]),
    prefix = "xls/fuzzer",
    # TODO(xls-team): 2019-06-30 Triage and fix these.
    failing = [
        "crashers/crasher_2019-06-29_129987.x",
        "crashers/crasher_2019-06-29_402110.x",
    ],
)

Known-failures are marked as manual and excluded from continuous testing.

To run the regression suite:

bazel test //xls/fuzzer:all

To run the regression suite including known-failures, run the regression target directly:

bazel test //xls/fuzzer:regression_tests

To reproduce from that single-file reproducer there is a command line tool:

bazel run //xls/fuzzer:run_crasher -- \
  crasher_2019-06-26_3354.x

IR minimization

By default, the fuzzer attempts to generate a minimal IR reproducer for the problem identified by the DSLX sample. Starting with the unoptimized IR, the fuzzer invokes ir_minimizer_main to reduce the size of the input IR. The minimizer uses various simplification strategies to minimize the number of nodes in the IR. See the usage description in the tool source code for detailed information.

The minimized IR is written to a file minimized.ir in the crasher directory for the sample. Note that minimization is only possible if the problem (crash, result miscomparison, etc.) occurs after conversion from DSLX to XLS IR.

Summaries

To monitor the progress of the fuzzer and to determine op coverage, the fuzzer can optionally (with --summary_path) write summary information to files. The summary files are Protobuf files containing the proto SampleSummaryProto defined in //xls/fuzzer/sample_summary.proto. The summary describes the IR generated from the DSLX sample, such as the number and type of each IR op as well as the bit width and number of operands.

The summary information also includes a timing breakdown of the various operations performed for each sample (sample generation, IR conversion, etc). This can be used to identify performance bottlenecks in the fuzzer.

The summaries can be read with the tool //xls/fuzzer/read_summary_main. See usage description in the code for more details.

Debugging a failing sample

A generated sample can fail in one of two ways: a tool crash or a result miscomparison. A tool crash occurred if one of the tools invoked by the fuzzer (e.g., opt_main which optimizes the IR) returned a non-zero status. A result miscomparison occurred if there is not perfect correspondence between the results produced by various ways in which the generated function is evaluated:

Interpreted DSLX
Evaluated unoptimized IR
Evaluated optimized IR
Simulation of the generated (System)Verilog

Generally, the results produced by the interpretation of the DSLX serve as the reference results for comparisons.
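As a rough illustration of comparing against the reference results, the sketch below checks every *.results file in a crasher directory against sample.x.results. It assumes the results files are plain text that can be compared byte-for-byte. The function name compare_results is hypothetical and not part of XLS.

```shell
# Compare each *.results file in a crasher directory against the DSLX
# interpreter results (sample.x.results), which serve as the reference.
# Assumption: results files are plain text and comparable byte-for-byte.
# Returns 0 if everything matches, 1 on any mismatch, 2 if the reference
# file is missing.
compare_results() {
  local dir="$1"
  local reference="${dir}/sample.x.results"
  local status=0
  local f
  if [ ! -f "${reference}" ]; then
    echo "missing reference: ${reference}" >&2
    return 2
  fi
  for f in "${dir}"/*.results; do
    if [ "${f}" = "${reference}" ]; then
      continue
    fi
    if cmp -s "${reference}" "${f}"; then
      echo "MATCH    $(basename "${f}")"
    else
      echo "MISMATCH $(basename "${f}")"
      status=1
    fi
  done
  return "${status}"
}
```

For example, compare_results /tmp/crashers-2024-09-19/433f5244 would print one MATCH or MISMATCH line per results file other than the reference.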

To identify the underlying cause of the sample failure inspect the exception.txt file in the crasher directory. The file contains the text of the exception raised in SampleRunner which clearly identifies the kind of failure (result miscomparison or tool crash) and details about which evaluation resulted in a miscompare or which tool crashed, respectively. Consult the following sections on how to debug particular kinds of failures.

Debugging a tool crash

The exception.txt file includes the invocation of the tool for reproducing the failure. Generally, this is a straightforward debugging process.

If the failing tool is the IR optimizer binary opt_main, the particular pass causing the failure should appear in the backtrace. To retrieve the input to this pass, run opt_main with --ir_dump_path to dump the IR between each pass. The last IR file produced (the files are numbered sequentially) is the input to the failing pass.
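Picking out the last dumped file can be scripted. The sketch below assumes that the dump file names differ only from the sequence number onward, so a version sort (sort -V, available in GNU coreutils) orders them by pass order. The function name last_dumped_ir is hypothetical, and the exact naming scheme of the dump files is not specified here.

```shell
# Print the path of the last file in an --ir_dump_path directory.
# Assumption: file names share a prefix followed by a sequence number, so a
# version sort orders them by pass order. File names containing newlines are
# not handled.
# Returns 1 if the directory is empty.
last_dumped_ir() {
  local dump_dir="$1"
  local last
  last="$(ls -1 "${dump_dir}" | sort -V | tail -n 1)"
  if [ -z "${last}" ]; then
    return 1
  fi
  echo "${dump_dir}/${last}"
}
```

A version sort is used rather than a plain lexicographic sort so that a file numbered 10 sorts after a file numbered 2 even when the numbers are not zero-padded.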

Result miscomparison: unoptimized IR

The evaluation of the unoptimized IR is the first point at which result comparison occurs (DSLX interpretation versus unoptimized IR evaluation). A miscomparison here can indicate a bug in one of several places:

DSLX interpreter
DSLX to IR conversion
IR interpreter or IR JIT. The error message in exception.txt indicates whether the JIT or the interpreter was used.

To help narrow this down, the IR interpreter can be compared against the JIT with the eval_ir_main tool:

  eval_ir_main --test_llvm_jit --input_file=args.txt sample.ir

This runs both the JIT and the interpreter on the unoptimized IR file (sample.ir) using the arguments in args.txt and compares the results. If this succeeds, the IR interpreter and the JIT are likely correct and the problem lies earlier in the pipeline (DSLX interpretation or DSLX to IR conversion). Otherwise, there is definitely a bug in either the interpreter or the JIT, as their results should always be equal.

If a minimized IR file exists (minimized.ir) this may be a better starting point for isolating the failure.

Result miscomparison: optimized IR

This can indicate a bug in IR evaluation (interpreter or JIT) or in the optimizer. In this case, the comparison of the unoptimized IR evaluation against the DSLX interpreter has already succeeded, so DSLX interpretation or conversion is unlikely to be the underlying cause.

As with miscomparison involving the unoptimized IR, eval_ir_main can be used to compare the JIT results against the interpreter results:

  eval_ir_main --test_llvm_jit --input_file=args.txt sample.opt.ir

If the above invocation fails there is a bug in the JIT or the interpreter. Otherwise, there may be a bug in the optimizer. The tool eval_ir_main can help isolate the problematic optimization pass by running with the options --optimize_ir and --eval_after_each_pass. With these flags, the tool runs the optimization pipeline on the given IR and evaluates the IR after each pass is run. The first pass which results in a miscompare against the unoptimized input IR is flagged. Invocation:

  eval_ir_main --input_file=args.txt \
    --optimize_ir \
    --eval_after_each_pass \
    sample.ir
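The triage reasoning for IR miscomparisons, in this section and the previous one, can be summarized in a small helper. This is only a sketch: the function name suggest_suspect is hypothetical, and it assumes the JIT-versus-interpreter comparison command exits with a non-zero status when the two disagree.

```shell
# Suggest where to look next, based on whether the JIT and the interpreter
# agree on a given IR file.
# $1: "unoptimized" or "optimized" (which IR miscompared against the
#     reference results).
# $2..: command comparing the JIT against the interpreter, e.g.
#       eval_ir_main --test_llvm_jit --input_file=args.txt sample.ir
# Assumption: the command exits non-zero when the JIT and interpreter differ.
suggest_suspect() {
  local stage="$1"
  shift
  if ! "$@" > /dev/null 2>&1; then
    echo "IR interpreter or JIT"
  elif [ "${stage}" = "unoptimized" ]; then
    echo "DSLX interpreter or DSLX-to-IR conversion"
  else
    echo "optimizer"
  fi
}
```

The output is a hint rather than a diagnosis: a passing JIT-versus-interpreter comparison makes a bug in either one unlikely, but does not rule it out.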

Debugging the LLVM JIT

To help isolate bugs in the JIT, LLVM's optimization level can be set using the --llvm_opt_level flag:

  eval_ir_main --test_llvm_jit \
    --llvm_opt_level=0 \
    --input_file=args.txt sample.opt.ir

If the results match (pass) with the optimization level set to zero but fail with the default optimization level of 3, there is likely a bug in the LLVM optimizer or the XLS-generated LLVM program has undefined behavior.
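To find the lowest optimization level at which the comparison starts failing, the levels can be swept in a loop. The sketch below assumes that levels 0 through 3 are all accepted by --llvm_opt_level and that the flag may be appended after the other arguments. The function name first_failing_opt_level is hypothetical.

```shell
# Run a command once per LLVM optimization level and print the lowest level
# at which it fails. --llvm_opt_level=N is appended to the given command.
# Assumptions: levels 0..3 are valid, and the tool accepts the flag after
# its positional arguments.
# Returns 1 (and prints nothing) if the command passes at every level.
first_failing_opt_level() {
  local level
  for level in 0 1 2 3; do
    if ! "$@" "--llvm_opt_level=${level}" > /dev/null 2>&1; then
      echo "${level}"
      return 0
    fi
  done
  return 1
}
```

A possible invocation is first_failing_opt_level eval_ir_main --test_llvm_jit --input_file=args.txt sample.opt.ir. A failure at level 0 points away from the LLVM optimizer and toward the XLS-generated LLVM program or the interpreter.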

Unoptimized and optimized LLVM IR are dumped by the JIT with vlog level of 2 or higher, and the assembly is dumped at level 3 or higher. For example:

  eval_ir_main -v=3 --logtostderr --random_inputs=1 sample.opt.ir

Generating LLVM Artifacts

You can generate LLVM IR artifacts for a piece of IR code using dump_llvm_artifacts.

This tool invokes the AOT compiler to generate LLVM bytecode for a given IR file and (if possible) some additional files that make it possible to run the code with lli or similar tools.

Note

Currently this tool does not support block IRs, since it is built on the AOT compiler, which only supports procs and functions.

Note: LLVM can change significantly and bytecode is not always compatible between versions. If possible, LLVM tools built at the same commit as the JIT should be used to interact with the generated llvm bytecode. This can be done by building the LLVM tools using bazel from the XLS repo.

If you have an example input-output pair:

$ bazel run //xls/jit:dump_llvm_artifacts -- --out_dir=/tmp/muladd --ir=/tmp/muladd.ir '--input=bits[8]:0x1' '--input=bits[8]:0x2' '--input=bits[8]:0x3' '--result=bits[8]:0x5'
Generating XLS artifacts
Generating main.cc
Compiling main.cc
Linking unopt main.cc
Linking opt main.cc
$ ls /tmp/muladd
linked.ll      main.ll                result.entrypoints.txtpb  result.opt.ll
linked.opt.ll  result.asm             result.ll
main.cc        result.entrypoints.pb  result.o
$ lli /tmp/muladd/linked.ll
$ echo $?
0

This generates a number of files.

result.ll: The unoptimized LLVM IR that the JIT/AOT compiler produces.
result.opt.ll: The optimized LLVM IR that the JIT/AOT compiler actually compiles.
result.asm: The result of compiling result.opt.ll to assembly.
result.o: The result of compiling result.opt.ll to object code.
result.entrypoints.{txt,}pb: The AotPackageEntrypointsProto describing the compiled code, in both text and binary format.
main.cc: A C++ file containing a main function that invokes the compiled code and compares its output against the 'result'. It does not link against any XLS code, and it requires that the code not use traces or asserts.
main.ll: The LLVM IR version of main.cc.
linked.ll: A fully linked LLVM IR version of main.ll linked with result.ll.
linked.opt.ll: A fully linked LLVM IR version of main.ll linked with result.opt.ll.

If you have a proc or do not have a reproducer value:

$ bazel run //xls/jit:dump_llvm_artifacts -- \
        --out_dir=/some/path --ir=/path/to/test.ir

This generates the same files, except for the main.* and linked.* files.

Building LLVM tools

The various LLVM tools such as opt and lli can be built with:

  bazel build @llvm-project//llvm:all

Build in fastbuild mode to get checks and debug features in LLVM.

Running the LLVM optimization passes

To run the LLVM IR optimizer run the following (starting with the unoptimized IR):

  opt sample.ll -O3 -S

To print the IR before and after each pass:

 opt sample.ll -S -print-after-all -print-before-all -O3

Running the Instcombine pass

Instcombine is an LLVM optimization pass which is a common source of bugs in code generated from XLS. To run instcombine alone:

 opt /tmp/bad.ll -S -passes=instcombine

Instcombine is a large monolithic pass and it can be difficult to isolate the exact transformation which caused the problem. Fortunately, this pass includes a "compiler fuel" option which can be used to limit the number of transformations performed by the pass. Example usage (fastbuild of LLVM is required):

opt -S -passes=instcombine sample.ll --debug-counter=instcombine-visit-skip=0,instcombine-visit-count=42
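Finding the count at which the problem first appears lends itself to a binary search. The sketch below assumes the failure is monotonic in the count: the check passes with a count of 0 and, once enough transformations run to trigger the bug, every larger count triggers it as well. The function name bisect_fuel and the check-command convention (exit 0 for good, non-zero for bad) are hypothetical.

```shell
# Binary-search the smallest instcombine-visit-count for which a check fails.
# $1: an upper bound on the count at which the check is known to fail.
# $2..: check command, invoked as `cmd COUNT`; exit 0 means the result is
#       good, non-zero means the bug reproduces.
# Assumptions: the check passes at count 0 and failure is monotonic in the
# count.
bisect_fuel() {
  local hi="$1"
  shift
  local lo=0
  local mid
  # Invariant: the check passes at lo and fails at hi.
  while [ $((hi - lo)) -gt 1 ]; do
    mid=$(( (lo + hi) / 2 ))
    if "$@" "${mid}" > /dev/null 2>&1; then
      lo="${mid}"
    else
      hi="${mid}"
    fi
  done
  echo "${hi}"
}
```

The check command would typically be a small script that runs opt with instcombine-visit-count set to its argument and then evaluates the resulting IR. Comparing the opt output at the printed count against the output at one less then isolates the transformation that introduces the problem.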

Evaluating LLVM IR

The LLVM tool lli evaluates LLVM IR. The tool expects the IR to include an entry function main. This can be generated with eval_ir_main --llvm_jit_main_wrapper_output=<file> --llvm_jit_main_wrapper_write_is_linked. See the description in the tools page for how these flags work.

Once both the main wrapper bytecode file and the function bytecode files are created they can be linked to a single file using llvm-link:

llvm-link -S -o sample_linked.ll sample.ll sample_main.ll

The LLVM tool opt optimizes the LLVM IR and can be piped to lli like so:

  opt sample_linked.ll --O2 | lli

The LLVM IR can also be compiled to an object file using llc and driven using a generated LLVM test wrapper. The directory xls/fuzzer/debug includes a script and example demonstrating how to run JIT-generated LLVM IR in this manner.

Running LLVM code generation

If the bug occurs during LLVM code generation (lowering of LLVM IR to object code) the LLVM tool llc may be used to reproduce the problem. llc takes LLVM IR and produces assembly or object code. Example invocation for producing object code:

llc sample_linked.ll -o sample.o --filetype=obj

The exact output of llc depends on the target machine used during compilation. Logging in the OrcJit (at vlog level 1) will emit the exact llc invocation which uses the same target machine as the JIT.

Result miscomparison: simulated Verilog

This can be a bug in codegen, XLS's Verilog testbench code, or the Verilog simulator itself. Running the generated Verilog with different simulators can help isolate the problem:

  simulate_module_main --signature_file=module_sig.textproto \
    --args_file=args.txt \
    --verilog_simulator=iverilog \
    sample.v

  simulate_module_main --signature_file=module_sig.textproto \
    --args_file=args.txt \
    --verilog_simulator=${SIM_2} \
    sample.v

Each invocation writes the results of the evaluation to stdout, so the outputs of the two runs must be diffed to find discrepancies.
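The two runs and the diff can be wrapped in one helper. The sketch below assumes the tool accepts --verilog_simulator after its other arguments and writes only the results to stdout. The function name diff_simulators is hypothetical.

```shell
# Run the same simulation command under two simulators and diff the outputs.
# $1, $2: simulator names passed via --verilog_simulator.
# $3..: the command and its remaining arguments (without --verilog_simulator).
# Assumption: the tool accepts the flag after its positional arguments.
# Returns the exit status of diff: 0 if the outputs are identical.
diff_simulators() {
  local sim_a="$1"
  local sim_b="$2"
  shift 2
  local out_a out_b
  local status=0
  out_a="$(mktemp)"
  out_b="$(mktemp)"
  "$@" "--verilog_simulator=${sim_a}" > "${out_a}"
  "$@" "--verilog_simulator=${sim_b}" > "${out_b}"
  diff "${out_a}" "${out_b}" || status=$?
  rm -f "${out_a}" "${out_b}"
  return "${status}"
}
```

A possible invocation is diff_simulators iverilog "${SIM_2}" simulate_module_main --signature_file=module_sig.textproto --args_file=args.txt sample.v. Any lines printed by diff identify the inputs on which the simulators disagree.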

Filing an LLVM bug

If the fuzzer problem is due to a crash or miscompile by LLVM, file an LLVM bug here. Example LLVM bugs found by the fuzzer: 1, 2.

Although the internal Google mirror of LLVM is updated frequently, prior to filing an LLVM bug it's a good idea to verify the failure against LLVM head. Steps to build a debug build of LLVM:

git clone https://github.com/llvm/llvm-project.git
cd llvm-project
mkdir build
cd build
cmake -G Ninja ../llvm -DCMAKE_BUILD_TYPE=Debug -DLLVM_TARGETS_TO_BUILD=X86
cmake --build . -- opt # or llc or other target.

Below are instructions to configure LLVM with sanitizers enabled. This can be useful for reproducing issues found with the sanitizer-enabled fuzz tests.

# Install a version of LLVM which supports necessary sanitizer options.
sudo apt-get install lld-15 llvm-15 clang-15 libc++1-15
# In llvm-project directory:
mkdir build-asan
cd build-asan
TOOLBIN=/usr/lib/llvm-15/bin
# Below enables a particular sanitizer option `sanitize-float-cast-overflow`.
# `Address` can be used as an option instead of `Undefined` depending on the
# desired sanitizer check.
cmake ../llvm -GNinja -DCMAKE_BUILD_TYPE=RelWithDebInfo \
  -DCMAKE_CXX_COMPILER=$TOOLBIN/clang++ -DCMAKE_C_COMPILER=$TOOLBIN/clang \
  -DLLVM_USE_SANITIZER=Undefined \
  -DLLVM_UBSAN_FLAGS='-fsanitize=float-cast-overflow -fsanitize-undefined-trap-on-error' \
  -DLLVM_ENABLE_LLD=On -DLLVM_TARGETS_TO_BUILD=X86

LLVM includes a test case minimizer called bugpoint which tries to reduce the size of an LLVM IR test case. bugpoint has many options but it can operate in a similar manner to the XLS IR minimizer where a user-specified script is used to determine whether the bug exists in the LLVM IR:

bugpoint input.ll -compile-custom -compile-command bugpoint_test.sh

Example bugpoint test script (bugpoint_test.sh):

#!/bin/bash

# Create a temporary file for the test command
logfile="$(mktemp)"

# Run your test command (and redirect the output messages)
/path/to/llc "$@" -o /tmp/out.o -mcpu=skylake-avx512 --filetype=obj > "${logfile}" 2>&1
ret="$?"

# Print messages when error occurs
if [ "${ret}" != 0 ]; then
  echo "test failed"  # must print something on failure
  cat "${logfile}"
fi

# Cleanup the temporary file
rm "${logfile}"

exit "${ret}"