Pytype Tools

- Introduction
- analyze_project
- merge_pyi
- traces
- annotate_ast
- xref
- Utilities

Introduction

The pytype/tools/ subdirectory contains several independent tools that build on the idea of “pytype as a library”. They ship as part of pytype both for convenience and as examples of projects that depend on pytype.

NOTE: These tools typically have both a binary and one or more library modules; the library modules can be considered public, and are safe for third-party code to use as dependencies.

analyze_project

The main pytype binary, pytype-single, runs pytype over a single file and expects every import to have a corresponding pyi file available. analyze_project extracts a dependency graph from a directory of python code and runs pytype-single over each file in dependency order, so that a pyi file is generated for each of a file’s imports before the file itself is processed.

NOTE: Since this is what users expect by default, the analyze_project binary is named pytype and is exposed as the main user-facing entry point, though from a code perspective the “main” project is pytype-single.
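The ordering requirement can be illustrated with a small sketch. This is not analyze_project's actual code; it only shows the idea of processing files after their imports, using the standard library's graphlib (Python 3.9+). The file names and the processing_order helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical import graph: each file maps to the set of files it imports.
deps = {
    "app.py": {"models.py", "utils.py"},
    "models.py": {"utils.py"},
    "utils.py": set(),
}


def processing_order(deps):
    """Return the files ordered so that each one comes after all of its imports."""
    # TopologicalSorter treats the mapped values as predecessors, so
    # static_order() yields imported files before the files that import them.
    return list(TopologicalSorter(deps).static_order())


order = processing_order(deps)
print(order)
```

With an order like this, a per-file analyzer can assume that the pyi files for a file's imports already exist by the time the file is reached.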

merge_pyi

merge_pyi takes a python source file and the corresponding pyi signature file (either generated by pytype or hand-written), and attempts to insert the annotations from the pyi file into the python source as inline type annotations.

merge_pyi is written as a lib2to3 fixer, and does not depend on pytype, though running pytype first is a convenient way to generate the input pyi file.
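To make the input and output concrete, here is a toy illustration of the transformation. It is deliberately not how merge_pyi works: merge_pyi is a lib2to3 fixer, while this sketch uses the standard ast module (ast.unparse needs Python 3.9+), handles only top-level functions with plain positional arguments, and discards comments and formatting. The merge_annotations helper is hypothetical.

```python
import ast


def merge_annotations(source: str, stub: str) -> str:
    """Copy annotations for top-level functions from stub text into source text."""
    # Index the stub's top-level function signatures by name.
    stub_funcs = {
        node.name: node
        for node in ast.parse(stub).body
        if isinstance(node, ast.FunctionDef)
    }
    tree = ast.parse(source)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef) and node.name in stub_funcs:
            sig = stub_funcs[node.name]
            # Pair arguments positionally; keep any annotation already present.
            for arg, stub_arg in zip(node.args.args, sig.args.args):
                if arg.annotation is None:
                    arg.annotation = stub_arg.annotation
            if node.returns is None:
                node.returns = sig.returns
    return ast.unparse(tree)


source = "def add(x, y):\n    return x + y\n"
stub = "def add(x: int, y: int) -> int: ...\n"
merged = merge_annotations(source, stub)
print(merged)
```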

traces

traces is a library of utilities for type-aware python source code analysis. During pytype’s execution, we collect “opcode traces”, snapshots of the type information pytype infers for each opcode’s arguments. The traces library provides tools for working with these opcode traces, and for joining them to the source text and the parsed syntax tree to perform various code analysis and transformation tasks.

Some of the main functionality the library provides:

- source.Code: A class that holds a set of raw traces and a source file, and has various options to query the joined data by line number ranges.
- visitor.BaseVisitor: A base class for visitors on any tree that conforms to the python ast module’s interface.
- traces.trace: A function that runs pytype over a source file, generates traces, and returns a source.Code object with the traces and the source file.
- traces.MatchAstVisitor: An AST visitor that matches opcode traces with AST nodes, using position and symbol information from both sets of data.
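The idea of joining traces to source text and querying by line range can be sketched without pytype. Everything below is a hypothetical stand-in: ToyTrace, ToyCode, and traces_in_range are not the library's classes or methods, and the trace data is written by hand (and abbreviated) rather than produced by pytype.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class ToyTrace:
    op: str       # opcode name, e.g. "LOAD_NAME"
    symbol: str   # the name the opcode operates on
    types: tuple  # inferred type names, hand-written for this sketch


class ToyCode:
    """Holds source text plus traces keyed by 1-based line number."""

    def __init__(self, text, traces):
        self.lines = text.splitlines()
        self.traces = dict(traces)

    def traces_in_range(self, start, end):
        """Yield (line number, source line, trace) for start <= line <= end."""
        for lineno in range(start, end + 1):
            for trace in self.traces.get(lineno, []):
                yield lineno, self.lines[lineno - 1], trace


text = "x = 1\ny = str(x)\n"
toy_traces = {
    1: [ToyTrace("STORE_NAME", "x", ("int",))],
    2: [
        ToyTrace("LOAD_NAME", "x", ("int",)),
        ToyTrace("STORE_NAME", "y", ("str",)),
    ],
}
code = ToyCode(text, toy_traces)
hits = list(code.traces_in_range(2, 2))
for lineno, line, trace in hits:
    print(lineno, repr(line), trace.op, trace.symbol, trace.types)
```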

annotate_ast

annotate_ast is a library that adds type information to the nodes of a python AST. Internally, it uses traces.MatchAstVisitor to find opcode traces for each node, and copies the type information from the trace to node.resolved_type and node.resolved_annotation.

annotate_ast is a useful starting point to build type-aware linting and refactoring tools; these tools typically work via AST analysis and transformation, and can be made much more precise if the AST nodes include type information.
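As a rough picture of what "adding type information to AST nodes" means, the sketch below attaches a type name to each ast.Name node. It does not use pytype or annotate_ast: the FAKE_TYPES table is a hand-written stand-in for trace data, ToyAnnotator is hypothetical, and storing a plain string in node.resolved_annotation is an assumption made for the example.

```python
import ast

# Hypothetical stand-in for trace data: (line, name) -> inferred type name.
FAKE_TYPES = {(1, "x"): "int", (2, "y"): "str", (2, "x"): "int"}


class ToyAnnotator(ast.NodeVisitor):
    """Attach a type name (or None if unknown) to every Name node."""

    def visit_Name(self, node):
        node.resolved_annotation = FAKE_TYPES.get((node.lineno, node.id))
        self.generic_visit(node)


tree = ast.parse("x = 1\ny = str(x)\n")
ToyAnnotator().visit(tree)

# Collect the annotations to show what a downstream lint pass could read.
annotated = {
    (n.lineno, n.id): n.resolved_annotation
    for n in ast.walk(tree)
    if isinstance(n, ast.Name)
}
print(annotated)
```

A linter built on such a tree can then branch on the attached type, for example flagging a call only when its argument is known to be a str.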

xref

xref is an indexer and cross-reference generator for python projects. The indexer is built around a ScopedVisitor, an AST visitor that keeps track of nested scopes, and an Env, an analogue of a python symbol table. It replicates python’s symbol lookup mechanism, and uses the traces library to associate type information with symbols.

The indexer matches opcode traces from pytype with every AST node. Joining in this data allows us to determine:

- Whether a symbol is a definition or a reference to an existing symbol
- If it is a definition, what python type it is defining or instantiating
- If it is a reference, what python type the object it refers to has
- If we have an attribute access, what class and method definition we can associate it with

Note that some of this information is very hard to determine purely from a lexical analysis of the source code, without adding in type inference.
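The purely lexical part of this, tracking nested scopes and separating definitions from references, can be sketched with the standard ast module. ToyScopedVisitor below is hypothetical and is not xref's ScopedVisitor: it qualifies each name by the enclosing scope rather than performing real symbol lookup, and it carries no type information, which is exactly the gap the opcode traces fill.

```python
import ast


class ToyScopedVisitor(ast.NodeVisitor):
    """Record definitions and references, qualified by the enclosing scope."""

    def __init__(self):
        self.scopes = ["module"]
        self.events = []  # (kind, qualified name, line number)

    def _record(self, kind, name, lineno):
        self.events.append((kind, ".".join(self.scopes + [name]), lineno))

    def _visit_scope(self, node):
        # A class or function is a definition that also opens a new scope.
        self._record("def", node.name, node.lineno)
        self.scopes.append(node.name)
        self.generic_visit(node)
        self.scopes.pop()

    visit_FunctionDef = _visit_scope
    visit_ClassDef = _visit_scope

    def visit_Name(self, node):
        # Names being stored to count as definitions; everything else is a reference.
        kind = "def" if isinstance(node.ctx, ast.Store) else "ref"
        self._record(kind, node.id, node.lineno)


SRC = "class A:\n    def f(self):\n        y = 1\n        return y\n"
visitor = ToyScopedVisitor()
visitor.visit(ast.parse(SRC))
for event in visitor.events:
    print(event)
```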

xref emits cross-reference data in kythe format; however, the kythe graph generation is cleanly separated from the indexing code, and other output formats should be easy to add if desired.

Utilities

The tools/ directory also contains some utilities that are generally useful when writing pytype-based tools:

- arg_parser.py: Argument parsing that forwards arguments to pytype-single
- config.py: Config file reader
- environment.py: Set up and validate an environment to run pytype
- runner.py: Running subprocesses (typically used to help run pytype-single)
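For a sense of what a subprocess helper in a pytype-based tool might look like, here is a minimal sketch built on the standard subprocess module. It is not the code in runner.py; the run function and its timeout parameter are hypothetical, and the command is a stand-in that invokes the current python interpreter rather than pytype-single.

```python
import subprocess
import sys


def run(cmd, timeout=60):
    """Run cmd and return (returncode, stdout, stderr), with output as text."""
    proc = subprocess.run(
        cmd,
        capture_output=True,
        text=True,
        timeout=timeout,
        check=False,  # report failures through the return code, not an exception
    )
    return proc.returncode, proc.stdout, proc.stderr


# Stand-in command; a real tool would pass its analyzer's command line here.
returncode, out, err = run([sys.executable, "-c", "print('ok')"])
print(returncode, out.strip())
```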