A static type analyzer for Python code
Pytype’s high-level workflow to analyze a single file[^1] is:
• Create a CallTracer instance[^2]
• Call run_program[^3]
  • Compile the source code[^4]
  • Run the bytecode[^5]
    • For each frame[^6]
      • Loop over opcodes, updating state[^run-instruction]
        state = run_instruction(op, state)
• Call analyze[^7]
Pytype calls run_program and analyze to infer type signatures for all classes, methods and functions.

run_instruction is the central dispatch point for opcode analysis. For every opcode, OP, there is a corresponding byte_OP() method; run_instruction looks this method up, calls it with the current state and the opcode, and uses the return value as the new state.
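The dispatch pattern described above can be illustrated with a toy model. This is a minimal sketch, not pytype's implementation: `Op`, `State`, `ToyVm`, and the two handlers are hypothetical names invented for illustration, and the "state" here is only a stack of type names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """Hypothetical opcode record: a name plus an optional argument."""
    name: str
    arg: object = None


@dataclass(frozen=True)
class State:
    """Hypothetical immutable state: a data stack of type names."""
    stack: tuple = ()

    def push(self, value):
        return State(self.stack + (value,))

    def pop(self):
        return State(self.stack[:-1]), self.stack[-1]


class ToyVm:
    def run_instruction(self, op, state):
        # Look up the handler named after the opcode, e.g. byte_LOAD_CONST.
        handler = getattr(self, f"byte_{op.name}", None)
        if handler is None:
            raise NotImplementedError(f"no handler for opcode {op.name}")
        # The handler's return value becomes the new state.
        return handler(state, op)

    def byte_LOAD_CONST(self, state, op):
        # Track the type of the constant rather than its value.
        return state.push(type(op.arg).__name__)

    def byte_BINARY_ADD(self, state, op):
        state, right = state.pop()
        state, left = state.pop()
        # Toy rule (an assumption): same-typed operands keep their type.
        return state.push(left if left == right else "Any")


def run(ops):
    vm = ToyVm()
    state = State()
    for op in ops:
        state = vm.run_instruction(op, state)
    return state


final = run([Op("LOAD_CONST", 1), Op("LOAD_CONST", 2), Op("BINARY_ADD")])
print(final.stack)  # ('int',)
```

Returning a fresh state from each handler, rather than mutating a shared one, keeps the loop body a single assignment, which matches the `state = run_instruction(op, state)` shape shown in the workflow.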
TIP: If you want to get a feel for how pytype works, an excellent starting point is to look at some of the byte_* methods and see how they mirror the workings of the Python interpreter at a type level, popping arguments off the stack, manipulating locals and globals dictionaries, and creating objects for classes, methods and functions.
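To see which opcodes a given piece of code produces, and therefore which byte_* methods would be worth reading, the standard library's `dis` module can list them. The sketch below only prints candidate handler names built from the opcode names; whether pytype defines a method for each one is not checked here, and the exact opcodes depend on the Python version in use.

```python
import dis


def add(a, b):
    return a + b


# Pair each opcode CPython compiled for `add` with the handler name
# that a byte_OP-style dispatcher would look up for it.
handler_names = [f"byte_{ins.opname}" for ins in dis.get_instructions(add)]
for name in handler_names:
    print(name)
```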
Pytype performs two passes when analyzing a file, as mentioned in the workflow above.
The first pass starts with run_program(), which executes the bytecode of the Python program using pytype’s virtual machine. This first step compiles the source code, executes the bytecode and builds the typegraph for the program.
Besides regular type errors, this step also checks for errors such as:
However, this step only finds errors in functions and classes that are part of the control flow graph, starting with the main function of the file. If a function or class is not reachable from main(), this pass will miss errors in it. If the file doesn’t have a main() (i.e. it is a library), then no class or function bodies will be type checked.
Because of that, pytype uses the typegraph to run a second analysis pass by calling analyze(). This pass recursively type checks all members of the program, starting at the top-level definitions. These are mostly classes, though some libraries define top-level functions.
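The difference between the two passes can be pictured with a toy reachability model. This is a sketch under stated assumptions, not how pytype's typegraph works: `CALLS` is a hypothetical hand-written call graph, `"<module>"` stands in for the entry point of the first pass, and `first_pass` and `second_pass` are illustrative helpers.

```python
# Hypothetical call graph: each name maps to the functions it calls.
CALLS = {
    "<module>": ["main"],
    "main": ["parse"],
    "parse": [],
    "unused_helper": ["parse"],
}


def first_pass(calls, entry="<module>"):
    """Visit only what is reachable from the entry point."""
    seen, stack = [], [entry]
    while stack:
        name = stack.pop()
        if name in seen:
            continue
        seen.append(name)
        stack.extend(calls[name])
    return seen


def second_pass(calls):
    """Visit every definition, whether or not anything calls it."""
    return [name for name in calls if name != "<module>"]


reached = first_pass(CALLS)
checked = second_pass(CALLS)
only_second = [name for name in checked if name not in reached]

print(reached)      # ['<module>', 'main', 'parse']
print(only_second)  # ['unused_helper']
```

In this model, `unused_helper` is never visited by the first pass because nothing reachable from the entry point calls it; only a pass that starts from every top-level definition gets to its body.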
Both passes run regardless of whether pytype is in “inference” (-o) or “check” (-C) mode. The second pass can be disabled using the --main (or -m) debug option, in which case only the code that is reachable from main() will be analyzed.
[^1]: io.py: process_one_file()
[^2]: tracer_vm.py: class CallTracer
[^3]: vm.py: run_program()
[^4]: vm.py: compile_src()
[^5]: vm.py: run_bytecode()
[^6]: A frame is a segment of code, typically one method or function. See state.py
[^run-instruction]: vm.py: run_instruction()
[^compute-types]: analyze.py: compute_types()
[^7]: analyze.py: analyze()