A static type analyzer for Python code
A type stub is a file with a .pyi extension that describes a module’s types while omitting implementation details. For example, if a module foo has the following source code:
class Foo:
    CONSTANT = 42

def do_foo(x):
    return x
then foo.pyi would be:
from typing import TypeVar
T = TypeVar('T')
class Foo:
    CONSTANT: int

def do_foo(x: T) -> T: ...
pytype allows an unannotated parameterized class’s contained type to be changed,
an operation we call a mutation, which .pyi files do not have a way of
expressing. Thus, pytype uses an extended pyi format, PyTypeDecl (“Python Type
Declaration”) or PyTD, in which mutations are described by assignment to self in a method body. For example, pytype’s builtins.pytd types dict.update as:
class dict(Dict[_K, _V]):
    def update(self, other: dict[_K2, _V2]) -> None:
        self = dict[_K | _K2, _V | _V2]
In practice, the terms pyi and pytd are often used interchangeably.
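To make the idea of a mutation concrete, the sketch below mimics at runtime what the assignment to self in dict.update declares statically: after the call, the contained key and value types are the union of the old and new ones. The helper contained_types is hypothetical and is not part of pytype.

```python
def contained_types(d: dict) -> tuple[set[str], set[str]]:
    """Return the names of the key types and value types found in d."""
    key_types = {type(k).__name__ for k in d}
    value_types = {type(v).__name__ for v in d.values()}
    return key_types, value_types


d = {"a": 1}                  # behaves like dict[str, int]
before = contained_types(d)

d.update({2: b"two"})         # other behaves like dict[int, bytes]
after = contained_types(d)    # now behaves like dict[str | int, int | bytes]

print(before)
print(after)
```

A plain .pyi signature for update can only state that it returns None; the change to the container's own type is what the pytd extension records.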
pytype relies on the stubs provided by the open-source typeshed
project for most of its standard library and third party type information. For
modules for which accurate mutation information is important, we shadow the
typeshed stubs with custom pytd stubs located in pytype/stubs/{builtins,stdlib}.

During analysis, pytype will emit stubs of inferred type information for local files to communicate between pytype-single runs.
Import resolution is handled by the load_pytd.Loader class, which loads type stubs into a cached internal AST representation. The loader finds stubs for local files in one of two ways:
• When the --imports_info flag is passed, it points to a file that maps all module paths to file paths, which is used to look up the import.

The second approach is used by pytype-single by default, but pytype’s whole-project analysis tools always pass in --imports_info for more reliable, reproducible stub finding. The pytype GitHub project uses importlab, another Google open-source project, to generate the dependency graph from which imports_info is constructed.
As a concrete example, suppose a module bar contains the following imports:
import dep1
from foo import dep2
pytype will first analyze dep1 and dep2 and write type stubs to an output directory, say /home/.pytype/pyi/. It will then construct this imports_info file:
dep1 /home/.pytype/pyi/dep1.pyi
foo/dep2 /home/.pytype/pyi/foo/dep2.pyi
and use it for import lookup when analyzing bar.
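The lookup described above can be sketched as follows. This is only an illustration: the two-column, whitespace-separated format is inferred from the example file, and parse_imports_info and resolve are hypothetical names, not pytype functions.

```python
from typing import Optional


def parse_imports_info(text: str) -> dict[str, str]:
    """Map each module path to its stub file path, one entry per line."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        module_path, file_path = line.split(maxsplit=1)
        mapping[module_path] = file_path
    return mapping


def resolve(module_name: str, mapping: dict[str, str]) -> Optional[str]:
    """Look up a dotted module name such as 'foo.dep2'; None if unmapped."""
    return mapping.get(module_name.replace(".", "/"))


IMPORTS_INFO = """\
dep1 /home/.pytype/pyi/dep1.pyi
foo/dep2 /home/.pytype/pyi/foo/dep2.pyi
"""

mapping = parse_imports_info(IMPORTS_INFO)
print(resolve("dep1", mapping))
print(resolve("foo.dep2", mapping))
print(resolve("missing", mapping))
```

Because every import is looked up in an explicit table rather than searched for on disk, the result does not depend on the working directory, which is what makes this approach reproducible.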
If an import can’t be resolved locally, pytype falls back to the standard library, then typeshed/third_party.
The following diagram shows a common import resolution path: the VM calls the loader, which finds the right file path and then parses the contents into an AST. The bolded methods are the entrypoints into the loader, which also happen to be the methods that do AST postprocessing and finalization.
The stub parser in pytype/pyi reads in a type stub and produces an AST representation of its contents. It uses the stdlib ast parser to convert the type stub into a Python AST (type stubs are required to parse as valid Python 3), then generates a pytd tree from the AST.
The pytd nodes are defined in pytype/pytd/pytd.py as immutable attrs classes.
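As a rough illustration of the first parsing step, the sketch below feeds the foo.pyi stub from earlier to the stdlib ast parser and walks the top-level nodes. The summarize function is hypothetical; pytype's real parser builds pytd nodes rather than strings.

```python
import ast

STUB = '''
from typing import TypeVar
T = TypeVar('T')
class Foo:
    CONSTANT: int
def do_foo(x: T) -> T: ...
'''


def summarize(module: ast.Module) -> list[str]:
    """List the top-level classes and functions found in a parsed stub."""
    summary = []
    for node in module.body:
        if isinstance(node, ast.ClassDef):
            summary.append(f"class {node.name}")
            # Annotated attributes in the class body, e.g. CONSTANT: int
            for item in node.body:
                if isinstance(item, ast.AnnAssign) and isinstance(item.target, ast.Name):
                    annotation = ast.unparse(item.annotation)
                    summary.append(f"  attr {item.target.id}: {annotation}")
        elif isinstance(node, ast.FunctionDef):
            returns = ast.unparse(node.returns) if node.returns else "Any"
            summary.append(f"def {node.name} -> {returns}")
    return summary


tree = ast.parse(STUB)  # works because a stub is syntactically valid Python 3
print("\n".join(summarize(tree)))
```

Note that ast.unparse requires Python 3.9 or later.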
AST nodes are manipulated via a visitor interface. A visitor has Enter and Visit methods for pre- and post-order traversal of the tree, as well as Leave methods for cleanup. For example, pytype.pytd.pytd_visitors.PrintVisitor, which produces a string representation of an AST, contains the following (simplified) logic for visiting a class:
def VisitClass(self, node):
    bases_str = "(" + ", ".join(node.bases) + ")"
    header = ["class " + node.name + bases_str + ":"]
    method_lines = sum((m.splitlines() for m in node.methods), [])
    methods = [" " + m for m in method_lines]
    return "\n".join(header + methods) + "\n"
Nodes of type pytype.pytd.pytd.Class are passed to this method, which returns a string representation of its input. Note that, because the class’s children have already been visited, they are strings by the time VisitClass runs.
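The post-order behavior can be reproduced with a small, self-contained sketch. The node classes and the visit function below are toy stand-ins that use frozen dataclasses instead of attrs; they cover only Visit methods and omit the Enter and Leave hooks of the real pytd visitor machinery.

```python
from dataclasses import dataclass, fields, is_dataclass, replace


@dataclass(frozen=True)
class Function:
    name: str


@dataclass(frozen=True)
class Class:
    name: str
    bases: tuple
    methods: tuple


class PrintVisitor:
    """Toy visitor: each VisitX method returns the replacement for node X."""

    def VisitFunction(self, node):
        return f"def {node.name}(self): ..."

    def VisitClass(self, node):
        # node.methods already holds the strings produced by VisitFunction.
        bases_str = "(" + ", ".join(node.bases) + ")" if node.bases else ""
        header = [f"class {node.name}{bases_str}:"]
        methods = ["  " + m for m in node.methods]
        return "\n".join(header + methods) + "\n"


def visit(node, visitor):
    """Post-order traversal: visit the children first, then the node itself."""
    if isinstance(node, tuple):
        return tuple(visit(child, visitor) for child in node)
    if not is_dataclass(node):
        return node  # leaf value such as a str
    children = {f.name: visit(getattr(node, f.name), visitor) for f in fields(node)}
    node = replace(node, **children)  # nodes are immutable, so build a new one
    method = getattr(visitor, "Visit" + type(node).__name__, None)
    return method(node) if method else node


tree = Class(name="Foo", bases=("Base",), methods=(Function("bar"), Function("baz")))
result = visit(tree, PrintVisitor())
print(result)
```

Since nodes are immutable, each step builds a new node from the visited children instead of modifying the tree in place.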
When pytype finishes analyzing a module that another module depends on, it
outputs a type stub for the former that the latter will use to resolve imports.
The pytype.output module is responsible for converting the abstract values produced by pytype’s shadow bytecode interpreter into AST nodes.
NOTE: Conversely, pytype.convert converts AST nodes into abstract values.
The output module contains two core methods: value_to_pytd_def, which converts an object to its definition (for example, InterpreterClass(Foo) to class Foo: ...); and value_to_pytd_type, which converts to the type (InterpreterClass(Foo) to Type[Foo]). For convenience, these methods can be accessed as to_pytd_def and to_type, respectively, on abstract values.
To decrease the size of emitted type stubs, pytype runs the visitors in pytype.pytd.optimize to simplify ASTs as much as possible before serializing them. This also ensures that equivalent nodes are reduced to a canonical form.
For example, suppose pytype is analyzing the following code, which describes a simple company structure:
class Employee:
    pass

class CEO(Employee):
    pass

class Company:
    def __init__(self):
        self.employees = [CEO()]
    def hire(self, employee: Employee):
        self.employees.append(employee)
pytype produces a raw pyi in which Company.employees is inferred to be a list of both CEOs and employees:
from typing import List
class Employee: ...
class CEO(Employee): ...
class Company:
    employees: List[CEO | Employee]
    def __init__(self) -> None: ...
    def hire(self, employee: Employee) -> None: ...
The optimize.SimplifyUnionsWithSuperclasses visitor then notices that CEO is a subclass of Employee and simplifies the pyi to:
from typing import List
class Employee: ...
class CEO(Employee): ...
class Company:
    employees: List[Employee]
    def __init__(self) -> None: ...
    def hire(self, employee: Employee) -> None: ...
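The simplification rule itself is easy to state: a union member is redundant if another member is one of its superclasses. The sketch below applies that rule to runtime classes with issubclass. It is only an analogy, since pytype's visitor works on pytd nodes and its own class hierarchy, and simplify_union and Intern are hypothetical names.

```python
class Employee:
    pass


class CEO(Employee):
    pass


class Intern(Employee):
    pass


def simplify_union(types: list[type]) -> list[type]:
    """Drop every type that has a strict superclass elsewhere in the union."""
    kept = []
    for t in types:
        has_superclass = any(
            other is not t and issubclass(t, other) for other in types
        )
        if not has_superclass and t not in kept:
            kept.append(t)
    return kept


with_superclass = [t.__name__ for t in simplify_union([CEO, Employee])]
siblings_only = [t.__name__ for t in simplify_union([CEO, Intern])]
print(with_superclass)
print(siblings_only)
```

Sibling classes are left alone: a union of CEO and Intern is not collapsed to Employee, because Employee is not itself a member of the union.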
For improved performance, pytype will read and write pickled pyi files instead of plaintext when run with:
--pickle-output --use-pickled-files --precompiled-builtins
The pytype.pytd.pytd_utils.{Load,Save}Pickle methods can be used to debug pickles.