Pytype is built around a “shadow bytecode interpreter”, which traces through a program’s bytecode, mimicking the effects of the CPython interpreter but tracking types rather than values.
A good starting point is to trace through the details of pytype’s main loop and get a feel for how the bytecode interpreter works.
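To make the idea of tracking types rather than values concrete, here is a minimal sketch of a shadow interpreter. It is not pytype's code: the instruction set is a made-up simplification whose opcode names are only loosely modeled on CPython's, and the function name shadow_run is hypothetical.

```python
# Toy "shadow interpreter": it walks a list of stack instructions, but the
# stack holds type names instead of runtime values. The instruction set is a
# hypothetical simplification, not real CPython bytecode or pytype internals.

def shadow_run(instructions):
    stack = []        # holds type names, never real values
    local_types = {}  # variable name -> type name
    for op, arg in instructions:
        if op == "LOAD_CONST":
            # Record only the type of the constant, then forget the value.
            stack.append(type(arg).__name__)
        elif op == "STORE_NAME":
            local_types[arg] = stack.pop()
        elif op == "LOAD_NAME":
            stack.append(local_types[arg])
        elif op == "BINARY_ADD":
            right, left = stack.pop(), stack.pop()
            # Only the simple case is modeled: matching operand types.
            stack.append(left if left == right else "Any")
        else:
            raise ValueError(f"unknown op: {op}")
    return local_types

# Roughly: x = 5; y = 6; z = x + y
program = [
    ("LOAD_CONST", 5), ("STORE_NAME", "x"),
    ("LOAD_CONST", 6), ("STORE_NAME", "y"),
    ("LOAD_NAME", "x"), ("LOAD_NAME", "y"), ("BINARY_ADD", None),
    ("STORE_NAME", "z"),
]
types = shadow_run(program)
print(types)  # {'x': 'int', 'y': 'int', 'z': 'int'}
```

The interpreter never computes 11 for z; it only concludes that z is an int, which is the kind of information a type checker needs.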
As pytype analyzes a program, it builds a control flow graph (CFG) that represents how the parts of the program work together. Each Node in the CFG roughly correlates with a single statement in the program.
    if some_val:  # 1
      x = 5  # 2
      y = 6  # 3
    else:
      x = "a"  # 4
    z = x.upper() + str(y)  # 5
This program has a CFG that looks like:
         (1)
         | |
    (2)<-+ +->(4)
     |         |
     v         |
    (3)---+----+
          |
          v
         (5)
Note how the two branches of the if-else statement are represented by two paths starting at Node 1 and coming together at Node 5.
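The two paths can be enumerated with a small sketch. The adjacency map and the all_paths helper below are illustrative assumptions, not pytype's own CFG representation.

```python
# Toy CFG for the five-statement example, stored as a plain adjacency map.
# This is an illustration only, not pytype's data structure.
EDGES = {1: [2, 4], 2: [3], 3: [5], 4: [5], 5: []}

def all_paths(edges, start, goal):
    """Return every path from start to goal (the graph here is acyclic)."""
    if start == goal:
        return [[start]]
    paths = []
    for nxt in edges[start]:
        for tail in all_paths(edges, nxt, goal):
            paths.append([start] + tail)
    return paths

paths = all_paths(EDGES, 1, 5)
print(paths)  # [[1, 2, 3, 5], [1, 4, 5]]
```

The first path corresponds to the if branch and the second to the else branch.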
A Variable tracks the type information for a variable in the program being
analyzed. This includes simple variables (e.g. x = 5), function arguments
(e.g. def f(a, b)), and functions, classes and modules.
A Binding associates a Variable with a value at a particular Node. In the
example above, the Variable for x is bound to the value 5 at Node 2
(Binding(5, Node 2)) and to "a" at Node 4 (Binding("a", Node 4)). Meanwhile,
y has only a single binding, Binding(6, Node 3).
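The relationship between Variables and Bindings can be sketched with two small classes. These are simplified stand-ins written for this example. The field names and the add_binding method are assumptions and should not be read as pytype's actual API.

```python
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Binding:
    value: object  # the bound value (a real checker tracks a type-level value)
    node: int      # number of the CFG node where the assignment happens

@dataclass
class Variable:
    name: str
    bindings: list = field(default_factory=list)

    def add_binding(self, value, node):
        """Record that this variable takes `value` at CFG node `node`."""
        binding = Binding(value, node)
        self.bindings.append(binding)
        return binding

# Bindings from the example program.
x = Variable("x")
x.add_binding(5, 2)
x.add_binding("a", 4)
y = Variable("y")
y.add_binding(6, 3)

print([(b.value, b.node) for b in x.bindings])  # [(5, 2), ('a', 4)]
print(len(y.bindings))  # 1
```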
Building up the CFG in this way allows pytype to perform type checking. When
pytype reaches Node 5 (z = x.upper()), it queries the CFG to find what x may
be. Depending on the value of some_val, x could be an int or a str. Since
int doesn’t have a method called upper, pytype reports an attribute-error,
even though str does have an upper method.
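A rough sketch of that check: look at every value x may have at Node 5 and flag each one whose type lacks the requested attribute. The check_attribute helper and the message format are hypothetical. Pytype's real query and error reporting are more involved.

```python
# Possible values of x at Node 5, as (value, node_of_assignment) pairs.
X_BINDINGS = [(5, 2), ("a", 4)]

def check_attribute(bindings, attr):
    """Return one message per binding whose type lacks `attr`."""
    errors = []
    for value, node in bindings:
        if not hasattr(value, attr):
            errors.append(
                f"attribute-error: no attribute {attr!r} on "
                f"{type(value).__name__} (bound at node {node})")
    return errors

errors = check_attribute(X_BINDINGS, "upper")
print(errors)
# ["attribute-error: no attribute 'upper' on int (bound at node 2)"]
```

One failing binding is enough to produce an error, even though the str binding from Node 4 would succeed.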
However, pytype is limited by what it knows. Looking at the example again, we
know that y won’t be defined if some_val is False, which would make str(y)
fail. But pytype can’t know for sure if some_val will evaluate to False.
Since there’s a path through the CFG where y is defined (y = 6 when some_val
is truthy), pytype won’t report an error for str(y). If we change the
condition to if False, so that pytype knows unambiguously that only the code
under else: will be executed, then pytype will report a name-error on str(y)
because there is no path through the CFG where y is defined.
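The difference between the two cases can be sketched as a reachability question: is there any path to Node 5 that passes through the node defining y? The helper names and the way the if False case is modeled (by dropping the edge into the if body) are assumptions made for this illustration, not a description of pytype's internals.

```python
def paths(edges, start, goal):
    """Return every path from start to goal in an acyclic graph."""
    if start == goal:
        return [[start]]
    return [[start] + tail
            for nxt in edges.get(start, [])
            for tail in paths(edges, nxt, goal)]

def defined_on_some_path(edges, def_nodes, start, use):
    """True if any path from start to use passes through a defining node."""
    return any(set(p) & def_nodes for p in paths(edges, start, use))

Y_DEFS = {3}  # y = 6 happens at node 3

# Condition unknown: both branches are kept.
unknown = {1: [2, 4], 2: [3], 3: [5], 4: [5]}
# Condition is the literal False: the edge into the if body is dropped.
always_false = {1: [4], 4: [5]}

print(defined_on_some_path(unknown, Y_DEFS, 1, 5))       # True
print(defined_on_some_path(always_false, Y_DEFS, 1, 5))  # False
```

In the first graph one path defines y, so no error is reported. In the second graph no path does, which corresponds to the case where an error is reported on str(y).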
pytd node, abstract value, conversion both ways