To describe a C++ API in Python, CLIF uses an Interface Description Language (IDL), a modified PYTD language, described below.
The C++ API description lives in a .clif file, for example:
# 1. Load FileStat wrapper class from another CLIF wrapper.
from "file/base/python/filestat_clif.h" import * # FileStat
# 2. Load Status from hand-written C++ CLIF extension library.
from "util/task/python/clif.h" import * # Status
# 3. Load Options protobuf from generated C++ CLIF extension library.
from "file/base/options_pyclif.h" import *
# 4. Load pure-Python postprocessor function for Status
from devtools.clif.python.postproc import DropOkStatus
# 5. Load pure-Python types for CLIF wrapper (for improved PyType, when
# available).
from file.base.python.filestat import FileStat
# 6. Load pure-Python types for generated Python proto library (for improved
# PyType, when available).
from file.base.options_pb2 import Options
from "file/base/filesystem.h":
  namespace `file`:
    def ForEachMatch(pattern: str, options: Options,
                     match_handler: (filename:str, fs:FileStat)->bool
                    ) -> Status:
      return DropOkStatus(...)
    # define other API items here (if needed)
Line 1 gets a class wrapped by another CLIF module. Line 2 gets a custom wrap for Status and StatusOr. Line 3 gets a wrapped option.proto (generated by the pyclif_proto_library BUILD rule).
Note: The callback signature above matches std::function<bool (StringPiece, file::Stats)>.
From that example we see that a .clif file has 2 sections:
- Preparation specifies which CLIF extension libraries are needed and what C++ library we are wrapping. It can have [c header import][cimport], [python import][pyimport], [namespace][namespace] and [use][use] statements.
- API description starts with a from statement that points to the C++ header file we wrap and has an indented block describing the API. The statements that block can contain are described below.
NOTE: Some features are “experimental”, which means they can be changed or removed in future releases.
The c header import statement makes types wrapped by another CLIF rule or by a C++ CLIF extension library available to use in this .clif file. Such library can be written by hand or generated by a tool (like CLIF protobuf wrapper - it generates a cc_library CLIF extension.)
from "cpp/include/path/to/aCLIF/extension/library.h" import *
Note that c header import requires a double-quoted string exactly as the C++ #include directive.
Use the c header import statement to inform CLIF about wrapped C++ types that need to be available in the module being wrapped.
If you don’t want to pollute .clif namespace with all names from that header, you can prefix imported names with a variant of include statement:
from "some/header.h" import * as prefix_name
Now all CLIF types defined in the header.h (with CLIF use `ctype` as clif_type) will be available as prefix_name.clif_type.
The python import statement is a normal Python import to make a library symbol available within the .clif file. Only a single symbol can be imported (not a module). All imports must be absolute.
from path.to.project.library.module import SomeClassOrFunction
This statement is typically used to load a Python [postprocessing function][postprocessing].
The general form of the OPTION statement is:
OPTION name = value
However, currently the only available OPTION is:
OPTION is_extended_from_python = True
This OPTION is important when wrapped C++ types are extended from Python (go/clif-primer#py_library_wrapper), which involves a private module, i.e. a module with a name that has a leading underscore (see [Wrapping a C++ library][WrappingACppLibrary] above), and a matching py_library, for example:
py_clif_cc(name="_mylib")
py_library(name="mylib")
The is_extended_from_python OPTION controls which of these is imported from other py_clif_cc modules, for example:
py_clif_cc(name="myapp", clif_deps=["_mylib"], py_deps=["mylib"])
With OPTION is_extended_from_python = True, the PyCLIF-generated myapp module will never import _mylib directly, but always import mylib. This ensures that all Python-side customizations are applied.
The from statement tells CLIF what library file to wrap. This statement allows top-level API name lookup in any namespace in the specified file.
from "cpp/include/path/to/some/library.h":
  # API description statements
The namespace statement tells CLIF what C++ namespace to use (backquotes are required around the C++ name). That namespace must be declared in the from'd file. This statement limits top-level API name lookup to the specified namespace.
from "cpp/include/path/to/some/library.h":
  namespace `my::namespace`:
    def Name() # API description statements
WARNING: Namespace statements can’t be nested.
The def statement describes a C++ function (or member function).
def NAME ( INPUT_PARAMETERS ) OUTPUT_PARAMETERS
It has three main parts: the name, input parameters and output parameters.
NAME can be a simple alphanumeric name when we want it to be the same in C++ and Python. In some cases we want or need to rename the C++ name to have a different name in the Python wrapper. In those cases rename construct can be used:
`cplusplus_name` as python_name
For example `size` as __len__ [1] or `pass` as pass_. Such renaming can occur everywhere a NAME is used.
INPUT_PARAMETERS describes values to be converted from Python and passed to the C++ function. It is a (potentially empty) comma-separated list of name:type pairs, e.g. x:int, descriptive_name:str. Both name and type are required (only self in class methods has no type). For a type you use a Python standard type. Python containers should also be typed (like list<int> or dict<bytes, int>).
Tip: If C++ has a default argument (i.e. with an = value clause), it can also be optional in PYTD. Just add =default to its name:type specification.
OUTPUT_PARAMETERS are more complex: either -> type, or -> (name1:type1, name2:type2, ...) like input parameters.
By Google convention, a C++ signature should have all input parameters before any output parameter(s). The first output parameter is the function return value and others are listed after inputs as C++ TYPE* (pointer to output type). CLIF does not allow you to violate those conventions. To circumvent that restriction write a helper C++ function and wrap it instead.
For example:
C++ function | described as |
---|---|
void F() | def F() |
int F() | def F() -> int |
void F(int) | def F(name_is_mandatory: int) |
int F(int) | def F(name_is_mandatory: int) -> int |
int F(string*) | def F() -> (code: int, message: str) |
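Seen from Python, the last row of the table means that the C++ return value and the output parameter arrive together, in declaration order. The snippet below is only a pure-Python stand-in (the function body and its values are hypothetical, nothing here is generated by CLIF) that sketches the calling convention a user would see for def F() -> (code: int, message: str), assuming multiple return values are delivered as a tuple.

```python
from typing import Tuple


def F() -> Tuple[int, str]:
    """Hypothetical stand-in for a wrapped `int F(string*)`."""
    code = 0          # plays the role of the C++ return value
    message = "ok"    # plays the role of the string* output parameter
    return code, message


# Callers unpack the values in the order they are declared in the .clif file.
code, message = F()
```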
Parameter / Return Value Type | Ownership |
---|---|
std::unique_ptr | transferred |
std::shared_ptr | shared |
const T& | create a copy |
T& | create a copy |
raw pointer | borrowed |
C++ functions with output parameters or return values of type std::unique_ptr transfer object ownership to Python, std::shared_ptr shares ownership between C++ and Python, while const T& and T& are copied.
C++ functions with std::unique_ptr input parameters transfer ownership to C++, std::shared_ptr shares ownership between C++ and Python, while const T& or T& are copied.
Raw pointers are always assumed to be borrowed.
If a different convention was used, one can create a wrapper to implement the desired behavior. If compatible overloaded functions exist, CLIF will prefer the std::unique_ptr alternative.
None is converted to nullptr and vice versa in many but not all situations. However, ideally we’d change this behavior some day, by enforcing that None is accepted or returned only if NoneOr<> (or something similar) appears explicitly in the .clif file (note that today, NoneOr<> only works with std::optional, not pointers).
Often C/C++ APIs return a status as one return value. Python users prefer to not see a good status at all and get an exception on a bad status. To get that behavior, CLIF supports Python postprocessor functions that will take return value(s) from C++ and transform them.
The standard CLIF library comes with the following postprocessor functions:
- ValueErrorOnFalse takes the first return value as bool, drops it from the output if it is True, or raises a ValueError if it is False.
- ValueErrorOnNone raises a ValueError if any of the return values are None, corresponding to nullptr assignments.
- chr is a Python built-in function useful to convert int/uint8 (from C++ char) to a Python 1-character string.
To use a postprocessor function you must first import it [2] with a [python import][pyimport] statement, but remember to import the proper Python name, not just the module. Then use the extended def syntax as shown below:
def NAME ( INPUT_PARAMETERS ) OUTPUT_PARAMETERS:
  return PostProcessorFunction(...)
where ... are three dots verbatim for all OUTPUT_PARAMETERS to be passed as args to the PostProcessorFunction.
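Since a postprocessor is an ordinary Python function that receives the C++ return values as positional arguments, writing one is straightforward. The function below is a minimal sketch of a status-dropping postprocessor; its name and exact behavior are assumptions made for illustration and it is not the implementation shipped with CLIF.

```python
def value_error_on_false(ok, *args):
    """Illustrative postprocessor: drop a leading True, raise on False."""
    if not isinstance(ok, bool):
        raise TypeError("first return value must be a bool")
    if not ok:
        raise ValueError("C++ function returned False")
    if not args:
        return None        # nothing left after dropping the status
    if len(args) == 1:
        return args[0]     # a single remaining value is returned bare
    return args            # several remaining values stay a tuple
```

With such a function, a C++ API returning (found, value) would hand Python users just the value on success and raise on failure.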
The Python interpreter uses a Global Interpreter Lock (GIL) to serialize accesses to its internal structures. When executing C++ code, it is generally useful to release this GIL so that other threads can acquire it and execute Python code.
Asynchronous execution can take advantage of multiple cores if the C++ code does disk or network IO, or executes CPU intensive computations. It is also important to release the GIL when calling blocking functions, to avoid deadlock conditions between this GIL and another C++ lock.
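The effect of releasing the GIL can be observed without any wrapped code. In the sketch below, time.sleep stands in for a blocking C++ call: it releases the GIL while waiting, so several threads can block at the same time instead of one after another. The helper names are hypothetical and the timings are only indicative.

```python
import threading
import time


def blocking_call(seconds: float) -> None:
    # time.sleep releases the GIL while it waits, much like a wrapped
    # C++ function that runs with the GIL released.
    time.sleep(seconds)


def run_in_threads(count: int, seconds: float) -> float:
    """Runs `count` blocking calls in parallel and returns the wall time."""
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(count)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


# Four 0.2 s calls overlap, so this takes far less than the serial 0.8 s.
elapsed = run_in_threads(count=4, seconds=0.2)
```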
CLIF will release the GIL on every function call, except for:
- functions marked with @do_not_release_gil
- functions that use the object type.
You can implement a C++ virtual function in Python. To do so derive a Python class from the CLIF-wrapped C++ class and just define the member function with the proper name.
To allow a Python implementation of a derived class to be called from C++ (via a pointer to the base class) mark the function with a @virtual decorator.
Do not decorate C++ virtual methods with @virtual unless you need to implement them in Python.
The Python special methods have double underscores in their names (__dunder__) and by default expose the corresponding C++ overloaded operator. When the Python API requires them to return self, use -> self in the signature. Otherwise match the C++ signature and CLIF will try to conform to the Python API.
C++ implements operators inside or outside of the class (aka member and non-member operators). Keep such class API description Pythonic; CLIF will find the non-member operator implementation by itself. You can even use non-member functions as if they were class members, but they should take the class instance (this) as the first parameter.
For example:
struct Key {
  // If declared as friend here, it must also be defined or declared outside
  // the class.
  friend bool operator==(const Key &a, const Key &b);
};
// Declaration here (perhaps in a header file). Definition can appear elsewhere
// (perhaps in a .cc file).
bool operator==(const Key& a, const Key& b);
// Or you can provide an inline definition (for example in a header file), but
// it must be outside the friending class.
inline bool operator==(const Key& a, const Key& b) {
  // ...
}
class Key:
  def __eq__(self, other: Key) -> bool
To use a wrapped C++ class as a Python context manager, some methods must be wrapped as __enter__ and __exit__. However Python has a different calling convention. To help wrap such cases use CLIF method decorators to force the Python API:
- @__enter__ to call the wrapped method on __enter__ and return self as the context manager instance, and
- @__exit__ to take the required PEP-343 (type, value, traceback) args on __exit__, call the wrapped method with no arguments, and return None.
However if the C++ method provides the Python-needed API it can be simply renamed:
def `c_implementation_of_exit` as __exit__(self,
    type: object, value: object, trace: object) -> bool
WARNING: Be careful, when you use object CLIF assumes you know what you’re doing.
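In plain Python terms, the behavior described for the two decorators amounts to the protocol below. This is only a sketch: Resource, Open and Close are hypothetical names standing in for a wrapped C++ class and its methods, and the __enter__/__exit__ bodies are written by hand here, whereas with CLIF the decorators would supply them.

```python
class Resource:
    """Hypothetical stand-in for a wrapped C++ class."""

    def __init__(self):
        self.is_open = False

    def Open(self):
        self.is_open = True

    def Close(self):
        self.is_open = False

    # Roughly what decorating Open with @__enter__ provides:
    # call the wrapped method, then return self.
    def __enter__(self):
        self.Open()
        return self

    # Roughly what decorating Close with @__exit__ provides: accept the
    # PEP 343 arguments, call Close() with no arguments, return None.
    def __exit__(self, exc_type, exc_value, traceback):
        self.Close()
        return None


with Resource() as r:
    opened_inside = r.is_open
```

Because __exit__ returns None, an exception raised inside the with block is not suppressed.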
The const statement describes a C++ global or member constant (static const or constexpr).
const NAME: TYPE
It also makes sense to rename the constant to make it Python-style conformant:
const `kNumTries` as NUM_TRIES: int
The enum statement describes a C++ enum or enum class. This form will take all enum values under the same names as they are known to C++.
enum NAME
It also makes sense to rename enum values to match expected Python style:
enum NAME with:
`kDefault` as DEFAULT
`kOptionOne` as OPTION_ONE
C++ enums will be presented as Python Enum or IntEnum [3] classes from the standard enum module, backported to Python 2.7 [4].
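The choice between Enum and IntEnum matters to Python callers, because only IntEnum members behave like integers. The classes below are hand-written stand-ins (Color and Flag are hypothetical names, not CLIF output) that show the difference using the standard enum module.

```python
import enum


class Color(enum.Enum):       # stand-in for a wrapped C++ `enum class`
    RED = 1


class Flag(enum.IntEnum):     # stand-in for a wrapped old-style C++ `enum`
    ON = 1


# Enum members never compare equal to plain integers...
enum_equals_int = (Color.RED == 1)
# ...while IntEnum members do, and also take part in integer arithmetic.
int_enum_equals_int = (Flag.ON == 1)
int_enum_sum = Flag.ON + 1
```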
The class statement describes a C++ struct or class. It must have an indented block describing what class members are wrapped. That block can have all the statements that the [from][from] block has and a [var][var] statement for member variables.
Each member method should have a specific first argument:
- self for regular C++ member functions
- cls for static C++ member functions
The first argument (self/cls) should not have any type as the type is implicit (it’s the class that the function is a member of).
Also static member functions should have the @classmethod decorator or be moved to the module level with a [staticmethods][staticmethods] statement.
class MyClass:
  def __init__(self, param: int, another_param: float)
  def Method(self) -> dict<int, int>
  @classmethod
  def StaticMethod(cls, param: int) -> MyClass
TIP: Always use Python module-level functions for exposing class static member functions unless you have a very good reason not to.
The above snippet is better written as:
class MyClass:
  def __init__(self, param: int, another_param: float)
  def Method(self) -> dict<int, int>
staticmethods from `MyClass`:
  def StaticMethod(param: int) -> MyClass
CLIF inheritance specification does not need to follow the C++ inheritance relationship. Only specify the base class if it is important for the Python API. CLIF is capable of figuring out C++ inheritance details even if the .clif file does not explicitly list them.
If the C++ class has no parent, no parent should be in the CLIF specification. If the C++ class has a parent but it’s of no interest to a Python user, the parent also should be omitted and relevant parent methods should be listed in the child class CLIF specification.
class Parent {
 public:
  virtual void Something() = 0;
  void SomethingInteresting();
};
class Child : public Parent {
 public:
  void Useful();
};
A CLIF specification for that might look like the following.
class Child:
  def SomethingInteresting(self)
  def Useful(self)
If the parent C++ class is already wrapped in another .clif file, use a Python-style import to define it as a base class, for example:
from full.path.to.another.python.wrapper import Parent
from "cpp/include/path/to/child.h":
  class Child(Parent):
Note that Python-style imports only enable defining base classes. An additional C-style import is needed if a parent C++ class also appears as a return type or argument type, for example:
from "full/path/to/another/python/wrapper_clif.h" import *
from full.path.to.another.python.wrapper import Parent
from "cpp/include/path/to/child.h":
  class Child(Parent): # Needs the Python-style import.
    def SomeMethod(self) -> Parent # Needs the C-style import.
Multiple inheritance in CLIF declaration is prohibited, but the C++ class being wrapped may have multiple parents according to the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html#Multiple_Inheritance).
When wrapping a class don’t forget to describe its constructor (unless the default C++ constructor suffices, then def __init__(self) is redundant). Note that Python does not have function overloading, so a class can have only one constructor. Select the most useful one to expose as the class constructor.
class Foo {
 public:
  Foo();
  Foo(int special);
};
The default constructor will be unavailable if you define another constructor for Python like the following.
class Foo:
  def __init__(self, special: int)
Additional C++ constructors can be exposed by using the @add__init__ decorator. This will create a Python static method in the class as an alternative constructor.
class Foo:
  def __init__(self, special: int)
  @add__init__
  def Default(self) # wraps Foo::Foo() constructor as Foo.Default()
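From Python, an alternative constructor is called on the class itself rather than on an instance. The pure-Python class below is only a sketch of that calling pattern: it is a hand-written stand-in for the wrapped Foo, and the value that Default() uses for special is an arbitrary choice made for this illustration.

```python
class Foo:
    """Hypothetical Python view of the wrapped class."""

    def __init__(self, special: int):
        self.special = special

    @staticmethod
    def Default() -> "Foo":
        # Stand-in for the C++ default constructor Foo::Foo(); the value 0
        # is an assumption of this sketch, not something CLIF defines.
        return Foo(0)


primary = Foo(3)             # the constructor exposed as __init__
alternative = Foo.Default()  # the alternative constructor
```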
When you need to wrap several instantiations of a template class, you may skip repeating the template API in each class wrapper by using the interface.
The interface statement describes the C++ template class API, so that instantiations can simply refer to it.
To declare the API use interface instead of class. The names in <> are the template parameters and will be replaced with actual typenames during the class instantiation.
interface ProtoCache<Query, QueryResponse>:
  size_bytes: int = property(`size_bytes`)
  def Get(self, key: Query) -> (found: bool, val: QueryResponse):
    return ValueErrorOnFalse(...)
  def Put(self, key: Query, val: QueryResponse)
  def Clear(self)
To consume the API use the implements statement in the class, providing actual typenames for the interface parameters.
from "py/proto_cache.h":
  class SampleBatchQueryProtoCache:
    implements ProtoCache<SampleBatchQuery, SampleBatchQueryResponse>
  class RunInfoQueryProtoCache:
    implements ProtoCache<RunInfoQuery, RunInfoQueryResponse>
Currently the template class has to be defined in the same header file as the instantiations.
A C++ class with a std::iterator compatible implementation can be iterable in Python.
To declare the class iterable include a nested class with the Python name __iter__:
class I_Want_This_Class_To_Be_Iterable:
  class `iterator` as __iter__:
    def __next__(self) -> int
The __iter__ class must declare exactly one method __next__ that returns the type that *iterator has (typically named the value_type in C++).
TIP: Usually you want to wrap a const_iterator.
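On the Python side, an iterable class follows the standard iterator protocol: iter() yields an object whose __next__ returns one value at a time and raises StopIteration at the end. The sketch below spells that protocol out by hand; Numbers and its nested _Iterator are hypothetical names and stand in for a wrapped container and its C++ iterator.

```python
class Numbers:
    """Hypothetical stand-in for a wrapped iterable C++ container."""

    class _Iterator:
        def __init__(self, values):
            self._values = values
            self._pos = 0

        def __iter__(self):
            return self

        def __next__(self) -> int:
            # Mirrors dereferencing and advancing a C++ iterator until end().
            if self._pos >= len(self._values):
                raise StopIteration
            value = self._values[self._pos]
            self._pos += 1
            return value

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return Numbers._Iterator(self._values)


collected = [n for n in Numbers([1, 2, 3])]
```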
The var statement describes a C++ public member variable.
Note that var is the only statement that has no keyword.
NAME: TYPE
A variable can be any addressable member of C++ class/struct that is not static. To circumvent this restriction use property, described below.
Python receives a copy of a C++ variable value on each attribute access. This is counterintuitive to how most people think about Python as such an attribute access is not a simple reference.
In case of containers, updating that copy without reassigning it back into the class variable will not change the class variable value.
myclass.tags.append("manual") # Does not update wrapped myclass.tags!
# To update it, the assignment must be explicit:
myclass.tags += ["manual"]
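The copy-on-access behavior can be reproduced in pure Python with a property that returns a fresh list on every read. The class below is a hypothetical stand-in for a wrapped class with a tags member; it is meant only to show why append has no lasting effect while += does.

```python
class MyClass:
    """Hypothetical stand-in whose `tags` attribute mimics a CLIF var."""

    def __init__(self):
        self._cpp_tags = []            # plays the role of the C++ member

    @property
    def tags(self):
        return list(self._cpp_tags)    # every read hands out a fresh copy

    @tags.setter
    def tags(self, value):
        self._cpp_tags = list(value)   # every write copies back


myclass = MyClass()
myclass.tags.append("manual")          # mutates a temporary copy only
after_append = myclass.tags
myclass.tags += ["manual"]             # read, extend, then assign back
after_assign = myclass.tags
```

The += form works because Python expands it into a read of the attribute, an in-place extend of the copy, and an assignment of the result back to the attribute.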
To remind the user about the copy instead of letting them incorrectly assume that an attribute access is a reference, you might want to use @getter (and @setter) function decorators to declare Python methods to get (and set) the C++ variable instead of exposing an attribute. That can be thought of as the reverse of the property feature seen below. Both the getter and setter must use the C++ variable name as the C++ name of the function.
For example the following C++ class
struct Stat {
  struct Options {
    int length;
  };
  Options opt;
};
can be wrapped as
class Stat:
  class Options:
    length: int
  @getter
  def `opt` as get_options(self) -> Options
  @setter
  def `opt` as set_options(self, o: Options)
If a C++ class has getters and setters, consider using them as Python property rather than calling getters and setters as functions from Python. Direct access to instance variables is more Pythonic and makes programs more readable.
NAME: TYPE = property(`getter`, `setter`)
The getter is a C++ function returning TYPE (TYPE getter();) and the setter is a C++ function taking TYPE (void setter(TYPE);). To have a read-only property just use only the getter.
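The CLIF property syntax mirrors the Python built-in of the same name. As a rough analogy in plain Python (Counter and its methods are hypothetical names, not wrapped code), a getter/setter pair becomes a read-write attribute, and a getter alone becomes a read-only one.

```python
class Counter:
    """Hypothetical class with C++-style accessor methods."""

    def __init__(self):
        self._count = 0

    def get_count(self) -> int:
        return self._count

    def set_count(self, value: int) -> None:
        self._count = value

    # Getter and setter together give a read-write attribute.
    count = property(get_count, set_count)
    # The getter alone gives a read-only attribute.
    total = property(get_count)


c = Counter()
c.count = 3          # routed through set_count
current = c.count    # routed through get_count
```

Assigning to c.total would raise an AttributeError, since no setter was supplied.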
The var statement is most useful in describing plain C structs. If we have a struct with mostly data members, it can be described as
from "file/base/fileproperties_pyclif.h" import *
from file.base.fileproperties_pb2 import FileProperties
from "file/base/filestat.h":
  class FileStat:
    length: int
    mtime: `time_t` as int
    # ...
    properties: FileProperties = property(`file_properties`)
    def IsDirectory(self) -> bool
    def Clear(self)
The staticmethods statement facilitates wrapping class static member functions. It has a nested block that can only contain def statements. Like the namespace statement, this statement limits where CLIF can find the function, i.e. it searches only inside the named class.
from "some/path/my_library.h":
  staticmethods from `Foo`:
    def Bar()
    def Baz()
In that example Foo::Bar and Foo::Baz must be static members of class Foo and will be wrapped as module-level functions some.path.my_library.Bar and some.path.my_library.Baz.
TIP: The C++ class name can be fully qualified.
The pass statement allows you to wrap a C++ class without any API. It has two use cases:
- a “capsule with memory management”, i.e. allow instance destruction if it was owned by Python.
- a derived class that adds no API of its own:
from "some/file.h":
  class Base:
    def SomeApi(self)
  class Derived(Base):
    pass
In that example Derived has the same API as Base, i.e. SomeApi(), but may have a different C++ implementation, which is useful for testing.
The use statement reassigns a default C++ type for a given CLIF type:
use `std::string` as str
This statement is rarely needed. See more on types below.
CLIF uses Python types in API descriptions (.clif files). Generally it’s CLIF’s job to find the corresponding C++ types automatically. However, it is common that multiple C++ types are converted to the same Python type, e.g. C++ std::unordered_set and std::set are both converted to the Python set type.
In such situations only one of the conversions will work implicitly (this is a limitation of the implementation), while all others need to be specified explicitly, e.g.:
C++:
void pass_unordered_set_int(const std::unordered_set<int>& values);
std::unordered_set<int> return_unordered_set_int();
void pass_set_int(const std::set<int>& values);
std::set<int> return_set_int();
.clif:
def pass_unordered_set_int(values: set<int>)
def return_unordered_set_int() -> set<int>
def pass_set_int(values: `std::set` as set<int>)
def return_set_int() -> `std::set` as set<int>
What works implicitly can be customized with the [use][use] statement.
The syntax for nested types is, e.g.:
C++:
void pass_set_list_int(const std::set<std::list<int>>& clusters);
.clif:
def pass_set_list_int(clusters: `std::set` as set<`std::list` as list<int>>)
Note that the backtick syntax also works for simpler types, e.g.:
C++:
void pass_size_t(std::size_t value);
.clif:
def pass_size_t(value: `std::size_t` as int)
However, in most cases the simpler
def pass_size_t(value: int)
will also work, if there is an implicit C++ conversion (in this example between std::size_t and int).
NOTE: CLIF will reject unknown types and produce an error. It can be parse-time error for CLIF types or compile-time error for C++ types.
CLIF knows some basic types (predefined via clif/python/types.h) including:
Default C++ type | CLIF type [5] |
---|---|
int | int |
string | bytes or str |
bool | bool |
double | float |
complex<> | complex |
vector<> | list<> |
pair<> | tuple<> |
unordered_set<> | set<> |
unordered_map<> | dict<> |
std::function<R(T, U)> | (t: T, u: U) -> R |
PyObject* | object [6] |
Note: Default in the header row above means that the C++ type does not have to be specified explicitly in .clif files (unless a use statement changes the default).
CLIF also knows how to handle various other types including:
C++ type | CLIF type |
---|---|
[u]intXX_t (e.g. int8_t) | int |
float | float |
map | dict |
set | set |
list, array, stack | list |
deque, queue, priority_queue | list |
const char* (as return value only) | str (bytes is not supported) |
Please note that we want the C++ API to be explicit and while C++ does not distinguish between bytes and unicode, Python does. It means that Python .clif files must specify what exact type (bytes or unicode) the C++ code expects or produces.
However, CLIF always takes Python unicode and implicitly encodes it using UTF-8 for C++. To get unicode back to Python 2, use unicode as the return datatype. In Python 3, str gets converted to unicode automatically.
That can be summarized as below.
CLIF type | On input | On output CLIF returns |
---|---|---|
bytes | (*) | bytes |
str | (*) | native str |
unicode | (*) | unicode |
(*) CLIF will take bytes or unicode Python object and pass [UTF-8 encoded] data to C++.
UTF-8 encoding assumed on C++ side.
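The round trip summarized in the table can be illustrated with ordinary Python 3 string operations. This sketch involves no wrapped code; the variable names are hypothetical and simply label what each side would hold, assuming UTF-8 on the C++ side as stated above.

```python
text = "na\u00efve"               # a Python 3 str holding a non-ASCII letter

# On input, text is UTF-8 encoded before it reaches C++.
wire = text.encode("utf-8")

# On output, a `bytes` return type hands the raw data back unchanged...
as_bytes = wire
# ...while a `str` return type decodes it as UTF-8 again.
as_str = wire.decode("utf-8")
```

Note that the encoded form is longer than the text (6 bytes for 5 characters), which matters when C++ code reports sizes or offsets.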
[1] When exposing a C++ function as __len__ make sure it only returns non-negative numbers or Python will raise a SystemError.
[2] Except chr, which is already ‘imported’ by CLIF.
[3] C++11 enum class is converted to Enum, old-style enum to IntEnum.
[4] https://pypi.python.org/pypi/enum34
[5] CLIF types are named after the corresponding Python types.
[6] Be careful when you use object; CLIF assumes you know what you’re doing with the Python C API and all its caveats.