Python CLIF

To describe a C++ API in Python, CLIF uses an Interface Description Language (IDL), a modified PYTD language, described below.

Example

CLIF takes the C++ API description from a .clif file:

# 1. Load FileStat wrapper class from another CLIF wrapper.
from "file/base/python/filestat_clif.h" import *  # FileStat
# 2. Load Status from hand-written C++ CLIF extension library.
from "util/task/python/clif.h" import *  # Status
# 3. Load Options protobuf from generated C++ CLIF extension library.
from "file/base/options_pyclif.h" import *
# 4. Load pure-Python postprocessor function for Status
from devtools.clif.python.postproc import DropOkStatus
# 5. Load pure-Python types for CLIF wrapper (for improved PyType, when
#    available).
from file.base.python.filestat import FileStat
# 6. Load pure-Python types for generated Python proto library (for improved
#    PyType, when available).
from file.base.options_pb2 import Options

from "file/base/filesystem.h":
  namespace `file`:
    def ForEachMatch(pattern: str, options: Options,
                     match_handler: (filename:str, fs:FileStat)->bool
                    ) -> Status:
      return DropOkStatus(...)
    # define other API items here (if needed)

In the example, import 1 gets a class wrapped by another CLIF module, import 2 gets a custom wrapper for Status and StatusOr, and import 3 gets a wrapped options.proto (generated by the pyclif_proto_library BUILD rule).

Note: The callback signature above matches std::function<bool (StringPiece, file::Stats)>.

API specification language

From that example we see that a .clif file has two sections:

  1. Preparation specifies which CLIF extension libraries are needed and what C++ library we are wrapping. It can have [c header import][cimport], [python import][pyimport], [namespace][namespace] and [use][use] statements.

  2. API description starts with a from statement that points to the C++ header file we wrap and has an indented block describing the API. That block may contain statements such as def, const, enum, class and staticmethods, which are described below.

NOTE: Some features are “experimental”, which means they can be changed or removed in future releases.

c_header import statement

The c header import statement makes types wrapped by another CLIF rule or by a C++ CLIF extension library available to use in this .clif file. Such a library can be written by hand or generated by a tool (for example the CLIF protobuf wrapper, which generates a cc_library CLIF extension).

from "cpp/include/path/to/aCLIF/extension/library.h" import *

Note that c header import requires a double-quoted string, exactly as in the C++ #include directive.

Use the c header import statement to inform CLIF about wrapped C++ types that need to be available in the module being wrapped.

If you don’t want to pollute the .clif namespace with all names from that header, you can prefix imported names with a variant of the include statement:

from "some/header.h" import * as prefix_name

Now all CLIF types defined in the header.h (with CLIF use `ctype` as clif_type) will be available as prefix_name.clif_type.

python import statement

The python import statement is a normal Python import that makes a library symbol available within the .clif file. Only a single-symbol import is allowed (not a module import). All imports must be absolute.

from path.to.project.library.module import SomeClassOrFunction

This statement is typically used to load a Python [postprocessing function][postprocessing].

OPTION statement

The general form of the OPTION statement is:

OPTION name = value

However, currently the only available OPTION is:

OPTION is_extended_from_python = True

This OPTION is important when wrapped C++ types are extended from Python (go/clif-primer#py_library_wrapper), which involves a private module, i.e. a module with a name that has a leading underscore (see [Wrapping a C++ library][WrappingACppLibrary] above), and a matching py_library, for example:

py_clif_cc(name="_mylib")
py_library(name="mylib")

The is_extended_from_python OPTION controls which of these is imported from other py_clif_cc modules, for example:

py_clif_cc(name="myapp", clif_deps=["_mylib"], py_deps=["mylib"])

With OPTION is_extended_from_python = True, the PyCLIF-generated myapp module will never import _mylib directly, but always import mylib. This ensures that all Python-side customizations are applied.

from statement

The from statement tells CLIF what library file to wrap. This statement allows top-level API name lookup in any namespace in the specified file.

from "cpp/include/path/to/some/library.h":
  # API description statements

namespace statement

The namespace statement tells CLIF what C++ namespace to use (backquotes are required around the C++ name). That namespace must be declared in the from‘d file. This statement limits top-level API name lookup to the specified namespace.

from "cpp/include/path/to/some/library.h":
  namespace `my::namespace`:
    def Name()  # API description statements

WARNING: Namespace statements can’t be nested.

def statement

The def statement describes a C++ function (or member function).

def NAME ( INPUT_PARAMETERS ) OUTPUT_PARAMETERS

It has three main parts: the name, input parameters and output parameters.

NAME can be a simple alphanumeric name when we want it to be the same in C++ and Python. In some cases we want or need to give the C++ name a different name in the Python wrapper. In those cases the rename construct can be used:

`cplusplus_name` as python_name

For example `size` as __len__ [1] or `pass` as pass_. Such renaming can occur everywhere a NAME is used.

INPUT_PARAMETERS describes values to be converted from Python and passed to the C++ function. It is a (potentially empty) comma-separated list of name:type pairs, e.g. x:int, descriptive_name:str. Both name and type are required (only self in class methods has no type). For a type you use a Python standard type. Python containers should also be typed (like list<int> or dict<bytes, int>).

Tip: If a C++ argument has a default (i.e. an = value clause), it can also be optional in PYTD. Just add =default to its name:type specification.

OUTPUT_PARAMETERS are more complex: they may be empty, a single -> type, or a parenthesized list of name: type pairs after ->, as the examples below show.

By Google convention, a C++ signature should have all input parameters before any output parameter(s). The first output parameter is the function return value and the others are listed after the inputs as C++ TYPE* (pointer to output type). CLIF does not allow you to violate those conventions. To circumvent that restriction, write a helper C++ function and wrap it instead.

For example:

C++ function      described as
void F()          def F()
int F()           def F() -> int
void F(int)       def F(name_is_mandatory: int)
int F(int)        def F(name_is_mandatory: int) -> int
int F(string*)    def F() -> (code: int, message: str)
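
From the Python side, a function with several output parameters returns them together as a tuple. The sketch below uses a pure-Python stand-in for the last row of the table; the function body and the values are hypothetical, since a real wrapper would call into C++:

```python
# Hypothetical stand-in for a wrapper described as
#   def F() -> (code: int, message: str)
# The real wrapper would call C++ `int F(string*)`; here the result is faked.
def F():
    code = 0        # plays the role of the C++ return value
    message = "ok"  # plays the role of the string* output parameter
    return code, message


# Callers unpack the return value and the output parameter together.
code, message = F()
print(code, message)
```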

Pointers, references and object ownership

Parameter / Return Value Type    Ownership
std::unique_ptr                  transferred
std::shared_ptr                  shared
const T&                         create a copy
T&                               create a copy
raw pointer                      borrowed

C++ functions with output parameters or return values of type std::unique_ptr transfer object ownership to Python, std::shared_ptr shares ownership between C++ and Python, while const T& and T& are copied.

C++ functions with std::unique_ptr input parameters transfer ownership to C++, std::shared_ptr shares ownership between C++ and Python, while const T& and T& are copied.

Raw pointers are always assumed to be borrowed.

If a different convention was used, one can create a wrapper to implement the desired behavior. If compatible overloaded functions exist, CLIF will prefer the std::unique_ptr alternative.

None is converted to nullptr and vice versa in many but not all situations. However, ideally we’d change this behavior some day, by enforcing that None is accepted or returned only if NoneOr<> (or something similar) appears explicitly in the .clif file (note that today, NoneOr<> only works with std::optional, not pointers).

Postprocessing

Often C/C++ APIs return a status as one return value. Python users prefer to not see a good status at all and get an exception on a bad status. To get that behavior, CLIF supports Python postprocessor functions that will take return value(s) from C++ and transform them.

The standard CLIF library comes with the following postprocessor functions:

To use a postprocessor function you must first import it [2] with a [python import][pyimport] statement; remember to import the proper Python name, not just the module. Then use the extended def syntax as shown below:

def NAME ( INPUT_PARAMETERS ) OUTPUT_PARAMETERS:
  return PostProcessorFunction(...)

where ... are three dots verbatim for all OUTPUT_PARAMETERS to be passed as args to the PostProcessorFunction.
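
As a rough illustration of what a postprocessor does, here is a pure-Python sketch in the spirit of the ValueErrorOnFalse function used elsewhere in this document. The name value_error_on_false and its exact behavior are assumptions made for illustration, not the CLIF library implementation:

```python
def value_error_on_false(ok, *values):
    """Hypothetical postprocessor: drop a leading bool status, raise if False."""
    if not ok:
        raise ValueError("C++ call reported failure")
    if not values:
        return None
    # A single remaining value is returned as-is, several as a tuple.
    return values[0] if len(values) == 1 else values


# The wrapper would pass all OUTPUT_PARAMETERS as positional arguments.
print(value_error_on_false(True, "cached"))
```

With a function like this the Python caller never sees the status flag: success yields only the payload and failure surfaces as an exception.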

Asynchronous execution

The Python interpreter uses a Global Interpreter Lock (GIL) to serialize accesses to its internal structures. When executing C++ code, it is generally useful to release this GIL so that other threads can acquire it and execute Python code.

Asynchronous execution can take advantage of multiple cores if the C++ code does disk or network IO, or executes CPU intensive computations. It is also important to release the GIL when calling blocking functions, to avoid deadlock conditions between this GIL and another C++ lock.
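
The effect of releasing the GIL can be seen without CLIF. In this sketch time.sleep stands in for a blocking C++ call (it also releases the GIL while waiting), so four threads overlap instead of running back to back. The function names and the timings are illustrative assumptions:

```python
import threading
import time


def blocking_call(seconds):
    # Stand-in for a wrapped C++ function that blocks with the GIL released.
    time.sleep(seconds)


def run_in_threads(n_threads, seconds):
    """Run blocking_call in n_threads threads and return the elapsed time."""
    threads = [threading.Thread(target=blocking_call, args=(seconds,))
               for _ in range(n_threads)]
    start = time.monotonic()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.monotonic() - start


elapsed = run_in_threads(4, 0.2)
print(f"4 threads took {elapsed:.2f}s")  # roughly one sleep, not four
```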

CLIF will release the GIL on every function call, except for:

Implementing (virtual) methods in Python

You can implement a C++ virtual function in Python. To do so, derive a Python class from the CLIF-wrapped C++ class and define the member function with the proper name.

To allow a Python implementation of a derived class to be called from C++ (via a pointer to the base class) mark the function with a @virtual decorator.

Do not decorate C++ virtual methods with @virtual unless you need to implement them in Python.

Implementing special Python methods

The Python special methods have double underscores in their names (__dunder__) and by default expose the corresponding C++ overloaded operator. When the Python API requires them to return self, use -> self in the signature. Otherwise match the C++ signature and CLIF will try to conform to the Python API.

C++ implements operators inside or outside of the class (member and non-member operators). Keep the class API description Pythonic; CLIF will find the non-member operator implementation by itself. You can even use non-member functions as if they were class members, but they should take the class instance (this) as the first parameter.

For example:

struct Key {
  // If declared as friend here, it must also be defined or declared outside
  // the class.
  friend bool operator==(const Key &a, const Key &b);
};

// Declaration here (perhaps in a header file).  Definition can appear elsewhere
// (perhaps in a .cc file).
bool operator==(const Key& a, const Key& b);

// Or you can provide an inline definition (for example in a header file), but
// it must be outside the friending class.
inline bool operator==(const Key& a, const Key& b) {
  // ...
}

.clif:

class Key:
  def __eq__(self, other: Key) -> bool

Context manager

To use a wrapped C++ class as a Python context manager, some methods must be wrapped as __enter__ and __exit__. However Python has a different calling convention. To help wrap such cases use CLIF method decorators to force the Python API:

However if the C++ method provides the Python-needed API it can be simply renamed:

def `c_implementation_of_exit` as __exit__(self,
    type: object, value: object, trace: object) -> bool

WARNING: Be careful: when you use object, CLIF assumes you know what you’re doing.
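
For reference, the Python calling convention that __enter__ and __exit__ must satisfy looks like this. Resource is a pure-Python stand-in for a wrapped class, and its method bodies are illustrative assumptions:

```python
class Resource:
    """Stand-in for a wrapped C++ class used as a context manager."""

    def __init__(self):
        self.open = False

    def __enter__(self):
        self.open = True
        return self  # value bound by `with ... as r`

    def __exit__(self, type, value, trace):
        self.open = False
        return False  # False: do not suppress an in-flight exception


with Resource() as r:
    inside = r.open
print(inside, r.open)
```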

const statement

The const statement describes a C++ global or member constant (static const or constexpr).

const NAME: TYPE

It also makes sense to rename the constant to make it Python-style conformant:

const `kNumTries` as NUM_TRIES: int

enum statement

The enum statement describes a C++ enum or enum class. This form will take all enum values under the same names as they are known to C++.

enum NAME

It also makes sense to rename enum values to match expected Python style:

enum NAME with:
  `kDefault` as DEFAULT
  `kOptionOne` as OPTION_ONE

C++ enums will be presented as Python Enum or IntEnum [3] classes from the standard enum module (backported to Python 2.7 [4]).
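
The difference between the two Python base classes matters to callers: IntEnum members compare equal to plain integers, Enum members do not. A standard-library sketch, where the class names and values are made up for illustration:

```python
import enum


class Mode(enum.Enum):           # how a C++11 enum class might appear
    DEFAULT = 0
    OPTION_ONE = 1


class LegacyMode(enum.IntEnum):  # how an old-style C++ enum might appear
    DEFAULT = 0
    OPTION_ONE = 1


print(Mode.OPTION_ONE == 1)        # False: Enum members are not ints
print(LegacyMode.OPTION_ONE == 1)  # True: IntEnum members behave like ints
```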

class statement

The class statement describes a C++ struct or class. It must have an indented block describing what class members are wrapped. That block can have all the statements that the [from][from] block has and a [var][var] statement for member variables.

Each member method should have a specific first argument: self for instance methods and cls for methods decorated with @classmethod.

The first argument (self/cls) should not have any type as the type is implicit (it’s the class that the function is a member of).

Also, static member functions should have a @classmethod decorator or be moved to the module level with a [staticmethods][staticmethods] statement.

class MyClass:
  def __init__(self, param: int, another_param: float)
  def Method(self) -> dict<int, int>
  @classmethod
  def StaticMethod(cls, param: int) -> MyClass

TIP: Always use Python module-level functions for exposing class static member functions unless you have a very good reason not to.

The above snippet is better written as:

class MyClass:
  def __init__(self, param: int, another_param: float)
  def Method(self) -> dict<int, int>
staticmethods from `MyClass`:
  def StaticMethod(param: int) -> MyClass

Inheritance

CLIF inheritance specification does not need to follow the C++ inheritance relationship. Only specify the base class if it is important for the Python API. CLIF is capable of figuring out C++ inheritance details even if the .clif file does not explicitly list them.

If the C++ class has no parent, no parent should be in the CLIF specification. If the C++ class has a parent but it’s of no interest to a Python user, the parent also should be omitted and relevant parent methods should be listed in the child class CLIF specification.

class Parent {
 public:
  virtual void Something() = 0;
  void SomethingInteresting();
};

class Child : public Parent {
 public:
  void Useful();
};

A CLIF specification for that might look like the following.

class Child:
  def SomethingInteresting(self)
  def Useful(self)

If the parent C++ class is already wrapped in another .clif file, use a Python-style import to define it as a base class, for example:

from full.path.to.another.python.wrapper import Parent
from "cpp/include/path/to/child.h":
  class Child(Parent):

Note that Python-style imports only enable defining base classes. An additional C-style import is needed if a parent C++ class also appears as a return type or argument type, for example:

from "full/path/to/another/python/wrapper_clif.h" import *
from full.path.to.another.python.wrapper import Parent
from "cpp/include/path/to/child.h":
  class Child(Parent):  # Needs the Python-style import.
    def SomeMethod(self) -> Parent  # Needs the C-style import.

Multiple inheritance in a CLIF declaration is prohibited, but the C++ class being wrapped may have multiple parents according to the [Google C++ Style Guide](https://google.github.io/styleguide/cppguide.html#Multiple_Inheritance).

Constructors

When wrapping a class, don’t forget to describe its constructor (unless the default C++ constructor suffices, in which case def __init__(self) is redundant). Note that Python does not have function overloading, so a class can have only one constructor. Select the most useful one to expose as the class constructor.

class Foo {
 public:
  Foo();
  Foo(int special);
};

The default constructor will be unavailable if you define another constructor for Python like the following.

class Foo:
  def __init__(self, special: int)

Additional C++ constructors can be exposed by using the @add__init__ decorator. This will create a Python static method in the class as an alternative constructor.

class Foo:
  def __init__(self, special: int)
  @add__init__
  def Default(self)  # wraps Foo::Foo() constructor as Foo.Default()
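
Seen from Python, the result behaves roughly like the stand-in below: the regular constructor takes special, and Default is an extra static factory. This is a pure-Python imitation for illustration (the attribute special and its placeholder value are assumptions), not generated code:

```python
class Foo:
    """Pure-Python imitation of the wrapped class above."""

    def __init__(self, special):
        self.special = special

    @staticmethod
    def Default():
        # Imitates the C++ default constructor Foo::Foo().
        obj = Foo.__new__(Foo)
        obj.special = None  # assumed placeholder; C++ decides the real state
        return obj


a = Foo(7)
b = Foo.Default()
print(type(a) is type(b))
```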

Interface (experimental)

When you need to wrap several instantiations of a template class, you may skip repeating the template API in each class wrapper by using the interface.

The interface statement describes the C++ template class API, so that instantiations can simply refer to it.

To declare the API use interface instead of class. The names in <> are the template parameters and will be replaced with actual typenames during the class instantiation.

interface ProtoCache<Query, QueryResponse>:
  size_bytes: int = property(`size_bytes`)
  def Get(self, key: Query) -> (found: bool, val: QueryResponse):
    return ValueErrorOnFalse(...)
  def Put(self, key: Query, val: QueryResponse)
  def Clear(self)

To consume the API use the implements statement in the class, providing actual typenames for the interface parameters.

from "py/proto_cache.h":

  class SampleBatchQueryProtoCache:
    implements ProtoCache<SampleBatchQuery, SampleBatchQueryResponse>

  class RunInfoQueryProtoCache:
    implements ProtoCache<RunInfoQuery, RunInfoQueryResponse>

Currently the template class has to be defined in the same header file as the instantiations.

Iterator (experimental)

A C++ class with a std::iterator compatible implementation can be iterable in Python.

To declare the class iterable include a nested class with the Python name __iter__:

class I_Want_This_Class_To_Be_Iterable:
  class `iterator` as __iter__:
    def __next__(self) -> int

The __iter__ class must declare exactly one method __next__ that returns the type that *iterator has (typically named the value_type in C++).

TIP: Usually you want to wrap a const_iterator.
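
On the Python side the wrapped class then follows the normal iterator protocol. The stand-in below mirrors that shape with hypothetical names: __iter__ returns an object whose __next__ yields successive values and raises StopIteration at the end:

```python
class _Iter:
    def __init__(self, values):
        self._values = values
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._values):
            raise StopIteration  # corresponds to reaching end() in C++
        value = self._values[self._pos]
        self._pos += 1
        return value


class Numbers:
    """Stand-in for an iterable wrapped class."""

    def __init__(self, values):
        self._values = list(values)

    def __iter__(self):
        return _Iter(self._values)


print(list(Numbers([1, 2, 3])))
```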

var statement

The var statement describes a C++ public member variable.

Note that var is the only statement that has no keyword.

NAME: TYPE

A variable can be any addressable member of a C++ class/struct that is not static. To circumvent this restriction use a property, described below.

Python receives a copy of a C++ variable value on each attribute access. This is counterintuitive to how most people think about Python, as such an attribute access is not a simple reference.

In the case of containers, updating that copy without reassigning it back into the class variable will not change the class variable value.

myclass.tags.append("manual")  # Does not update wrapped myclass.tags!

# To update it, the assignment must be explicit:
myclass.tags += ["manual"]
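
The copy semantics can be imitated in pure Python with a property that copies on every read and write. MyClass here is a hypothetical stand-in, not CLIF-generated code:

```python
class MyClass:
    """Stand-in whose `tags` attribute copies on every access."""

    def __init__(self):
        self._tags = []  # plays the role of the C++ member

    @property
    def tags(self):
        return list(self._tags)  # each read returns a fresh copy

    @tags.setter
    def tags(self, value):
        self._tags = list(value)


myclass = MyClass()
myclass.tags.append("manual")  # mutates a temporary copy only
after_append = myclass.tags
myclass.tags += ["manual"]     # read a copy, extend it, assign it back
after_assign = myclass.tags
print(after_append, after_assign)
```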

Un-property

To remind the user about the copy instead of letting them incorrectly assume that an attribute access is a reference, you might want to use @getter (and @setter) function decorators to declare Python methods to get (and set) the C++ variable instead of exposing an attribute. That can be thought of as the reverse of the property feature seen below. Both the getter and setter must use the C++ variable name as the C++ name of the function.

For example the following C++ class

struct Stat {
  struct Options {
    int length;
  };
  Options opt;
};

can be wrapped as

  class Stat:
    class Options:
      length: int
    @getter
    def `opt` as get_options(self) -> Options
    @setter
    def `opt` as set_options(self, o: Options)

Property

If a C++ class has getters and setters, consider exposing them as a Python property rather than calling the getters and setters as functions from Python. Direct access to instance variables is more Pythonic and makes programs more readable.

NAME: TYPE = property(`getter`, `setter`)

The getter is a C++ function returning TYPE (TYPE getter();) and the setter is a C++ function taking TYPE (void setter(TYPE);). To have a read-only property just use only the getter.
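
The same mapping exists in plain Python via the built-in property(). In this stand-in the getter and setter methods play the role of the C++ member functions; all names are illustrative:

```python
class Cache:
    def __init__(self):
        self._size_bytes = 0

    # These two methods stand in for the C++ getter and setter.
    def _get_size_bytes(self):
        return self._size_bytes

    def _set_size_bytes(self, value):
        self._size_bytes = value

    size_bytes = property(_get_size_bytes, _set_size_bytes)
    # Read-only variant: pass only the getter.
    size_bytes_ro = property(_get_size_bytes)


c = Cache()
c.size_bytes = 128
print(c.size_bytes, c.size_bytes_ro)
```

Assigning to size_bytes_ro raises AttributeError, which is the behavior a getter-only property gives Python users.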

The var statement is most useful in describing plain C structs. If we have a struct with mostly data members, it can be described as

from "file/base/fileproperties_pyclif.h" import *

from file.base.fileproperties_pb2 import FileProperties

from "file/base/filestat.h":

  class FileStat:
    length: int
    mtime:  `time_t` as int
    # ...
    properties: FileProperties = property(`file_properties`)
    def IsDirectory(self) -> bool
    def Clear(self)

staticmethods statement

The staticmethods statement facilitates wrapping class static member functions. It has a nested block that can only contain def statements. Like the namespace statement, this statement limits where CLIF looks for the function, i.e. it searches only inside the named class.

from "some/path/my_library.h":
  staticmethods from `Foo`:
    def Bar()
    def Baz()

In that example Foo::Bar and Foo::Baz must be static members of class Foo and will be wrapped as module-level functions some.path.my_library.Bar and some.path.my_library.Baz.

TIP: The C++ class name can be fully qualified.

pass statement

The pass statement allows you to wrap a C++ class without any API. It has two use cases:

  1. A “capsule with memory management”, i.e. allow instance destruction if it was owned by Python.
  2. A derived class that adds nothing to the interface of the base class.

from "some/file.h":

  class Base:
    def SomeApi(self)

  class Derived(Base):
    pass

In that example Derived has the same API as Base, i.e. SomeApi(), but may have a different C++ implementation, which is useful for testing.

use statement

The use statement reassigns a default C++ type for a given CLIF type:

use `std::string` as str

This statement is rarely needed. See more on types below.

Type correspondence

CLIF uses Python types in API descriptions (.clif files). Generally it’s CLIF’s job to find the corresponding C++ types automatically. However, it is common that multiple C++ types are converted to the same Python type, e.g. C++ std::unordered_set and std::set are both converted to the Python set type. In such situations only one of the conversions will work implicitly (this is a limitation of the implementation), while all others need to be specified explicitly, e.g.:

C++:

void pass_unordered_set_int(const std::unordered_set<int>& values);
std::unordered_set<int> return_unordered_set_int();

void pass_set_int(const std::set<int>& values);
std::set<int> return_set_int();

.clif:

def pass_unordered_set_int(values: set<int>)
def return_unordered_set_int() -> set<int>

def pass_set_int(values: `std::set` as set<int>)
def return_set_int() -> `std::set` as set<int>

What works implicitly can be customized with the [use][use] statement.

The syntax for nested types is, e.g.:

C++:

void pass_set_list_int(const std::set<std::list<int>>& clusters);

.clif:

def pass_set_list_int(clusters: `std::set` as set<`std::list` as list<int>>)

Note that the backtick syntax also works for simpler types, e.g.:

C++:

void pass_size_t(std::size_t value);

.clif:

def pass_size_t(value: `std::size_t` as int)

However, in most cases the simpler

def pass_size_t(value: int)

will also work, if there is an implicit C++ conversion (in this example between std::size_t and int).

NOTE: CLIF will reject unknown types and produce an error. It can be a parse-time error for CLIF types or a compile-time error for C++ types.

Predefined types

CLIF knows some basic types (predefined via clif/python/types.h) including:

Default C++ type          CLIF type [5]
int                       int
string                    bytes or str
bool                      bool
double                    float
complex<>                 complex
vector<>                  list<>
pair<>                    tuple<>
unordered_set<>           set<>
unordered_map<>           dict<>
std::function<R(T, U)>    (t: T, u: U) -> R
PyObject*                 object [6]

Note: Default in the header row above means that the C++ type does not have to be specified explicitly in .clif files (unless a use statement changes the default).

CLIF also knows how to handle various other types including:

C++ type                              CLIF type
[u]intXX_t (e.g. int8_t)              int
float                                 float
map                                   dict
set                                   set
list, array, stack                    list
deque, queue, priority_queue          list
const char* (as return value only)    str (bytes is not supported)

Unicode

Please note that we want the C++ API to be explicit and while C++ does not distinguish between bytes and unicode, Python does. It means that Python .clif files must specify what exact type (bytes or unicode) the C++ code expects or produces.

However, CLIF always takes Python unicode and implicitly encodes it using UTF-8 for C++. To get unicode back to Python 2, use unicode as the return datatype. In Python 3, str gets converted to unicode automatically.

That can be summarized as below.

CLIF type    On input    On output CLIF returns
bytes        (*)         bytes
str          (*)         native str
unicode      (*)         unicode

(*) CLIF will take bytes or unicode Python object and pass [UTF-8 encoded] data to C++.
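
The encoding step described in (*) corresponds to the following plain-Python operations. The helper name to_cpp_bytes is hypothetical and only imitates the conversion:

```python
def to_cpp_bytes(value):
    """Imitate the input conversion: str is UTF-8 encoded, bytes pass through."""
    if isinstance(value, str):
        return value.encode("utf-8")
    if isinstance(value, bytes):
        return value
    raise TypeError("expected str or bytes")


data = to_cpp_bytes("naïve")
print(data)                  # UTF-8 bytes as C++ would receive them
print(data.decode("utf-8"))  # decoding restores the original text
```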

Encoding

UTF-8 encoding is assumed on the C++ side.

Footnotes

  1. When exposing a C++ function as __len__, make sure it only returns non-negative numbers, or Python will raise a SystemError.

  2. Except chr, which is already ‘imported’ by CLIF.

  3. A C++11 enum class is converted to Enum, an old-style enum to IntEnum.

  4. https://pypi.python.org/pypi/enum34

  5. CLIF types are named after the corresponding Python types.

  6. Be careful when you use object: CLIF assumes you know what you’re doing with the Python C API and all its caveats.