proto-lens

API for protocol buffers using modern Haskell language and library patterns.

proto-lens-tutorial

Table of Contents

  1. Message Generation
  2. Oneof Generation
  3. Enum Generation
  4. Field Overloading
  5. Any
  6. Repeated
  7. Map
  8. Lens Laws
  9. Example: Person
  10. Example: Coffee Order

Message Generation

Messages defined in a .proto file are generated as Haskell records, along with instances of various typeclasses that make them more ergonomic to use in code.

A message may be defined in a file foo.proto:

syntax="proto3";

message Bar {
  int32 baz = 1;
  string bippy = 2;
}

This will generate a Proto.Foo module with a Bar record containing the fields _Bar'baz and _Bar'bippy:

data Bar = Bar
  { _Bar'baz   :: !Data.Int.Int32
  , _Bar'bippy :: !Data.Text.Text
  , _Bar'_unknownFields :: !Data.ProtoLens.FieldSet
  }
  deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

Notice _Bar'_unknownFields :: !Data.ProtoLens.FieldSet. It stores fields that are not recognized during deserialization (for example, if the message was generated by a newer version of the .proto file). Those fields will be included if the message is later serialized.

Instances generated are:

Oneof Generation

A oneof group is generated as a single field in the record. The field is wrapped in a Maybe, which is Nothing when none of the cases is present (or when the case that is present comes from a later version of the proto).

A message with a oneof may be defined in a file foo.proto:

syntax="proto3";

message Foo {
  oneof bar {
    int32 baz = 1;
    string bippy = 2;
  }
}

This will generate a Proto.Foo module with a Foo record containing the field _Foo'bar and a coproduct Foo'Bar with constructors Foo'Baz and Foo'Bippy. On top of this, Prism' functions will be generated for the sum type, in this case Foo'Bar, one for each oneof field:

data Foo = Foo
  { _Foo'bar :: !(Prelude.Maybe Foo'Bar)
  , _Foo'_unknownFields :: !Data.ProtoLens.FieldSet
  }
  deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

data Foo'Bar = Foo'Baz !Data.Int.Int32
             | Foo'Bippy !Data.Text.Text
             deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

_Foo'Baz :: Data.ProtoLens.Prism.Prism' Foo'Bar Data.Int.Int32
_Foo'Baz
 = Data.ProtoLens.Prism.prism' Foo'Baz
     (\ p__ ->
        case p__ of
            Foo'Baz p__val -> Prelude.Just p__val
            _otherwise -> Prelude.Nothing)

_Foo'Bippy :: Data.ProtoLens.Prism.Prism' Foo'Bar Data.Text.Text
_Foo'Bippy
 = Data.ProtoLens.Prism.prism' Foo'Bippy
     (\ p__ ->
        case p__ of
            Foo'Bippy p__val -> Prelude.Just p__val
            _otherwise -> Prelude.Nothing)

The Prism' functions allow us to succinctly focus on one branch of the sum type for our Message, for example:

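import Data.Int             (Int32)
import Data.ProtoLens       (defMessage)
import Data.ProtoLens.Prism
import Data.Text            (Text)
import Lens.Micro           ((&), (.~), (?~), (^?))
import Proto.Foo
import Proto.Foo_Fields     (maybe'bar)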

accessBaz :: Foo -> Maybe Int32
accessBaz foo = foo
             ^? maybe'bar -- We want to look at the 'bar' oneof field
              . _Just     -- We only care if this value is set with a `Just`
              . _Foo'Baz  -- Focus on the 'baz' branch of our sum type

-- | Creates a 'Foo' with an incoming 'Int32'
createFoo :: Int32 -> Foo
createFoo i = defMessage & maybe'bar .~ (_Just # _Foo'Baz # i)

-- | Sets a new `bippy` value
updateFoo :: Text -> Foo -> Foo
updateFoo s foo = foo & maybe'bar ?~ (_Foo'Bippy # s)

Our previously mentioned instances are generated but we will note the following about HasField:

Enum Generation

Enums defined in a .proto file are generated as Haskell coproducts, along with instances of various typeclasses that make them more ergonomic to use in code.

An enum may be defined in a file foo.proto:

syntax="proto3";

enum Baz {
  BAZ1 = 0;
  BAZ2 = 1;
}

This will generate a Proto.Foo module with a Baz coproduct containing three constructors:

data Baz = BAZ1
         | BAZ2
         | Baz'Unrecognized !Baz'UnrecognizedValue
         deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

The Baz'Unrecognized constructor will be created during deserialization if the enum field is set to a numeric value that is not listed for that enum in the .proto file. For example, given the case above, if the user calls decodeMessage and it encounters the numeric value 2, the value set for Baz would be Baz'Unrecognized (Baz'UnrecognizedValue 2).
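The fallback behaviour can be sketched without the generated code. The following is a hand-written model rather than proto-lens output: toBaz and fromBaz are hypothetical helpers, and a plain Int32 payload stands in for Baz'UnrecognizedValue.

```haskell
import Data.Int (Int32)

-- Hand-written stand-in for the generated enum. In the real generated
-- code, Baz'Unrecognized wraps a Baz'UnrecognizedValue, not a bare Int32.
data Baz
  = BAZ1
  | BAZ2
  | Baz'Unrecognized !Int32
  deriving (Show, Eq, Ord)

-- Hypothetical helper: map a wire value to a constructor, keeping
-- unknown numbers instead of failing.
toBaz :: Int32 -> Baz
toBaz 0 = BAZ1
toBaz 1 = BAZ2
toBaz n = Baz'Unrecognized n

-- Hypothetical inverse, so an unrecognized value survives a round trip.
fromBaz :: Baz -> Int32
fromBaz BAZ1 = 0
fromBaz BAZ2 = 1
fromBaz (Baz'Unrecognized n) = n

main :: IO ()
main = do
  print (map toBaz [0, 1, 2]) -- [BAZ1,BAZ2,Baz'Unrecognized 2]
  print (fromBaz (toBaz 2))   -- 2
```

Keeping the unknown number, rather than rejecting it, is what lets a message written against a newer .proto file pass through older code unchanged.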

When using proto2 syntax there are a few notes to remember when using enum data:

When using proto3 syntax it is important to remember that the first enum value must be zero.

Instances generated are:

Field Overloading

When we look at having the message:

syntax="proto3";

message Bar {
  int32 baz = 1;
  string bippy = 2;
}

we said that baz and bippy accessors are created via HasField instances. If we add a further message into the mix such as:

message Foo {
  string baz = 1;
}

we can see that baz is common to both Bar and Foo. The difference will be that the instances for HasField will be:

instance HasField Foo "baz" (Data.Text.Text)

instance HasField Bar "baz" (Data.Int.Int32)

The fields are overloaded on the symbol baz but connect Foo to Text and Bar to Int32. Then we can find that there is one, polymorphic definition in the Foo_Fields.hs file:

baz :: HasField s "baz" a => Lens' s a
baz = Data.ProtoLens.Field.field @"baz"

If records from other modules also contain a baz field, these lenses can be used to access them as well. In these cases we should take care to import only one version of baz, otherwise name clashes will occur.

The use of baz can be done in three ways; which way you choose is up to you and your style.

OverloadedLabels

The first method is by using the OverloadedLabels extension and importing the orphan instance of IsLabel for the Lens type from Data.ProtoLens.Labels. That gives us the use of # for prefixing our field accessors.

{-# LANGUAGE OverloadedLabels  #-}
{-# LANGUAGE OverloadedStrings #-}

import Data.ProtoLens        (defMessage)
import Data.ProtoLens.Labels ()
import Lens.Micro            ((&), (.~), (^.))
import Proto.Foo             as P

myBar :: P.Bar
myBar = defMessage
            & #baz   .~ 42
            & #bippy .~ "querty"

main :: IO ()
main = print $ myBar ^. #bippy

The field function

The second method uses the TypeApplications extension and the function Data.ProtoLens.Field.field. It leads to slightly more noisy syntax, but has the advantage of not using orphan instances.

{-# LANGUAGE DataKinds         #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TypeApplications  #-}

import Data.ProtoLens       (defMessage)
import Data.ProtoLens.Field (field)
import Lens.Micro           ((&), (.~), (^.))
import Proto.Foo            as P

myBar :: P.Bar
myBar = defMessage
            & field @"baz"   .~ 42
            & field @"bippy" .~ "querty"

main :: IO ()
main = print $ myBar ^. field @"bippy"

The *_Fields.hs module

The last method is by importing the *_Fields.hs module, for example:

{-# LANGUAGE OverloadedStrings #-}

import Data.ProtoLens   (defMessage)
import Lens.Micro       ((&), (.~), (^.))
import Proto.Foo        as P
import Proto.Foo_Fields as P

myBar :: P.Bar
myBar = defMessage
            & P.baz   .~ 42
            & P.bippy .~ "querty"

main :: IO ()
main = print $ myBar ^. P.bippy

This approach is less flexible, since it may require manual adjustment when the same name is defined in two different .proto files, and thus exported by both of their *_Fields modules. If that happens, you can resolve the conflict by importing the definition from exactly one of the modules, and using that name with both of their types. For example:

{-# LANGUAGE OverloadedStrings #-}

import Data.ProtoLens      (defMessage)
import Lens.Micro          ((&), (.~), (^.))
import Proto.Foo           as P
import Proto.Other         as P
import Proto.Foo_Fields    (baz)
import Proto.Other_Fields  (bippy)

myBar :: P.Bar
myBar = defMessage
            & baz   .~ 42
            -- Note: field identifiers from one proto module are compatible
            -- with the types in any other one.
            & bippy .~ "querty"

myOther :: P.Other
myOther = defMessage & bippy .~ 42

main :: IO ()
main = do
    print (myBar ^. bippy)
    print (myOther ^. bippy)

Any

An Any field stands for any arbitrary message and thus is represented by an arbitrary blob of bytes. We can see this as a placeholder for any user-defined message, where the message becomes concrete when we unpack it to some message type we have chosen. There are two utility functions: one for packing any Message a into an Any, and its dual for unpacking an Any into a Message a. These functions are called pack and unpack respectively, and their type signatures are below:

pack :: forall a. Message a => a -> Any
unpack :: forall a. Message a => Any -> Either UnpackError a

The Any type and its utility functions are provided by the Data.ProtoLens.Any module in the proto-lens-protobuf-types package.
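As a rough illustration of the round trip, the sketch below packs a message and unpacks it again. It assumes the proto-lens, proto-lens-protobuf-types and microlens packages are available, and it uses the Timestamp and Duration messages shipped with proto-lens-protobuf-types purely as convenient examples; stamp is a name introduced here.

```haskell
import Data.ProtoLens (defMessage)
import Data.ProtoLens.Any (UnpackError, pack, unpack)
import Lens.Micro ((&), (.~), (^.))
import Proto.Google.Protobuf.Duration (Duration)
import Proto.Google.Protobuf.Timestamp (Timestamp)
import Proto.Google.Protobuf.Timestamp_Fields (seconds)

-- Example message (assumption: any generated message type would do).
stamp :: Timestamp
stamp = defMessage & seconds .~ 1500000000

main :: IO ()
main = do
  let packed = pack stamp
  -- Unpacking at the type that was packed should succeed.
  case unpack packed :: Either UnpackError Timestamp of
    Right t -> putStrLn ("seconds: " ++ show (t ^. seconds))
    Left _  -> putStrLn "could not unpack as Timestamp"
  -- Unpacking at a different message type is expected to give a Left.
  case unpack packed :: Either UnpackError Duration of
    Right _ -> putStrLn "unexpectedly unpacked as Duration"
    Left _  -> putStrLn "not a Duration"
```

The type annotation on unpack is what selects the target message, since the Any itself only carries bytes and a type URL.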

Further information on Any and how it works in the protocol can be found in the official documentation.

Repeated

repeated fields signify that the type of the field is a list of values, naturally fitting to the [a] type in Haskell. For example:

message Foo {
  repeated int32 a = 1;
  repeated int32 b = 2 [packed=true];
}

generates:

data Foo = Foo
  { _Foo'a :: ![Data.Int.Int32]
  , _Foo'b :: ![Data.Int.Int32]
  , _Foo'_unknownFields :: !Data.ProtoLens.FieldSet
  }
  deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

Map

map fields signify that the type of the field is a mapping from one value to another, naturally fitting to the Data.Map.Map a b type in Haskell. For example:

message Foo {
  map<int32, string> bar = 1;
}

generates:

data Foo = Foo
  { _Foo'bar :: !(Data.Map.Map Data.Int.Int32 Data.Text.Text)
  , _Foo'_unknownFields :: !Data.ProtoLens.FieldSet
  }
  deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

data Foo'BarEntry = Foo'BarEntry
  { _Foo'BarEntry'key :: !Data.Int.Int32
  , _Foo'BarEntry'value :: !Data.Text.Text
  , _Foo'BarEntry'_unknownFields :: !Data.ProtoLens.FieldSet
  }
  deriving (Prelude.Show, Prelude.Eq, Prelude.Ord)

Foo'BarEntry is generated for backwards compatibility, so we can ignore this generated code and focus on the fact that we can treat this data as a regular Haskell Map.

Encode/Decode and Show/Read

Data.ProtoLens provides utilities for converting messages to and from ByteString and text values. For ByteString we are provided the encodeMessage and decodeMessage functions, which encode to and decode from the wire format. For text we are provided the showMessage and readMessage functions, which produce and parse a human-readable representation; note that showMessage returns a String while readMessage takes a Text. The type signatures for these functions are given below:

encodeMessage :: Message msg => msg        -> ByteString
decodeMessage :: Message msg => ByteString -> Either String msg

showMessage :: Message msg => msg  -> String
readMessage :: Message msg => Text -> Either String msg
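A short round trip through the wire format might look like the sketch below. It assumes the proto-lens, proto-lens-protobuf-types, microlens and bytestring packages, and borrows the Timestamp message from proto-lens-protobuf-types as a stand-in for your own generated type; stamp is a name introduced here.

```haskell
import qualified Data.ByteString as BS
import Data.ProtoLens (decodeMessage, defMessage, encodeMessage, showMessage)
import Lens.Micro ((&), (.~))
import Proto.Google.Protobuf.Timestamp (Timestamp)
import Proto.Google.Protobuf.Timestamp_Fields (nanos, seconds)

-- Example message (assumption: any generated message type would do).
stamp :: Timestamp
stamp = defMessage
          & seconds .~ 42
          & nanos   .~ 7

main :: IO ()
main = do
  -- Serialize to the binary wire format.
  let bytes = encodeMessage stamp
  putStrLn ("encoded " ++ show (BS.length bytes) ++ " bytes")
  -- Parse the bytes back; the annotation picks the message type.
  case decodeMessage bytes :: Either String Timestamp of
    Left err      -> putStrLn ("decode failed: " ++ err)
    Right decoded -> do
      putStrLn (showMessage decoded)  -- human-readable text form
      print (decoded == stamp)        -- the round trip should preserve the value
```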

Lens Laws

Underneath there is a function that is used for creating lenses:

Data.ProtoLens.maybeLens :: a -> Lens' (Maybe a) a

We should note that maybeLens does not satisfy the lens laws, which expect that:

set l (view l x) x == x

An example of an offending case, where the result is Just 'a' rather than the original Nothing, is:

set (maybeLens 'a') (view (maybeLens 'a') Nothing) Nothing == Just 'a'

However, this is the behavior generally expected by users, and only matters if we’re explicitly checking whether a field is set.
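The violation can be reproduced with nothing but base. The sketch below re-implements a lens with the same type as maybeLens under the name maybeLensSketch, together with minimal view and set functions; all three are written here for illustration and are not the library's own definitions.

```haskell
{-# LANGUAGE RankNTypes #-}

import Data.Functor.Const (Const (..))
import Data.Functor.Identity (Identity (..))
import Data.Maybe (fromMaybe)

-- Van Laarhoven lens, the representation used by microlens and lens.
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s

-- Illustrative stand-in for maybeLens: reading a Nothing yields the
-- default, and writing always produces a Just.
maybeLensSketch :: a -> Lens' (Maybe a) a
maybeLensSketch def f = fmap Just . f . fromMaybe def

-- Minimal getter and setter for the lens type above.
view :: Lens' s a -> s -> a
view l = getConst . l Const

set :: Lens' s a -> a -> s -> s
set l a = runIdentity . l (const (Identity a))

main :: IO ()
main = do
  -- Read the focus out of Nothing, then write the same value back.
  let roundTrip =
        set (maybeLensSketch 'a') (view (maybeLensSketch 'a') Nothing) Nothing
  print roundTrip              -- Just 'a'
  print (roundTrip == Nothing) -- False: the law would require Nothing here
```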

Another pitfall is that, when interacting with oneof fields, it is possible to clear existing values. For example, if we have the following proto:

message Foo {
  oneof bar {
    int32 baz = 1;
    string bippy = 2;
  }
}

we can end up doing the following:

fooVal :: P.Foo
fooVal = defMessage & P.maybe'baz ?~ 42

fooVal' :: P.Foo
fooVal' = fooVal & P.maybe'bippy .~ Nothing

main :: IO ()
main = do
  print fooVal  -- outputs: "{ bar: 42 }"
  print fooVal' -- outputs: "{}"

We have cleared the previously set Just (Foo'Baz 42) value by doing P.maybe'bippy .~ Nothing. To avoid this, it is best to organise your code around the Prism' functions for oneof fields instead.
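The difference between the two styles can be modelled with plain functions. This is a simplified sketch, not the generated code: setMaybeBippy imitates writing through maybe'bippy as described above, overBippy imitates an update that goes through a prism for the bippy branch, and String is used in place of Text to keep the example dependent on base only. Both helper names are hypothetical.

```haskell
import Data.Int (Int32)

-- Hand-written copy of the oneof sum type (String instead of Text).
data Foo'Bar
  = Foo'Baz !Int32
  | Foo'Bippy !String
  deriving (Show, Eq)

-- Model of `maybe'bippy .~ new`: the whole oneof slot is overwritten,
-- whichever branch was stored before.
setMaybeBippy :: Maybe String -> Maybe Foo'Bar -> Maybe Foo'Bar
setMaybeBippy new _old = fmap Foo'Bippy new

-- Model of a prism-based update: only a value that is already on the
-- bippy branch is touched; anything else is left alone.
overBippy :: (String -> String) -> Maybe Foo'Bar -> Maybe Foo'Bar
overBippy f (Just (Foo'Bippy s)) = Just (Foo'Bippy (f s))
overBippy _ other                = other

main :: IO ()
main = do
  let current = Just (Foo'Baz 42)
  print (setMaybeBippy Nothing current)              -- Nothing
  print (overBippy (++ "!") current)                 -- Just (Foo'Baz 42)
  print (overBippy (++ "!") (Just (Foo'Bippy "hi"))) -- Just (Foo'Bippy "hi!")
```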