Google APIs Client Library for C++ | A C++ library for client applications to access Google APIs. |
This document discusses the common abstractions used throughout the Google APIs Client Library for C++ when reading and writing data. It describes the concepts, abstract interface, and concrete component implementations provided. It also explains how to write your own custom readers to obtain data from other sources.
Basic concepts
Reading data and DataReader
A DataReader reads a sequence of bytes in whole or in part. The Google APIs Client Library for C++ uses DataReader to pass non-trivial data into methods, particularly where the data might not naturally reside in a single chunk of memory. Different implementations of DataReader store the data differently, such as in a contiguous memory buffer or on disk.
Reading data from a DataReader incrementally advances through the sequence. Ideally a DataReader is seekable to any offset, forward or backward, in the byte stream by calling the SetOffset method. However, SetOffset is not required or guaranteed to work on all DataReader classes. Most readers, including all the ones provided with the SDK, are normally seekable. Some readers store their data externally, such as on disk, so they might become unseekable if the external device becomes unavailable. If the reader cannot seek its offset back to the beginning, you will only be able to read it once.
If the reader was for a request payload, then the request will fail if it attempts to resend. If the reader was for a response payload, then you will not be able to read the result multiple times.
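To make the idea of an advancing offset and a rewind concrete, here is a minimal sketch. SketchReader and ReadAll are hypothetical stand-ins written for this illustration and are not the library's classes; only the concepts (incremental reads, a done state, and seeking by absolute offset) follow the description above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>

// Hypothetical stand-in for a seekable in-memory reader (not the library class).
class SketchReader {
 public:
  explicit SketchReader(const std::string& data) : data_(data), offset_(0) {}

  // Copies up to max_bytes into storage and advances the offset.
  // Returns the number of bytes actually copied.
  std::int64_t ReadToBuffer(std::int64_t max_bytes, char* storage) {
    assert(storage != nullptr);
    const std::int64_t remaining =
        static_cast<std::int64_t>(data_.size()) - offset_;
    const std::int64_t n = std::min(max_bytes, remaining);
    if (n <= 0) return 0;
    std::memcpy(storage, data_.data() + offset_, static_cast<std::size_t>(n));
    offset_ += n;
    return n;
  }

  // Seeks to an absolute offset, clamped to the valid range.
  // Returns the offset actually reached.
  std::int64_t SetOffset(std::int64_t position) {
    const std::int64_t end = static_cast<std::int64_t>(data_.size());
    offset_ = std::max<std::int64_t>(0, std::min(position, end));
    return offset_;
  }

  bool done() const {
    return offset_ >= static_cast<std::int64_t>(data_.size());
  }
  std::int64_t offset() const { return offset_; }

 private:
  const std::string data_;
  std::int64_t offset_;
};

// Reads everything that remains, a few bytes at a time.
std::string ReadAll(SketchReader* reader) {
  std::string result;
  char buffer[4];
  while (!reader->done()) {
    const std::int64_t n = reader->ReadToBuffer(
        static_cast<std::int64_t>(sizeof(buffer)), buffer);
    result.append(buffer, static_cast<std::size_t>(n));
  }
  return result;
}
```

A reader that could not honor SetOffset would support only the first ReadAll call, which is the single-use situation described above.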
Writing data and DataWriter
A DataWriter is analogous to a DataReader but writes a serialized stream. It is used by HttpRequest to capture the response data. The DataWriter also acts as a DataReader factory, creating a reader that can read back the response. For example, the FileDataWriter will stream the results directly to disk and create a FileDataReader that can stream the data back into the application should it want to access the data.
Readers should be of a constant size. They do not necessarily need to know what that size is, but it should not change as they are being used. Readers do need to identify when they have reached the end of their content.
Managed vs unmanaged
Readers and writers come in two flavors: managed and unmanaged.
A managed reader (or writer) has a callback, a Closure, that it will call when it is destroyed. Normally these closures are used to delete the storage of the data being read, but the callback may do anything. An unmanaged reader (or writer) has a NULL Closure. Unmanaged readers and writers rely on someone else to manage their storage, and they assume it is available over the lifetime of their instance.
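The difference can be sketched with standard C++ alone. The class and factory names below are hypothetical and use std::function in place of the library's Closure type; the sketch only shows the ownership pattern, not the library's implementation.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>

// Hypothetical stand-in: a reader that optionally runs a callback when it
// is destroyed. An empty callback models the unmanaged (NULL Closure) case.
class SketchManagedReader {
 public:
  SketchManagedReader(const std::string* data,
                      std::function<void()> on_destroy)
      : data_(data), on_destroy_(std::move(on_destroy)) {
    assert(data != nullptr);
  }
  ~SketchManagedReader() {
    if (on_destroy_) on_destroy_();
  }
  SketchManagedReader(const SketchManagedReader&) = delete;
  SketchManagedReader& operator=(const SketchManagedReader&) = delete;

  const std::string& data() const { return *data_; }
  bool managed() const { return static_cast<bool>(on_destroy_); }

 private:
  const std::string* data_;
  std::function<void()> on_destroy_;
};

// Managed: the reader takes ownership of a heap string and deletes it
// when the reader itself is destroyed.
SketchManagedReader* NewSketchManagedReader(std::string* owned) {
  return new SketchManagedReader(owned, [owned]() { delete owned; });
}

// Unmanaged: the caller must keep the storage alive for as long as the
// reader exists.
SketchManagedReader* NewSketchUnmanagedReader(const std::string* borrowed) {
  return new SketchManagedReader(borrowed, nullptr);
}
```

The callback is free to do something other than delete storage, for example removing a temporary file or releasing a lock.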
Typical use case examples
The following snippet is applicable to the examples listed below.
#include "googleapis/client/data/data_reader.h"
#include "googleapis/client/data/data_writer.h"
#include "googleapis/client/transport/http_request.h"
#include "googleapis/client/transport/http_response.h"
#include "googleapis/util/util/status.h"

using googleapis_client::HttpRequest;
using googleapis_client::HttpResponse;
using googleapis_client::DataReader;
using googleapis_client::DataWriter;
Providing an in-memory string or byte sequence
The simplest reader copies the data to private memory that it will manage. This function hides the Closure that you would normally need to provide to a managed reader. It is still considered managed because the reader owns its data.
string local_string = "Hello World";
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedInMemoryDataReader(local_string));
If the data you want to read is already in memory, and will remain so over the lifetime of the reader, then you can use an unmanaged reader on it and avoid the data copy.
scoped_ptr<DataReader> reader(
    googleapis_client::NewUnmanagedInMemoryDataReader("Hello World"));
If the data you want to read is in a std::string* that you own, you can pass ownership to the reader and have it manage that string.
string* dynamic_string = new string("Hello World");
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedInMemoryDataReader(dynamic_string));
Providing a disk-based string or byte sequence
The following example creates a DataReader that references the data in a file.
// Will delete the path when we delete the reader by calling
// bool File::Delete(string&). Note that we don't care about the
// result. This existing function just happens to have one.
extern void DeleteFile(const string& path);
Closure* closure = NewCallback(&DeleteFile, path);
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedFileDataReader(path, closure));
Aggregating non-contiguous content
The following example demonstrates a reader that presents a sequence of bytes across different sources. In this particular example, it is coming from fragmented memory. It could just as easily be joining different types of readers together. The consumer will not distinguish the boundaries among the different fragments.
vector<DataReader*>* parts = new vector<DataReader*>;
parts->push_back(
    googleapis_client::NewUnmanagedInMemoryDataReader("Hello"));
parts->push_back(googleapis_client::NewUnmanagedInMemoryDataReader(" "));
parts->push_back(
    googleapis_client::NewUnmanagedInMemoryDataReader("World"));
Closure* closure = NewCompositeReaderListAndContainerDeleter(parts);
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedCompositeDataReader(*parts, closure));
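To show why the consumer does not see fragment boundaries, the following sketch reads across several in-memory fragments with one call. SketchCompositeReader is a hypothetical class written for this illustration; it holds plain strings rather than DataReader instances and does not reflect how the library implements its composite reader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a composite reader over in-memory fragments.
class SketchCompositeReader {
 public:
  explicit SketchCompositeReader(std::vector<std::string> parts)
      : parts_(std::move(parts)), part_index_(0), part_offset_(0) {}

  // Appends up to max_bytes to *out, moving on to the next fragment
  // whenever the current one is exhausted. Returns the bytes appended.
  std::size_t ReadToString(std::size_t max_bytes, std::string* out) {
    assert(out != nullptr);
    std::size_t total = 0;
    while (total < max_bytes && part_index_ < parts_.size()) {
      const std::string& part = parts_[part_index_];
      const std::size_t n =
          std::min(max_bytes - total, part.size() - part_offset_);
      out->append(part, part_offset_, n);
      part_offset_ += n;
      total += n;
      if (part_offset_ == part.size()) {
        ++part_index_;  // Current fragment is used up.
        part_offset_ = 0;
      }
    }
    return total;
  }

  bool done() const { return part_index_ >= parts_.size(); }

 private:
  std::vector<std::string> parts_;
  std::size_t part_index_;   // Fragment currently being read.
  std::size_t part_offset_;  // Position within that fragment.
};
```

A read of 7 bytes over the fragments "Hello", " ", and "World" spans all three and yields "Hello W"; the caller only ever sees one continuous sequence.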
Available DataReader classes
Data readers are created using free functions defined in data_reader.h. This section introduces the different types of available readers. Each comes in both managed and unmanaged flavors.
All the DataReaders listed below, other than the InvalidDataReader, are seekable.
- InvalidDataReader
- An InvalidDataReader always fails. It is used in places where a DataReader may be needed but there is no valid data to provide, for example when there was an error obtaining the data. The factory functions may in fact return an InvalidDataReader if they would otherwise fail. If you need to distinguish, check the error() status.
- InMemoryDataReader
- An InMemoryDataReader is a DataReader with an in-memory buffer, including strings. Note that strings can be used even for binary data. InMemoryDataReader can always SetOffset.
- FileDataReader
- A FileDataReader is a DataReader that reads from files. This is probably a better choice than the generic IStreamDataReader if the source is in fact a file. FileDataReader can be expected to SetOffset unless there is some kind of external failure, such as the file being corrupted or an OS-level failure.
- CompositeDataReader
- A CompositeDataReader is a DataReader that aggregates other data readers into a single byte sequence. It is useful for joining fragments together into a single stream. CompositeDataReaders will SetOffset as long as the readers they reference can SetOffset as well.
- IStreamDataReader
- An IStreamDataReader is a DataReader that wraps a std::istream C++ stream. IStreamDataReaders can only SetOffset as reliably as the underlying std::istream can seek to the desired offset.
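Whether a stream-backed reader can seek comes down to whether the underlying std::istream can. The sketch below uses only the standard library to probe that; TrySeek and ReadRemainder are hypothetical helper names and are not part of the client library.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>

// Tries to reposition a std::istream to an absolute offset and reports
// whether the seek succeeded.
bool TrySeek(std::istream* stream, std::streamoff offset) {
  assert(stream != nullptr);
  stream->clear();  // Drop eofbit/failbit left behind by an earlier read.
  stream->seekg(offset, std::ios::beg);
  return !stream->fail();
}

// Reads whatever remains in the stream from its current position.
std::string ReadRemainder(std::istream* stream) {
  assert(stream != nullptr);
  std::ostringstream out;
  out << stream->rdbuf();
  return out.str();
}
```

A std::istringstream or a regular file stream will normally seek successfully, while a stream attached to a pipe or terminal typically will not, so a reader wrapping the latter should be treated as single-use.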
Adding new types of data readers
To write your own specialized reader you must minimally subclass DataReader and add the following:
- Provide an implementation of DoReadToBuffer that reads from your data source. The starting position will be the offset() attribute, which is automatically managed by the base class. Your implementation may read fewer than the requested number of bytes per invocation, or run out of data. Be sure to call set_done when it finishes or set_status if an error is encountered. Returning 0 does not imply being done or having an error.
- Optionally (but recommended) provide an implementation of DoSetOffset, which permits the reader to start reading from your source at the specified offset. You should not seek past the end, however. When you hit the end, return the ending position even if it is less than the position asked for. If you make your class reliably seekable, also override the seekable() method to indicate that. If you leave the base class method in place, attempts to SetOffset will fail, making your reader single-use.
- Optionally call set_total_length in the constructor. This will be the result of TotalLengthIfKnown, which can be helpful. It is not strictly required.
- Provide Managed and Unmanaged free functions to create your reader. Having these free functions is not required, but it is the convention we recommend for consistency. If your factory method cannot create your reader as requested, you can return a New*InvalidDataReader, using either the managed or unmanaged variant as appropriate.
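The shape of such a subclass can be sketched as follows. SketchDataReader is a deliberately simplified, hypothetical base class written so the example compiles on its own; the real DataReader has more state (status, total length, seekable()) and its exact method signatures should be taken from data_reader.h rather than from this sketch. RepeatedByteReader is likewise an invented example source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical, heavily simplified stand-in for the library's base class.
class SketchDataReader {
 public:
  virtual ~SketchDataReader() {}
  std::int64_t offset() const { return offset_; }
  bool done() const { return done_; }

  std::int64_t ReadToBuffer(std::int64_t max_bytes, char* storage) {
    assert(storage != nullptr);
    if (done_ || max_bytes <= 0) return 0;
    const std::int64_t n = DoReadToBuffer(max_bytes, storage);
    offset_ += n;  // The base class advances the offset for the subclass.
    return n;
  }

  // Returns false if the subclass cannot seek.
  bool SetOffset(std::int64_t position) {
    const std::int64_t reached = DoSetOffset(position);
    if (reached < 0) return false;
    offset_ = reached;
    done_ = false;
    return true;
  }

 protected:
  void set_done(bool done) { done_ = done; }
  virtual std::int64_t DoReadToBuffer(std::int64_t max_bytes,
                                      char* storage) = 0;
  // Assumption for this sketch: -1 means "not seekable" (the default).
  virtual std::int64_t DoSetOffset(std::int64_t) { return -1; }

 private:
  std::int64_t offset_ = 0;
  bool done_ = false;
};

// Example subclass: yields `length` copies of one byte without storing them.
class RepeatedByteReader : public SketchDataReader {
 public:
  RepeatedByteReader(char value, std::int64_t length)
      : value_(value), length_(length) {}

 protected:
  std::int64_t DoReadToBuffer(std::int64_t max_bytes,
                              char* storage) override {
    const std::int64_t n = std::min(max_bytes, length_ - offset());
    if (n > 0) std::memset(storage, value_, static_cast<std::size_t>(n));
    if (offset() + n >= length_) set_done(true);  // Source is exhausted.
    return n;
  }

  std::int64_t DoSetOffset(std::int64_t position) override {
    // Never seek past the end; report the position actually reached.
    return std::max<std::int64_t>(0, std::min(position, length_));
  }

 private:
  const char value_;
  const std::int64_t length_;
};
```

Note how the subclass marks itself done explicitly instead of relying on a zero-byte return, and how seeking past the end lands on the end position.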
Available DataWriter classes
Data writers are created using free functions defined in data_writer.h. This section introduces the different types of available writers. Each comes in both managed and unmanaged flavors.
- FileDataWriter
- A FileDataWriter is a DataWriter that stores the byte sequence in a file. These writers create a FileDataReader to provide access to read back the data.
- StringDataWriter
- A StringDataWriter is a DataWriter that stores the byte sequence in an in-memory std::string. If you supply the string to write into, use string.data() rather than string.c_str() when accessing the data written, because the writer may be producing binary data rather than text. These writers create an InMemoryDataReader to provide access to read back the data.
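The practical risk with binary data is treating the buffer as a NUL-terminated C string. Since C++11, data() and c_str() return the same pointer, so what matters is pairing that pointer with size() instead of stopping at the first zero byte. The two helper functions below are hypothetical and use only the standard library.

```cpp
#include <cstring>
#include <string>

// Copies the written bytes correctly by pairing data() with size().
std::string CopyAllBytes(const std::string& written) {
  return std::string(written.data(), written.size());
}

// Treats the buffer as a C string, so the copy stops at the first NUL byte.
std::string CopyAsCString(const std::string& written) {
  return std::string(written.c_str());
}
```

For a five-byte payload whose third byte is zero, CopyAllBytes preserves all five bytes while CopyAsCString keeps only the first two.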
Adding new types of data writer
To write your own specialized writer you must minimally subclass DataWriter and add the following:
- Implement DoWrite to append bytes incrementally to your storage.
- Implement DoNewDataReader to create a DataReader that can read the bytes back from your storage. If the method is passed a non-NULL Closure then return a managed reader using the provided closure.
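As a rough illustration of those two hooks, the sketch below pairs a writer with the reader it hands back. Every class here is a hypothetical stand-in so that the example is self-contained; the real DataWriter signatures, status handling, and the Closure parameter to DoNewDataReader are omitted and should be taken from data_writer.h.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

// Hypothetical minimal reader used only to read back what a writer stored.
class SketchStringReader {
 public:
  explicit SketchStringReader(const std::string* storage)
      : storage_(storage), offset_(0) {}

  std::string RemainderToString() {
    std::string rest = storage_->substr(offset_);
    offset_ = storage_->size();
    return rest;
  }

 private:
  const std::string* storage_;  // Borrowed: the writer must outlive the reader.
  std::size_t offset_;
};

// Hypothetical, simplified stand-in for the writer base class.
class SketchDataWriter {
 public:
  virtual ~SketchDataWriter() {}

  bool Write(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    if (!DoWrite(data, size)) return false;
    size_ += size;  // The base class tracks how much has been written.
    return true;
  }
  std::size_t size() const { return size_; }

  std::unique_ptr<SketchStringReader> NewDataReader() {
    return DoNewDataReader();
  }

 protected:
  virtual bool DoWrite(const char* data, std::size_t size) = 0;
  virtual std::unique_ptr<SketchStringReader> DoNewDataReader() = 0;

 private:
  std::size_t size_ = 0;
};

// Example subclass: appends to an in-memory string.
class SketchStringWriter : public SketchDataWriter {
 protected:
  bool DoWrite(const char* data, std::size_t size) override {
    if (size > 0) storage_.append(data, size);
    return true;
  }
  std::unique_ptr<SketchStringReader> DoNewDataReader() override {
    return std::unique_ptr<SketchStringReader>(
        new SketchStringReader(&storage_));
  }

 private:
  std::string storage_;
};
```

The reader here borrows the writer's storage, which corresponds to the unmanaged case; a managed variant would attach a callback that releases the storage when the reader is destroyed.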