Google APIs Client Library for C++
A C++ library for client applications to access Google APIs.
This document discusses the common abstractions used throughout the Google APIs Client Library for C++ when reading and writing data. It describes the concepts, abstract interface, and concrete component implementations provided. It also explains how to write your own custom readers to obtain data from other sources.
Basic concepts
Reading data and DataReader
A DataReader reads a sequence of bytes in whole or part.
The Google APIs Client Library for C++ uses DataReader to
pass non-trivial data into methods, particularly where the data might not
naturally reside in a chunk of memory. There are different
implementations of DataReader that store the data
differently, such as in a continuous memory buffer or on disk.
Reading data from a DataReader incrementally advances
through the sequence. Ideally a DataReader is seekable
to any offset, forward or backward, in the byte stream by calling
the SetOffset method. However, SetOffset is not required or
guaranteed to work on all DataReader classes. Most readers,
including all the ones provided with the SDK, are normally seekable.
Some readers store their data externally, such as on disk, so might
become unseekable if the external device becomes unavailable. If the
reader cannot seek its offset back to the beginning, you will only be
able to read it once.
If the reader was for a request payload then the request
will fail if it attempts to resend. If the reader was for a response
payload then you will not be able to read the result multiple times.
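The difference between a seekable and a single-use reader can be illustrated with a small, self-contained sketch. ToyReader and ReadRemaining below are hypothetical stand-ins written for this illustration; they are not the library's DataReader and make no claim about its actual signatures.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>

// Hypothetical stand-in for the reader concept; not the library's DataReader.
class ToyReader {
 public:
  ToyReader(std::string data, bool seekable)
      : data_(std::move(data)), seekable_(seekable) {}

  bool done() const { return offset_ >= data_.size(); }
  std::size_t offset() const { return offset_; }

  // Copies up to max_bytes into buffer and advances the offset.
  std::size_t ReadToBuffer(std::size_t max_bytes, char* buffer) {
    const std::size_t n = std::min(max_bytes, data_.size() - offset_);
    std::memcpy(buffer, data_.data() + offset_, n);
    offset_ += n;
    return n;
  }

  // Returns false when this reader cannot reposition itself.
  bool SetOffset(std::size_t position) {
    if (!seekable_) return false;
    offset_ = std::min(position, data_.size());
    return true;
  }

 private:
  std::string data_;
  bool seekable_;
  std::size_t offset_ = 0;
};

// Drains whatever the reader has left, four bytes at a time.
std::string ReadRemaining(ToyReader* reader) {
  assert(reader != nullptr);
  std::string out;
  char chunk[4];
  while (!reader->done()) {
    const std::size_t n = reader->ReadToBuffer(sizeof(chunk), chunk);
    out.append(chunk, n);
  }
  return out;
}
```

A seekable ToyReader can be rewound with SetOffset(0) and drained again, which is what a resend of a request payload relies on. A non-seekable one yields its bytes once and then stays empty.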
Writing data and DataWriter
A DataWriter is analogous to a DataReader
but writes a serialized stream. It is used by HttpRequest
to capture the response data. The DataWriter acts as a
DataReader factory, creating a reader that can read back
the response. For example the FileDataWriter will stream
the results directly to disk and create a FileDataReader
that can stream the data back into the application should it want to
access the data.
Readers should be of a constant size. They do not necessarily need to know what that size is, but it should not change as they are being used. Readers do need to identify when they have reached the end of their content.
Managed vs unmanaged
Readers and writers come in two flavors: managed and unmanaged.
A managed reader (or writer) has a
callback that it will call when it is destroyed. Normally these
closures are used to delete the storage of the data being read, but
the callback may do anything. An unmanaged reader (or writer)
has a NULL Closure. Unmanaged readers/writers
rely on someone else to manage their storage, and they assume it is
available over the lifetime of their instance.
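The managed and unmanaged distinction comes down to whether a callback runs at destruction. The following is a minimal sketch of that idea, assuming std::function as a stand-in for the library's Closure; ToyStringReader, NewManagedToyReader, and NewUnmanagedToyReader are hypothetical names that do not exist in the library.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration; std::function stands in for the library's Closure.
class ToyStringReader {
 public:
  // An empty on_destroy models an unmanaged reader.
  ToyStringReader(const std::string* data, std::function<void()> on_destroy)
      : data_(data), on_destroy_(std::move(on_destroy)) {
    assert(data_ != nullptr);
  }
  // A managed reader invokes its callback exactly once, at destruction.
  ~ToyStringReader() {
    if (on_destroy_) on_destroy_();
  }
  ToyStringReader(const ToyStringReader&) = delete;
  ToyStringReader& operator=(const ToyStringReader&) = delete;

  const std::string& contents() const { return *data_; }

 private:
  const std::string* data_;
  std::function<void()> on_destroy_;
};

// Managed: the reader frees the heap-allocated string when it is destroyed.
std::unique_ptr<ToyStringReader> NewManagedToyReader(std::string* owned) {
  return std::unique_ptr<ToyStringReader>(
      new ToyStringReader(owned, [owned] { delete owned; }));
}

// Unmanaged: the caller must keep *borrowed alive for the reader's lifetime.
std::unique_ptr<ToyStringReader> NewUnmanagedToyReader(
    const std::string* borrowed) {
  return std::unique_ptr<ToyStringReader>(
      new ToyStringReader(borrowed, nullptr));
}
```

The callback in this sketch frees storage, but as noted above it may do anything, such as deleting a temporary file or updating a counter.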
Typical use case examples
The following snippet is applicable to the examples listed below.
#include "googleapis/client/data/data_reader.h"
#include "googleapis/client/data/data_writer.h"
#include "googleapis/client/transport/http_request.h"
#include "googleapis/client/transport/http_response.h"
#include "googleapis/util/util/status.h"

using googleapis_client::HttpRequest;
using googleapis_client::HttpResponse;
using googleapis_client::DataReader;
using googleapis_client::DataWriter;
Providing an in-memory string or byte sequence
The simplest reader copies the data to private memory that it will manage.
This function hides the Closure that you would normally need
to provide to a managed reader. It is still considered managed because
the reader owns its data.
string local_string = "Hello World";
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedInMemoryDataReader(local_string));
If the data you want to read is already in memory, and will remain so over the lifetime of the reader, then you can use an unmanaged reader on it and avoid the data copy.
scoped_ptr<DataReader> reader(
googleapis_client::NewUnmanagedInMemoryDataReader("Hello World"));
If the data you want to read is in a std::string* that you own,
you can pass ownership to the reader and have it manage that string.
string* dynamic_string = new string("Hello World");
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedInMemoryDataReader(dynamic_string));
Providing a disk-based string or byte sequence
The following example creates a DataReader that references
the data in a file.
// The closure deletes the file at 'path' when the reader is deleted.
// DeleteFile stands for an existing function such as
// bool File::Delete(string&); any result it returns is ignored.
extern void DeleteFile(const string& path);
Closure* closure = NewCallback(&DeleteFile, path);
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedFileDataReader(path, closure));
Aggregating non-contiguous content
The following example demonstrates a reader that presents a sequence of bytes across different sources. In this particular example, it is coming from fragmented memory. It could just as easily be joining different types of readers together. The consumer will not distinguish the boundaries among the different fragments.
vector<DataReader*>* parts = new vector<DataReader*>;
parts->push_back(
    googleapis_client::NewUnmanagedInMemoryDataReader("Hello"));
parts->push_back(googleapis_client::NewUnmanagedInMemoryDataReader(" "));
parts->push_back(
    googleapis_client::NewUnmanagedInMemoryDataReader("World"));
Closure* closure = NewCompositeReaderListAndContainerDeleter(parts);
scoped_ptr<DataReader> reader(
    googleapis_client::NewManagedCompositeDataReader(*parts, closure));
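To show why the consumer cannot see fragment boundaries, here is a self-contained sketch of the aggregation idea. ToyCompositeReader is a hypothetical class that joins plain strings rather than other readers; it is only an approximation of the concept and not the library's CompositeDataReader.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch: presents several fragments as one byte sequence.
class ToyCompositeReader {
 public:
  explicit ToyCompositeReader(std::vector<std::string> parts)
      : parts_(std::move(parts)) {}

  bool done() const { return index_ >= parts_.size(); }

  // Fills buffer with up to max_bytes, crossing fragment boundaries as needed.
  std::size_t ReadToBuffer(std::size_t max_bytes, char* buffer) {
    assert(buffer != nullptr);
    std::size_t total = 0;
    while (total < max_bytes && index_ < parts_.size()) {
      const std::string& part = parts_[index_];
      const std::size_t n = std::min(max_bytes - total, part.size() - pos_);
      std::memcpy(buffer + total, part.data() + pos_, n);
      total += n;
      pos_ += n;
      if (pos_ == part.size()) {  // Fragment exhausted; move to the next one.
        ++index_;
        pos_ = 0;
      }
    }
    return total;
  }

 private:
  std::vector<std::string> parts_;
  std::size_t index_ = 0;  // Fragment currently being read.
  std::size_t pos_ = 0;    // Position within that fragment.
};
```

With the fragments "Hello", " ", and "World" and a four-byte buffer, the second read in this sketch spans all three fragments in a single call.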
Available DataReader classes
Data readers are created using free functions defined in
data_reader.h. This section introduces the
different types of available readers. Each comes in both
flavors of managed and unmanaged.
All the DataReaders listed below, other than the InvalidDataReader, are seekable.
- InvalidDataReader
  An InvalidDataReader always fails. It is used in places where a DataReader may be needed but there is no valid data to provide, for example when there was an error obtaining the data. The factory functions may in fact return an InvalidDataReader if they would otherwise fail. If you need to distinguish, check the error() status.
- InMemoryDataReader
  An InMemoryDataReader is a DataReader with an in-memory buffer, including strings. Note that strings can be used even for binary data. InMemoryDataReader can always SetOffset.
- FileDataReader
  A FileDataReader is a DataReader that reads from files. This is probably a better choice than the generic IStreamDataReader if the source is in fact a file. FileDataReader can be expected to SetOffset unless there is some kind of external failure, such as the file being corrupted or an OS-level failure.
- CompositeDataReader
  A CompositeDataReader is a DataReader that aggregates other data readers into a single byte sequence. It is useful for joining fragments together into a single stream. CompositeDataReaders will SetOffset as long as the readers they reference can SetOffset as well.
- IStreamDataReader
  An IStreamDataReader is a DataReader that wraps a std::istream C++ stream. IStreamDataReaders can only SetOffset as reliably as the underlying std::istream can seek to the desired offset.
Adding new types of data readers
To write your own specialized reader you must minimally subclass
DataReader and add the following:
- Provide an implementation of DoReadToBuffer that reads from your data source. The starting position will be the offset() attribute, which is automatically managed by the base class. Your method implementation can read fewer than the requested number of bytes per invocation, either by choice or because it runs out of data. Be sure to call set_done when it finishes or set_status if an error is encountered. Returning 0 does not imply being done or having an error.
- Optionally (but recommended) provide an implementation of DoSetOffset, which permits the reader to start reading from your source at the specified offset. You should not seek past the end, however. When you hit the end, return the ending position even if it is less than the position asked for. If you make your class reliably seekable, also override the seekable() method to indicate that. If you leave the base class method in place, attempts to SetOffset will fail, making your reader single-use.
- Optionally call set_total_length in the constructor. This will be the result of TotalLengthIfKnown, which can be helpful. It is not strictly required.
- Provide Managed and Unmanaged free functions to create your reader. Having these free functions is not required, but is the convention we recommend for consistency. If your factory method cannot create your reader as requested, you can return a New*InvalidDataReader, using either the managed or unmanaged variant as appropriate.
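The steps above can be sketched without the library as follows. ToyDataReader is a simplified, hypothetical base class that only mimics the contract described here (a base-managed offset, DoReadToBuffer, DoSetOffset, set_done); the real signatures live in data_reader.h and may differ. AlphabetReader is an invented example source.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Simplified stand-in for the base-class contract; not the real DataReader.
class ToyDataReader {
 public:
  virtual ~ToyDataReader() {}
  int64_t offset() const { return offset_; }
  bool done() const { return done_; }

  int64_t ReadToBuffer(int64_t max_bytes, char* buffer) {
    if (done_ || max_bytes <= 0) return 0;
    const int64_t n = DoReadToBuffer(max_bytes, buffer);
    offset_ += n;  // The base class, not the subclass, tracks the offset.
    return n;
  }

  // Returns the offset actually reached, or -1 if seeking is unsupported.
  int64_t SetOffset(int64_t position) {
    const int64_t reached = DoSetOffset(position);
    if (reached < 0) return -1;
    offset_ = reached;
    done_ = false;  // The next read decides whether any data remains.
    return reached;
  }

 protected:
  void set_done(bool done) { done_ = done; }
  virtual int64_t DoReadToBuffer(int64_t max_bytes, char* buffer) = 0;
  virtual int64_t DoSetOffset(int64_t /*position*/) { return -1; }

 private:
  int64_t offset_ = 0;
  bool done_ = false;
};

// Hypothetical source: 'length' bytes cycling through 'a'..'z'.
class AlphabetReader : public ToyDataReader {
 public:
  explicit AlphabetReader(int64_t length) : length_(length) {
    assert(length_ >= 0);
  }

 protected:
  int64_t DoReadToBuffer(int64_t max_bytes, char* buffer) override {
    const int64_t n = std::min(max_bytes, length_ - offset());
    for (int64_t i = 0; i < n; ++i) {
      buffer[i] = static_cast<char>('a' + (offset() + i) % 26);
    }
    if (offset() + n >= length_) set_done(true);  // Signal end of content.
    return n;
  }

  // Clamps to the end instead of seeking past it.
  int64_t DoSetOffset(int64_t position) override {
    return std::max<int64_t>(0, std::min(position, length_));
  }

 private:
  const int64_t length_;
};
```

Note how DoSetOffset returns the ending position when asked to go beyond it, matching the guidance in the list above.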
Available DataWriter classes
Data writers are created using free functions
defined in data_writer.h. This section introduces the
different types of available writers. Each comes in both
flavors of managed and unmanaged.
- FileDataWriter
  A FileDataWriter is a DataWriter that stores the byte sequence in a file. These writers create a FileDataReader to provide access to read back the data.
- StringDataWriter
  A StringDataWriter is a DataWriter that stores the byte sequence in an in-memory std::string. If you supply the string to write into, use string.data() rather than string.c_str() when accessing the data written because the writer may be producing binary data rather than text. These writers create an InMemoryDataReader to provide access to read back the data.
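The caution about binary data can be demonstrated with standard C++ alone. In the sketch below, MakeBinaryPayload and LengthAsCString are hypothetical helpers: a std::string holding an embedded NUL byte keeps its full size, while code that treats the same bytes as a C string stops at the first NUL.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical payload: five bytes with a NUL in the middle.
std::string MakeBinaryPayload() {
  return std::string("ab\0cd", 5);
}

// What a C-string consumer would see: the length up to the first NUL.
std::size_t LengthAsCString(const std::string& s) {
  return std::strlen(s.c_str());
}

// Copies the payload correctly by pairing data() with size().
std::string CopyAllBytes(const std::string& s) {
  assert(s.size() <= s.max_size());
  return std::string(s.data(), s.size());
}
```

Always pair the pointer with size() when consuming what a writer produced, so that bytes after an embedded NUL are not silently dropped.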
Adding new types of data writers
To write your own specialized writer you must minimally subclass
DataWriter and add the following:
- Implement DoWrite to append bytes incrementally to your storage.
- Implement DoNewDataReader to create a DataReader that can read the bytes back from your storage. If the method is passed a non-NULL Closure then return a managed reader using the provided closure.
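The two steps above can be sketched as follows. ToyDataWriter and ToyReader are simplified, hypothetical stand-ins that only mirror the DoWrite and DoNewDataReader roles; the Closure argument is omitted for brevity, and the real signatures in data_writer.h may differ. ChunkedWriter is an invented example storage.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical minimal reader handed back by the writer.
class ToyReader {
 public:
  explicit ToyReader(std::string bytes) : bytes_(std::move(bytes)) {}
  // Returns everything not yet read and advances to the end.
  std::string RemainderToString() {
    std::string out = bytes_.substr(offset_);
    offset_ = bytes_.size();
    return out;
  }

 private:
  std::string bytes_;
  std::size_t offset_ = 0;
};

// Simplified stand-in for the writer contract; not the real DataWriter.
class ToyDataWriter {
 public:
  virtual ~ToyDataWriter() {}
  void Write(const char* data, std::size_t size) {
    assert(data != nullptr || size == 0);
    DoWrite(data, size);
    size_ += size;
  }
  std::size_t size() const { return size_; }
  // The writer acts as a factory for readers over what was written.
  std::unique_ptr<ToyReader> NewReader() { return DoNewDataReader(); }

 protected:
  virtual void DoWrite(const char* data, std::size_t size) = 0;
  virtual std::unique_ptr<ToyReader> DoNewDataReader() = 0;

 private:
  std::size_t size_ = 0;
};

// Hypothetical storage: keeps every write as its own chunk.
class ChunkedWriter : public ToyDataWriter {
 protected:
  void DoWrite(const char* data, std::size_t size) override {
    chunks_.emplace_back(data, size);  // Size-based copy, so binary-safe.
  }
  std::unique_ptr<ToyReader> DoNewDataReader() override {
    std::string joined;
    for (const std::string& chunk : chunks_) joined += chunk;
    return std::unique_ptr<ToyReader>(new ToyReader(std::move(joined)));
  }

 private:
  std::vector<std::string> chunks_;
};
```

Each reader in this sketch receives its own copy of the bytes, so several readers can be created from the same writer and consumed independently.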