FlatBuffers
An open source project by FPL.
com.google.flatbuffers.Utf8Safe Class Reference


Inherits com.google.flatbuffers.Utf8.

Detailed Description

A set of low-level, high-performance static utility methods related to the UTF-8 character encoding.

This class has no dependencies outside of the core JDK libraries.

There are several variants of UTF-8. The one implemented by this class is the restricted definition of UTF-8 introduced in Unicode 3.1, which mandates the rejection of "overlong" byte sequences as well as rejection of 3-byte surrogate codepoint byte sequences. Note that the UTF-8 decoder included in Oracle's JDK has been modified to also reject "overlong" byte sequences, but (as of 2011) still accepts 3-byte surrogate codepoint byte sequences.

The byte sequences considered valid by this class are exactly those that can be roundtrip converted to Strings and back to bytes using the UTF-8 charset, without loss:

Arrays.equals(bytes, new String(bytes, Internal.UTF_8).getBytes(Internal.UTF_8))

See the Unicode Standard, Table 3-6 (UTF-8 Bit Distribution) and Table 3-7 (Well-Formed UTF-8 Byte Sequences).
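
The round-trip property above can be tried out with the JDK alone. The following is a minimal sketch, not part of FlatBuffers: it substitutes `StandardCharsets.UTF_8` for `Internal.UTF_8`, and the class name `RoundTripCheck` and method `roundTrips` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class RoundTripCheck {
    // True when decoding and re-encoding reproduces the input bytes exactly.
    static boolean roundTrips(byte[] bytes) {
        String decoded = new String(bytes, StandardCharsets.UTF_8);
        return Arrays.equals(bytes, decoded.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        byte[] wellFormed = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // Overlong two-byte encoding of U+0000.
        byte[] overlong = {(byte) 0xC0, (byte) 0x80};
        // Three-byte encoding of the surrogate code point U+D800.
        byte[] surrogate = {(byte) 0xED, (byte) 0xA0, (byte) 0x80};

        System.out.println(roundTrips(wellFormed)); // true
        System.out.println(roundTrips(overlong));   // false
        System.out.println(roundTrips(surrogate));  // false
    }
}
```

The two ill-formed inputs fail the check because the JDK substitutes replacement characters while decoding or encoding, so the re-encoded bytes differ from the input.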

Classes

class  UnpairedSurrogateException
 

Public Member Functions

String decodeUtf8 (ByteBuffer buffer, int offset, int length) throws IllegalArgumentException
 Decodes the given UTF-8 portion of the ByteBuffer into a String.
 
int encodedLength (CharSequence in)
 Returns the number of bytes in the UTF-8-encoded form of the given sequence.
 
void encodeUtf8 (CharSequence in, ByteBuffer out)
 Encodes the given characters to the target ByteBuffer using UTF-8 encoding.
 

Static Public Member Functions

static String decodeUtf8Array (byte[] bytes, int index, int size)
 
static String decodeUtf8Buffer (ByteBuffer buffer, int offset, int length)
 
Static Public Member Functions inherited from com.google.flatbuffers.Utf8
static int encodeUtf8CodePoint (CharSequence in, int start, byte[] out)
 Encodes a code point from a Java CharSequence as UTF-8 into a byte array.
 
static Utf8 getDefault ()
 Gets the default UTF-8 processor.
 
static void setDefault (Utf8 instance)
 Sets the default instance of the UTF-8 processor.
 

Member Function Documentation

◆ decodeUtf8()

String com.google.flatbuffers.Utf8Safe.decodeUtf8 (ByteBuffer buffer, int offset, int length) throws IllegalArgumentException

Decodes the given UTF-8 portion of the ByteBuffer into a String.

Exceptions
IllegalArgumentException  if the input is not valid UTF-8.

Reimplemented from com.google.flatbuffers.Utf8.
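
For comparison, the same contract (decode a region of a buffer, reject invalid input with an IllegalArgumentException) can be approximated with the JDK's own `CharsetDecoder` set to report errors. This is a sketch under stated assumptions, not the FlatBuffers implementation: `StrictDecodeSketch` and `strictDecode` are hypothetical names, and `offset` is assumed to be an absolute index into the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

final class StrictDecodeSketch {
    // Decodes bytes [offset, offset + length) of the buffer, failing on malformed input.
    static String strictDecode(ByteBuffer buffer, int offset, int length) {
        // Work on a duplicate so the caller's position and limit stay untouched.
        ByteBuffer view = buffer.duplicate();
        view.limit(offset + length);
        view.position(offset);
        try {
            return StandardCharsets.UTF_8.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(view)
                    .toString();
        } catch (CharacterCodingException e) {
            // Mirror the documented exception type for invalid UTF-8.
            throw new IllegalArgumentException("Input is not valid UTF-8", e);
        }
    }

    public static void main(String[] args) {
        ByteBuffer ok = ByteBuffer.wrap("xxh\u00e9llo".getBytes(StandardCharsets.UTF_8));
        System.out.println(strictDecode(ok, 2, 6));

        ByteBuffer bad = ByteBuffer.wrap(new byte[] {(byte) 0xC0, (byte) 0x80});
        try {
            strictDecode(bad, 0, 2);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```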

◆ encodedLength()

int com.google.flatbuffers.Utf8Safe.encodedLength (CharSequence sequence)

Returns the number of bytes in the UTF-8-encoded form of sequence. For a string, this method is equivalent to string.getBytes(UTF_8).length, but is more efficient in both time and space.

Exceptions
IllegalArgumentException  if sequence contains ill-formed UTF-16 (unpaired surrogates)

Reimplemented from com.google.flatbuffers.Utf8.
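
To illustrate why the length can be computed without allocating a byte array, here is a minimal sketch of one way to count UTF-8 bytes directly from UTF-16 code units. It is not the FlatBuffers implementation; `Utf8LengthSketch` and `utf8Length` are hypothetical names, and the real method may use a different, more optimized loop.

```java
import java.nio.charset.StandardCharsets;

final class Utf8LengthSketch {
    // Counts the UTF-8 bytes needed for the sequence without building a byte array.
    static int utf8Length(CharSequence sequence) {
        int bytes = 0;
        for (int i = 0; i < sequence.length(); i++) {
            char c = sequence.charAt(i);
            if (c < 0x80) {
                bytes += 1;                       // ASCII
            } else if (c < 0x800) {
                bytes += 2;                       // two-byte form
            } else if (Character.isSurrogate(c)) {
                boolean paired = Character.isHighSurrogate(c)
                        && i + 1 < sequence.length()
                        && Character.isLowSurrogate(sequence.charAt(i + 1));
                if (!paired) {
                    throw new IllegalArgumentException("Unpaired surrogate at index " + i);
                }
                bytes += 4;                       // supplementary code point
                i++;                              // skip the low surrogate
            } else {
                bytes += 3;                       // rest of the Basic Multilingual Plane
            }
        }
        return bytes;
    }

    public static void main(String[] args) {
        String s = "h\u00e9llo \u20ac \uD83D\uDE00";
        // Both numbers should agree for well-formed input.
        System.out.println(utf8Length(s));
        System.out.println(s.getBytes(StandardCharsets.UTF_8).length);
    }
}
```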

◆ encodeUtf8()

void com.google.flatbuffers.Utf8Safe.encodeUtf8 (CharSequence in, ByteBuffer out)

Encodes the given characters to the target ByteBuffer using UTF-8 encoding.

Selects an optimal algorithm based on the type of ByteBuffer (i.e. heap or direct) and the capabilities of the platform.

Parameters
in  the source string to be encoded
out  the target buffer to receive the encoded string

Reimplemented from com.google.flatbuffers.Utf8.
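
The idea of choosing a write path by buffer type can be sketched as follows. This is an illustration only, under the assumption that "heap" buffers are detected with `ByteBuffer.hasArray()`; the class `EncodeDispatchSketch` and all of its methods are hypothetical and do not reflect how FlatBuffers actually selects its algorithm.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class EncodeDispatchSketch {
    // Hypothetical entry point: picks a write path based on the buffer kind.
    static void encode(CharSequence in, ByteBuffer out) {
        if (out.hasArray()) {
            encodeHeap(in, out);
        } else {
            encodeGeneric(in, out);
        }
    }

    // Heap path: write straight into the backing array, then advance the position once.
    private static void encodeHeap(CharSequence in, ByteBuffer out) {
        byte[] array = out.array();
        int base = out.arrayOffset();
        int pos = out.position();
        for (int i = 0; i < in.length(); ) {
            int cp = nextCodePoint(in, i);
            i += Character.charCount(cp);
            if (pos + utf8Size(cp) > out.limit()) {
                throw new BufferOverflowException();
            }
            pos = putCodePoint(cp, array, base + pos) - base;
        }
        out.position(pos);
    }

    // Generic path (e.g. direct buffers): stage each code point, then do a relative bulk put.
    private static void encodeGeneric(CharSequence in, ByteBuffer out) {
        byte[] scratch = new byte[4];
        for (int i = 0; i < in.length(); ) {
            int cp = nextCodePoint(in, i);
            i += Character.charCount(cp);
            out.put(scratch, 0, putCodePoint(cp, scratch, 0));
        }
    }

    // Reads the code point at index and rejects unpaired surrogates.
    private static int nextCodePoint(CharSequence in, int index) {
        int cp = Character.codePointAt(in, index);
        if (cp >= Character.MIN_SURROGATE && cp <= Character.MAX_SURROGATE) {
            throw new IllegalArgumentException("Unpaired surrogate at index " + index);
        }
        return cp;
    }

    private static int utf8Size(int cp) {
        return cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    }

    // Writes one code point as UTF-8 starting at pos and returns the next free index.
    private static int putCodePoint(int cp, byte[] dst, int pos) {
        if (cp < 0x80) {
            dst[pos++] = (byte) cp;
        } else if (cp < 0x800) {
            dst[pos++] = (byte) (0xC0 | (cp >>> 6));
            dst[pos++] = (byte) (0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            dst[pos++] = (byte) (0xE0 | (cp >>> 12));
            dst[pos++] = (byte) (0x80 | ((cp >>> 6) & 0x3F));
            dst[pos++] = (byte) (0x80 | (cp & 0x3F));
        } else {
            dst[pos++] = (byte) (0xF0 | (cp >>> 18));
            dst[pos++] = (byte) (0x80 | ((cp >>> 12) & 0x3F));
            dst[pos++] = (byte) (0x80 | ((cp >>> 6) & 0x3F));
            dst[pos++] = (byte) (0x80 | (cp & 0x3F));
        }
        return pos;
    }

    public static void main(String[] args) {
        String s = "h\u00e9\u20ac\uD83D\uDE00";
        byte[] expected = s.getBytes(StandardCharsets.UTF_8);
        for (ByteBuffer out : new ByteBuffer[] {ByteBuffer.allocate(16), ByteBuffer.allocateDirect(16)}) {
            encode(s, out);
            out.flip();
            byte[] actual = new byte[out.remaining()];
            out.get(actual);
            System.out.println(out.isDirect() + " " + Arrays.equals(expected, actual));
        }
    }
}
```

Both paths advance the buffer's position by the number of bytes written, so a caller can `flip()` the buffer afterwards to read the encoded bytes back.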

