FlatBuffers
An open source project by FPL.
A set of low-level, high-performance static utility methods related to the UTF-8 character encoding.
This class has no dependencies outside of the core JDK libraries.
There are several variants of UTF-8. The one implemented by this class is the restricted definition of UTF-8 introduced in Unicode 3.1, which mandates the rejection of "overlong" byte sequences as well as rejection of 3-byte surrogate codepoint byte sequences. Note that the UTF-8 decoder included in Oracle's JDK has been modified to also reject "overlong" byte sequences, but (as of 2011) still accepts 3-byte surrogate codepoint byte sequences.
The byte sequences considered valid by this class are exactly those that can be converted to a String and back to bytes using the UTF-8 charset without loss, that is, those for which the following expression is true:
Arrays.equals(bytes, new String(bytes, Internal.UTF_8).getBytes(Internal.UTF_8))
See the Unicode Standard, Table 3-6 (UTF-8 Bit Distribution) and Table 3-7 (Well-Formed UTF-8 Byte Sequences).
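The round-trip criterion above can be tried out with nothing but the JDK. The following is a minimal sketch, not part of this class: it assumes `java.nio.charset.StandardCharsets.UTF_8` as a stand-in for `Internal.UTF_8`, and the class name `Utf8RoundTrip` and method `roundTrips` are hypothetical names chosen for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper illustrating the round-trip validity criterion.
final class Utf8RoundTrip {
  /** Returns true if the bytes survive a decode/encode round trip unchanged. */
  static boolean roundTrips(byte[] bytes) {
    String decoded = new String(bytes, StandardCharsets.UTF_8);
    return Arrays.equals(bytes, decoded.getBytes(StandardCharsets.UTF_8));
  }

  public static void main(String[] args) {
    byte[] valid = {(byte) 0xC3, (byte) 0xA9};                  // U+00E9, well-formed
    byte[] overlong = {(byte) 0xC0, (byte) 0x80};               // overlong form of U+0000
    byte[] surrogate = {(byte) 0xED, (byte) 0xA0, (byte) 0x80}; // 3-byte form of U+D800

    System.out.println(roundTrips(valid));     // true
    System.out.println(roundTrips(overlong));  // false
    System.out.println(roundTrips(surrogate)); // false
  }
}
```

The overlong and surrogate inputs fail the check because the bytes produced by re-encoding differ from the original bytes, which is exactly the kind of input the restricted UTF-8 definition rejects.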
Classes
class | UnpairedSurrogateException |
Public Member Functions
String | decodeUtf8 (ByteBuffer buffer, int offset, int length) throws IllegalArgumentException |
Decodes the given UTF-8 portion of the ByteBuffer into a String.
int | encodedLength (CharSequence in) |
Returns the number of bytes in the UTF-8-encoded form of the given sequence.
void | encodeUtf8 (CharSequence in, ByteBuffer out) |
Encodes the given characters to the target ByteBuffer using UTF-8 encoding.
Static Public Member Functions
static String | decodeUtf8Array (byte[] bytes, int index, int size) |
static String | decodeUtf8Buffer (ByteBuffer buffer, int offset, int length) |
Static Public Member Functions inherited from com.google.flatbuffers.Utf8
static int | encodeUtf8CodePoint (CharSequence in, int start, byte[] out) |
Encodes a code point from a Java CharSequence into a byte array as UTF-8.
static Utf8 | getDefault () |
Gets the default UTF-8 processor.
static void | setDefault (Utf8 instance) |
Sets the default instance of the UTF-8 processor.
String decodeUtf8 (ByteBuffer buffer, int offset, int length) throws IllegalArgumentException | inline |
Decodes the given UTF-8 portion of the ByteBuffer into a String.
IllegalArgumentException | if the input is not valid UTF-8. |
Reimplemented from com.google.flatbuffers.Utf8.
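The contract described above (decode a region of a ByteBuffer, throw IllegalArgumentException on invalid UTF-8) can be approximated with the JDK's own `CharsetDecoder`. This is a sketch under stated assumptions and not this class's implementation: `StrictDecodeSketch` and `decodeStrict` are hypothetical names, and `offset` is assumed to be an absolute index into the buffer.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical JDK-only stand-in for a strict decodeUtf8-style method.
final class StrictDecodeSketch {
  static String decodeStrict(ByteBuffer buffer, int offset, int length) {
    // Work on a duplicate so the caller's position and limit are untouched.
    ByteBuffer region = buffer.duplicate();
    region.limit(offset + length);
    region.position(offset);
    try {
      return StandardCharsets.UTF_8.newDecoder()
          .onMalformedInput(CodingErrorAction.REPORT)      // fail instead of substituting U+FFFD
          .onUnmappableCharacter(CodingErrorAction.REPORT)
          .decode(region)
          .toString();
    } catch (CharacterCodingException e) {
      throw new IllegalArgumentException("Invalid UTF-8", e);
    }
  }

  public static void main(String[] args) {
    ByteBuffer buf = ByteBuffer.wrap("xxh\u00e9llo".getBytes(StandardCharsets.UTF_8));
    // Skip the two leading 'x' bytes and decode the remaining six bytes.
    System.out.println(decodeStrict(buf, 2, 6));
  }
}
```

Using `CodingErrorAction.REPORT` is what makes the decoder strict; the default behavior of `new String(bytes, charset)` silently replaces malformed input instead.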
int encodedLength (CharSequence in) | inline |
Returns the number of bytes in the UTF-8-encoded form of the given sequence. For a string, this method is equivalent to string.getBytes(UTF_8).length, but is more efficient in both time and space.
IllegalArgumentException | if sequence contains ill-formed UTF-16 (unpaired surrogates) |
Reimplemented from com.google.flatbuffers.Utf8.
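To show why the length can be computed without allocating a byte array, here is a minimal sketch that counts bytes per UTF-16 code unit. It is an illustration only and not this class's code: `EncodedLengthSketch` is a hypothetical name, and the exception messages are assumptions.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration of counting UTF-8 bytes without encoding.
final class EncodedLengthSketch {
  static int encodedLength(CharSequence in) {
    long bytes = 0; // long, so the sum cannot overflow before the final check
    int n = in.length();
    for (int i = 0; i < n; i++) {
      char c = in.charAt(i);
      if (c < 0x80) {
        bytes += 1; // ASCII
      } else if (c < 0x800) {
        bytes += 2;
      } else if (Character.isHighSurrogate(c)) {
        // A high surrogate must be followed by a low surrogate.
        if (i + 1 >= n || !Character.isLowSurrogate(in.charAt(i + 1))) {
          throw new IllegalArgumentException("Unpaired surrogate at index " + i);
        }
        bytes += 4; // one supplementary code point
        i++;        // skip the low surrogate
      } else if (Character.isLowSurrogate(c)) {
        throw new IllegalArgumentException("Unpaired surrogate at index " + i);
      } else {
        bytes += 3;
      }
    }
    if (bytes > Integer.MAX_VALUE) {
      throw new IllegalArgumentException("UTF-8 length does not fit in int: " + bytes);
    }
    return (int) bytes;
  }

  public static void main(String[] args) {
    String s = "a\u00e9\u20ac\uD83D\uDE00"; // 1 + 2 + 3 + 4 bytes
    System.out.println(encodedLength(s));                            // 10
    System.out.println(s.getBytes(StandardCharsets.UTF_8).length);   // 10
  }
}
```

The loop touches each code unit once and allocates nothing, which is the time and space advantage over calling getBytes and reading the array length.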
void encodeUtf8 (CharSequence in, ByteBuffer out) | inline |
Encodes the given characters to the target ByteBuffer using UTF-8 encoding.
Selects an optimal algorithm based on the type of ByteBuffer (i.e. heap or direct) and the capabilities of the platform.
in | the source string to be encoded |
out | the target buffer to receive the encoded string. |
Reimplemented from com.google.flatbuffers.Utf8.
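As a rough illustration of encoding a CharSequence into either a heap or a direct ByteBuffer, the sketch below uses the JDK's `CharsetEncoder`. It does not reproduce this class's algorithm selection: `EncodeSketch` and `encode` are hypothetical names, and the exceptions thrown on overflow or ill-formed input are choices made for this sketch only.

```java
import java.nio.BufferOverflowException;
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical JDK-only stand-in for an encodeUtf8-style method.
final class EncodeSketch {
  static void encode(CharSequence in, ByteBuffer out) {
    CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
        .onMalformedInput(CodingErrorAction.REPORT)
        .onUnmappableCharacter(CodingErrorAction.REPORT);
    // Writes at out's current position and advances it past the encoded bytes.
    CoderResult result = encoder.encode(CharBuffer.wrap(in), out, true);
    if (result.isUnderflow()) {
      result = encoder.flush(out);
    }
    if (result.isOverflow()) {
      throw new BufferOverflowException(); // sketch's choice: target too small
    }
    if (result.isError()) {
      throw new IllegalArgumentException("Ill-formed UTF-16 input");
    }
  }

  public static void main(String[] args) {
    ByteBuffer heap = ByteBuffer.allocate(16);
    ByteBuffer direct = ByteBuffer.allocateDirect(16);
    encode("h\u00e9llo", heap);
    encode("h\u00e9llo", direct);
    // Both buffers received the same six bytes.
    System.out.println(heap.position() + " " + direct.position()); // 6 6
  }
}
```

The same call works for both buffer kinds because the encoder writes through the ByteBuffer interface; a specialized implementation may instead branch on the buffer type to write into the backing array directly.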