ParsableByteArray


@UnstableApi
@CheckReturnValue
public final class ParsableByteArray

Wraps a byte array, providing a set of methods for parsing data from it. Numerical values are parsed with the assumption that their constituent bytes are in big endian order.
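To illustrate what big endian order means for the numeric read methods, the following sketch assembles a 32-bit value from four bytes in plain Java. It does not use `ParsableByteArray` itself; the class `EndianSketch` and its method names are hypothetical and exist only for this illustration.

```java
final class EndianSketch {
  private EndianSketch() {}

  /** Combines four bytes into a signed int, most significant byte first (big endian). */
  static int readIntBigEndian(byte[] data, int offset) {
    return ((data[offset] & 0xFF) << 24)
        | ((data[offset + 1] & 0xFF) << 16)
        | ((data[offset + 2] & 0xFF) << 8)
        | (data[offset + 3] & 0xFF);
  }

  /** Combines the same four bytes least significant byte first (little endian). */
  static int readIntLittleEndian(byte[] data, int offset) {
    return (data[offset] & 0xFF)
        | ((data[offset + 1] & 0xFF) << 8)
        | ((data[offset + 2] & 0xFF) << 16)
        | ((data[offset + 3] & 0xFF) << 24);
  }
}
```

For the bytes `00 00 01 02`, the big endian interpretation is 258 (0x00000102), while the little endian interpretation is 0x02010000.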

Summary

Constants
`static final int`	`INVALID_CODE_POINT = 1114112` A value that is outside the valid range of Unicode code points.

Public fields
`byte[]`	`data`
`int`	`position`

Public constructors
`ParsableByteArray()` Creates a new instance that initially has no backing data.
`ParsableByteArray(byte[] data)` Creates a new instance wrapping `data`, and sets the limit to `data.length`.
`ParsableByteArray(int limit)` Creates a new instance with `limit` bytes and sets the limit.
`ParsableByteArray(byte[] data, int limit)` Creates a new instance that wraps an existing array.

Public methods
`int`	`bytesLeft()` Returns the number of bytes yet to be read.
`int`	`capacity()` Returns the capacity of the array, which may be larger than the limit.
`void`	`ensureCapacity(int requiredCapacity)` Ensures the backing array is at least `requiredCapacity` long.
`byte[]`	`getData()` Returns the underlying array.
`int`	`getPosition()` Returns the current offset in the array, in bytes.
`int`	`limit()` Returns the limit.
`char`	`peekChar()` Peeks at the next two bytes and interprets them as a big-endian char.
`char`	`peekChar(Charset charset)` This method is deprecated. Either use `peekChar` to peek the next two bytes (big-endian) or `peekCodePoint` to peek in a `Charset`-aware way.
`int`	`peekCodePoint(Charset charset)` Peeks at the code point starting at `getPosition` as interpreted by `charset`.
`int`	`peekInt()` Peeks the next four bytes as a signed value.
`int`	`peekUnsignedByte()` Peeks at the next byte as an unsigned value.
`int`	`peekUnsignedInt24()` Peeks the next three bytes as an unsigned value.
`void`	`readBytes(ParsableBitArray bitArray, int length)` Reads the next `length` bytes into `bitArray`, and resets the position of `bitArray` to zero.
`void`	`readBytes(ByteBuffer buffer, int length)` Reads the next `length` bytes into `buffer`.
`void`	`readBytes(byte[] buffer, int offset, int length)` Reads the next `length` bytes into `buffer` at `offset`.
`@Nullable String`	`readDelimiterTerminatedString(char delimiter)` Reads up to the next delimiter byte (or the limit) as UTF-8 characters.
`double`	`readDouble()` Reads the next eight bytes as a 64-bit floating point value.
`float`	`readFloat()` Reads the next four bytes as a 32-bit floating point value.
`int`	`readInt()` Reads the next four bytes as a signed value.
`int`	`readInt24()` Reads the next three bytes as a signed value.
`@Nullable String`	`readLine()` Reads a line of text in UTF-8.
`@Nullable String`	`readLine(Charset charset)` Reads a line of text in `charset`.
`int`	`readLittleEndianInt()` Reads the next four bytes as a signed value in little endian order.
`int`	`readLittleEndianInt24()` Reads the next three bytes as a signed value in little endian order.
`long`	`readLittleEndianLong()` Reads the next eight bytes as a signed value in little endian order.
`short`	`readLittleEndianShort()` Reads the next two bytes as a signed value in little endian order.
`long`	`readLittleEndianUnsignedInt()` Reads the next four bytes as an unsigned value in little endian order.
`int`	`readLittleEndianUnsignedInt24()` Reads the next three bytes as an unsigned value in little endian order.
`int`	`readLittleEndianUnsignedIntToInt()` Reads the next four bytes as a little endian unsigned integer into an integer, if the top bit is a zero.
`int`	`readLittleEndianUnsignedShort()` Reads the next two bytes as an unsigned value in little endian order.
`long`	`readLong()` Reads the next eight bytes as a signed value.
`@Nullable String`	`readNullTerminatedString()` Reads up to the next NUL byte (or the limit) as UTF-8 characters.
`String`	`readNullTerminatedString(int length)` Reads the next `length` bytes as UTF-8 characters.
`short`	`readShort()` Reads the next two bytes as a signed value.
`String`	`readString(int length)` Reads the next `length` bytes as UTF-8 characters.
`String`	`readString(int length, Charset charset)` Reads the next `length` bytes as characters in the specified `Charset`.
`int`	`readSynchSafeInt()` Reads a Synchsafe integer.
`int`	`readUnsignedByte()` Reads the next byte as an unsigned value.
`int`	`readUnsignedFixedPoint1616()` Reads the next four bytes, returning the integer portion of the fixed point 16.16 integer.
`long`	`readUnsignedInt()` Reads the next four bytes as an unsigned value.
`int`	`readUnsignedInt24()` Reads the next three bytes as an unsigned value.
`int`	`readUnsignedIntToInt()` Reads the next four bytes as an unsigned integer into an integer, if the top bit is a zero.
`int`	`readUnsignedLeb128ToInt()` Reads an unsigned variable-length LEB128 value into an int.
`long`	`readUnsignedLeb128ToLong()` Reads an unsigned variable-length LEB128 value into a long.
`long`	`readUnsignedLongToLong()` Reads the next eight bytes as an unsigned long into a long, if the top bit is a zero.
`int`	`readUnsignedShort()` Reads the next two bytes as an unsigned value.
`long`	`readUtf8EncodedLong()` Reads a long value that is encoded using UTF-8 encoding.
`@Nullable Charset`	`readUtfCharsetFromBom()` Reads a UTF byte order mark (BOM) and returns the UTF `Charset` it represents.
`void`	`reset(byte[] data)` Updates the instance to wrap `data`, and resets the position to zero and the limit to `data.length`.
`void`	`reset(int limit)` Resets the position to zero and the limit to the specified value.
`void`	`reset(byte[] data, int limit)` Updates the instance to wrap `data`, and resets the position to zero.
`void`	`setLimit(int limit)` Sets the limit.
`void`	`setPosition(int position)` Sets the reading offset in the array.
`static void`	`@VisibleForTesting setShouldEnforceLimitOnLegacyMethods(boolean enforceLimit)` Sets whether all read/peek methods should enforce that `getPosition` never exceeds `limit`.
`void`	`skipBytes(int bytes)` Moves the reading offset by `bytes`.
`void`	`skipLeb128()` Skips a variable-length LEB128 value.

Constants

INVALID_CODE_POINT

public static final int INVALID_CODE_POINT = 1114112

A value that is outside the valid range of Unicode code points. 1114112 is 0x110000, one more than the largest valid code point (U+10FFFF).

Public fields

data

public byte[] data

position

public int position

Public constructors

ParsableByteArray

public ParsableByteArray()

Creates a new instance that initially has no backing data.

ParsableByteArray

public ParsableByteArray(byte[] data)

Creates a new instance wrapping data, and sets the limit to data.length.

Parameters
`byte[] data`	The array to wrap.

ParsableByteArray

public ParsableByteArray(int limit)

Creates a new instance with limit bytes and sets the limit.

Parameters
`int limit`	The limit to set.

ParsableByteArray

public ParsableByteArray(byte[] data, int limit)

Creates a new instance that wraps an existing array.

Parameters
`byte[] data`	The data to wrap.
`int limit`	The limit to set.

Public methods

bytesLeft

public int bytesLeft()

Returns the number of bytes yet to be read.

capacity

public int capacity()

Returns the capacity of the array, which may be larger than the limit.

ensureCapacity

public void ensureCapacity(int requiredCapacity)

Ensures the backing array is at least requiredCapacity long.

position, limit, and all data in the underlying array (including that beyond limit) are preserved.

This might replace or wipe the underlying array, potentially invalidating any local references.

getData

public byte[] getData()

Returns the underlying array.

Changes to this array are reflected in the results of the read...() methods.

This reference must be assumed to become invalid when reset or ensureCapacity are called (because the array might get reallocated).

getPosition

public int getPosition()

Returns the current offset in the array, in bytes.

limit

public int limit()

Returns the limit.

peekChar

public char peekChar()

Peeks at the next two bytes and interprets them as a big-endian char.

peekChar

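public char peekChar(Charset charset)

This method is deprecated. Either use peekChar to peek the next two bytes (big-endian) or peekCodePoint to peek in a Charset-aware way.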

peekCodePoint

public int peekCodePoint(Charset charset)

Peeks at the code point starting at getPosition as interpreted by charset.

The exact behaviour depends on charset:

US_ASCII: Returns the byte at getPosition if it's valid ASCII (less than 0x80), otherwise returns INVALID_CODE_POINT.
UTF-8: If getPosition is the start of a UTF-8 code unit the whole unit is decoded and returned. Otherwise INVALID_CODE_POINT is returned.
UTF-16 (all endian-nesses):
- If getPosition is at the start of a high surrogate code unit and the following two bytes are a low surrogate code unit, the combined code point is returned.
- Otherwise the single code unit starting at getPosition is returned directly.
- UTF-16 has no support for byte-level synchronization, so if getPosition is not aligned with the start of a UTF-16 code unit then the result is undefined.

Throws
`java.lang.IllegalArgumentException`	if charset is not supported. Only US_ASCII, UTF-8, UTF-16, UTF-16BE, and UTF-16LE are supported.
`java.lang.IndexOutOfBoundsException`	if `bytesLeft` doesn't allow reading the smallest code unit in `charset` (1 byte for ASCII and UTF-8, 2 bytes for UTF-16).
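The UTF-16 rules above can be sketched in plain Java for the big-endian case. This is only an illustration of the described behaviour, not the library's implementation; `Utf16PeekSketch`, `peekCodePointUtf16Be`, and the explicit `position` and `limit` parameters are assumptions made for the example.

```java
final class Utf16PeekSketch {
  private Utf16PeekSketch() {}

  /** Reads one big-endian UTF-16 code unit starting at the given index. */
  private static char codeUnitAt(byte[] data, int index) {
    return (char) (((data[index] & 0xFF) << 8) | (data[index + 1] & 0xFF));
  }

  /**
   * Returns the code point starting at position without advancing anything.
   * A high surrogate followed by a low surrogate is combined; otherwise the
   * single code unit is returned as is.
   */
  static int peekCodePointUtf16Be(byte[] data, int position, int limit) {
    if (limit - position < 2) {
      throw new IndexOutOfBoundsException("Fewer than two bytes left");
    }
    char first = codeUnitAt(data, position);
    if (Character.isHighSurrogate(first) && limit - position >= 4) {
      char second = codeUnitAt(data, position + 2);
      if (Character.isLowSurrogate(second)) {
        return Character.toCodePoint(first, second);
      }
    }
    return first;
  }
}
```

With the bytes `D8 3D DE 00` the sketch returns U+1F600, and with `00 41` it returns the code point of `A`.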

peekInt

public int peekInt()

Peeks the next four bytes as a signed value.

peekUnsignedByte

public int peekUnsignedByte()

Peeks at the next byte as an unsigned value.

peekUnsignedInt24

public int peekUnsignedInt24()

Peeks the next three bytes as an unsigned value.

readBytes

public void readBytes(ParsableBitArray bitArray, int length)

Reads the next length bytes into bitArray, and resets the position of bitArray to zero.

Parameters
`ParsableBitArray bitArray`	The `ParsableBitArray` into which the bytes should be read.
`int length`	The number of bytes to write.

readBytes

public void readBytes(ByteBuffer buffer, int length)

Reads the next length bytes into buffer.

Parameters
`ByteBuffer buffer`	The `ByteBuffer` into which the read data should be written.
`int length`	The number of bytes to read.

See also
`ByteBuffer.put`

readBytes

public void readBytes(byte[] buffer, int offset, int length)

Reads the next length bytes into buffer at offset.

Parameters
`byte[] buffer`	The array into which the read data should be written.
`int offset`	The offset in `buffer` at which the read data should be written.
`int length`	The number of bytes to read.

See also
`System.arraycopy`

readDelimiterTerminatedString

public @Nullable String readDelimiterTerminatedString(char delimiter)

Reads up to the next delimiter byte (or the limit) as UTF-8 characters.

Returns
`@Nullable String`	The string not including any terminating delimiter byte, or null if the end of the data has already been reached.

readDouble

public double readDouble()

Reads the next eight bytes as a 64-bit floating point value.

readFloat

public float readFloat()

Reads the next four bytes as a 32-bit floating point value.

readInt

public int readInt()

Reads the next four bytes as a signed value.

readInt24

public int readInt24()

Reads the next three bytes as a signed value.
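Reading three bytes as a signed value requires sign extension from bit 23. A minimal plain-Java sketch of that step follows; `Int24Sketch` and `readInt24At` are hypothetical names, and the code is an illustration rather than the library's implementation.

```java
final class Int24Sketch {
  private Int24Sketch() {}

  /** Reads three big-endian bytes and sign-extends the 24-bit result to an int. */
  static int readInt24At(byte[] data, int offset) {
    int unsigned =
        ((data[offset] & 0xFF) << 16)
            | ((data[offset + 1] & 0xFF) << 8)
            | (data[offset + 2] & 0xFF);
    // Shift bit 23 into the sign bit, then shift back arithmetically.
    return (unsigned << 8) >> 8;
  }
}
```

The bytes `FF FF FF` decode to -1 here, whereas an unsigned 24-bit read of the same bytes would give 16777215.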

readLine

public @Nullable String readLine()

Reads a line of text in UTF-8.

Equivalent to passing UTF_8 to readLine(Charset).

readLine

public @Nullable String readLine(Charset charset)

Reads a line of text in charset.

A line is considered to be terminated by any one of a carriage return ('\r'), a line feed ('\n'), or a carriage return followed immediately by a line feed ('\r\n'). This method discards leading UTF byte order marks (BOM), if present.

The position is advanced to start of the next line (i.e. any line terminators are skipped).

Parameters
`Charset charset`	The charset used to interpret the bytes as a `String`.

Returns
`@Nullable String`	The line not including any line-termination characters, or null if the end of the data has already been reached.

Throws
`java.lang.IllegalArgumentException`	if charset is not supported. Only US_ASCII, UTF-8, UTF-16, UTF-16BE, and UTF-16LE are supported.
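The line-termination rules can be sketched in plain Java for single-byte terminators (ASCII or UTF-8 input). This is a simplified illustration under stated assumptions: `LineReaderSketch` is a hypothetical class, it does not handle UTF-16, and it does not discard byte order marks.

```java
import java.nio.charset.StandardCharsets;

final class LineReaderSketch {
  private final byte[] data;
  private int position;

  LineReaderSketch(byte[] data) {
    this.data = data;
  }

  /** Returns the next line without its terminator, or null once the data is exhausted. */
  String readLine() {
    if (position >= data.length) {
      return null;
    }
    int end = position;
    while (end < data.length && data[end] != '\n' && data[end] != '\r') {
      end++;
    }
    String line = new String(data, position, end - position, StandardCharsets.UTF_8);
    position = end;
    // Skip exactly one terminator: "\r", "\n", or "\r\n".
    if (position < data.length && data[position] == '\r') {
      position++;
    }
    if (position < data.length && data[position] == '\n') {
      position++;
    }
    return line;
  }
}
```

Feeding it `"one\r\ntwo\nthree\rfour"` yields the four lines in order and then null.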

readLittleEndianInt

public int readLittleEndianInt()

Reads the next four bytes as a signed value in little endian order.

readLittleEndianInt24

public int readLittleEndianInt24()

Reads the next three bytes as a signed value in little endian order.

readLittleEndianLong

public long readLittleEndianLong()

Reads the next eight bytes as a signed value in little endian order.

readLittleEndianShort

public short readLittleEndianShort()

Reads the next two bytes as a signed value in little endian order.

readLittleEndianUnsignedInt

public long readLittleEndianUnsignedInt()

Reads the next four bytes as an unsigned value in little endian order.

readLittleEndianUnsignedInt24

public int readLittleEndianUnsignedInt24()

Reads the next three bytes as an unsigned value in little endian order.

readLittleEndianUnsignedIntToInt

public int readLittleEndianUnsignedIntToInt()

Reads the next four bytes as a little endian unsigned integer into an integer, if the top bit is a zero.

Throws
`java.lang.IllegalStateException`	Thrown if the top bit of the input data is set.
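The top-bit check exists because an unsigned 32-bit value with its highest bit set does not fit in a Java `int`. A plain-Java sketch of the idea is shown below; `UnsignedIntToIntSketch` and its method are hypothetical, and the exception message is an assumption.

```java
final class UnsignedIntToIntSketch {
  private UnsignedIntToIntSketch() {}

  /** Reads four little-endian bytes and rejects values whose top bit is set. */
  static int readLittleEndianUnsignedIntToInt(byte[] data, int offset) {
    int result =
        (data[offset] & 0xFF)
            | ((data[offset + 1] & 0xFF) << 8)
            | ((data[offset + 2] & 0xFF) << 16)
            | ((data[offset + 3] & 0xFF) << 24);
    // A set top bit makes the int negative, so the unsigned value is out of range.
    if (result < 0) {
      throw new IllegalStateException("Top bit not zero: " + result);
    }
    return result;
  }
}
```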

readLittleEndianUnsignedShort

public int readLittleEndianUnsignedShort()

Reads the next two bytes as an unsigned value in little endian order.

readLong

public long readLong()

Reads the next eight bytes as a signed value.

readNullTerminatedString

public @Nullable String readNullTerminatedString()

Reads up to the next NUL byte (or the limit) as UTF-8 characters.

Returns
`@Nullable String`	The string not including any terminating NUL byte, or null if the end of the data has already been reached.

readNullTerminatedString

public String readNullTerminatedString(int length)

Reads the next length bytes as UTF-8 characters. A terminating NUL byte is discarded, if present.

Parameters
`int length`	The number of bytes to read.

Returns
`String`	The string, not including any terminating NUL byte.
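As an illustration of the fixed-length variant, the sketch below decodes `length` bytes and drops a single trailing NUL if one is present. It is written in plain Java under assumptions: `NullTerminatedSketch` and `readFixedLength` are hypothetical names, and the sketch returns the string only, without modelling how the reading position moves.

```java
import java.nio.charset.StandardCharsets;

final class NullTerminatedSketch {
  private NullTerminatedSketch() {}

  /** Decodes length bytes as UTF-8, discarding one terminating NUL byte if present. */
  static String readFixedLength(byte[] data, int position, int length) {
    int stringLength = length;
    if (length > 0 && data[position + length - 1] == 0) {
      stringLength--;
    }
    return new String(data, position, stringLength, StandardCharsets.UTF_8);
  }
}
```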

readShort

public short readShort()

Reads the next two bytes as a signed value.

readString

public String readString(int length)

Reads the next length bytes as UTF-8 characters.

Parameters
`int length`	The number of bytes to read.

Returns
`String`	The string encoded by the bytes.

readString

public String readString(int length, Charset charset)

Reads the next length bytes as characters in the specified Charset.

Parameters
`int length`	The number of bytes to read.
`Charset charset`	The character set of the encoded characters.

Returns
`String`	The string encoded by the bytes in the specified character set.

readSynchSafeInt

public int readSynchSafeInt()

Reads a Synchsafe integer.

Synchsafe integers keep the highest bit of every byte zeroed. A 32 bit synchsafe integer can store 28 bits of information.

Returns
`int`	The parsed value.
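Since each byte contributes only its lower seven bits, the decoding can be sketched as four 7-bit groups packed together. The following plain-Java example is an illustration, not the library's code; `SynchsafeSketch` is a hypothetical name, and masking each byte with 0x7F (silently ignoring a set high bit) is a choice made for this sketch.

```java
final class SynchsafeSketch {
  private SynchsafeSketch() {}

  /** Decodes a four-byte synchsafe integer, using seven bits from each byte. */
  static int decode(byte[] data, int offset) {
    int b1 = data[offset] & 0x7F;
    int b2 = data[offset + 1] & 0x7F;
    int b3 = data[offset + 2] & 0x7F;
    int b4 = data[offset + 3] & 0x7F;
    return (b1 << 21) | (b2 << 14) | (b3 << 7) | b4;
  }
}
```

The bytes `00 00 02 01` decode to 257, and `7F 7F 7F 7F` decode to the 28-bit maximum, 268435455.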

readUnsignedByte

public int readUnsignedByte()

Reads the next byte as an unsigned value.

readUnsignedFixedPoint1616

public int readUnsignedFixedPoint1616()

Reads the next four bytes, returning the integer portion of the fixed point 16.16 integer.
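In a 16.16 fixed point value the upper 16 bits hold the integer portion and the lower 16 bits hold the fraction. The plain-Java sketch below shows both views; `FixedPoint1616Sketch` and its methods are hypothetical, and `toDouble` is included only to make the format concrete.

```java
final class FixedPoint1616Sketch {
  private FixedPoint1616Sketch() {}

  /** Returns the integer portion: the first two big-endian bytes as an unsigned value. */
  static int integerPortion(byte[] data, int offset) {
    return ((data[offset] & 0xFF) << 8) | (data[offset + 1] & 0xFF);
  }

  /** Returns the full value, including the fraction held in the last two bytes. */
  static double toDouble(byte[] data, int offset) {
    long raw =
        ((data[offset] & 0xFFL) << 24)
            | ((data[offset + 1] & 0xFFL) << 16)
            | ((data[offset + 2] & 0xFFL) << 8)
            | (data[offset + 3] & 0xFFL);
    return raw / 65536.0;
  }
}
```

For the bytes `00 01 80 00` the full value is 1.5 and the integer portion is 1.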

readUnsignedInt

public long readUnsignedInt()

Reads the next four bytes as an unsigned value.
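The return type is `long` because Java's `int` cannot represent unsigned 32-bit values above 2147483647. A plain-Java sketch of the widening follows; `UnsignedIntSketch` and `readUnsignedIntAt` are hypothetical names used for illustration only.

```java
final class UnsignedIntSketch {
  private UnsignedIntSketch() {}

  /** Reads four big-endian bytes into a long so the result is never negative. */
  static long readUnsignedIntAt(byte[] data, int offset) {
    return ((data[offset] & 0xFFL) << 24)
        | ((data[offset + 1] & 0xFFL) << 16)
        | ((data[offset + 2] & 0xFFL) << 8)
        | (data[offset + 3] & 0xFFL);
  }
}
```

The bytes `FF FF FF FF` give 4294967295 here, while a signed four-byte read of the same bytes would give -1.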

readUnsignedInt24

public int readUnsignedInt24()

Reads the next three bytes as an unsigned value.

readUnsignedIntToInt

public int readUnsignedIntToInt()

Reads the next four bytes as an unsigned integer into an integer, if the top bit is a zero.

Throws
`java.lang.IllegalStateException`	Thrown if the top bit of the input data is set.

readUnsignedLeb128ToInt

public int readUnsignedLeb128ToInt()

Reads an unsigned variable-length LEB128 value into an int.

Returns
`int`	The decoded integer value.

Throws
`java.lang.IllegalArgumentException`	if the read value is greater than `Integer.MAX_VALUE` or less than `Integer.MIN_VALUE`

readUnsignedLeb128ToLong

public long readUnsignedLeb128ToLong()

Reads an unsigned variable-length LEB128 value into a long.

Returns
`long`	The decoded long value.

Throws
`java.lang.IllegalStateException`	if the byte to be read is over the limit of the parsable byte array
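Unsigned LEB128 stores a value in 7-bit groups, least significant group first, with the high bit of each byte signalling that another byte follows. The sketch below decodes such a value in plain Java; `Leb128Sketch`, the explicit `position` and `limit` parameters, the nine-byte cap, and the exception message are assumptions made for illustration.

```java
final class Leb128Sketch {
  private Leb128Sketch() {}

  /** Decodes an unsigned LEB128 value starting at position, reading at most nine bytes. */
  static long decodeUnsigned(byte[] data, int position, int limit) {
    long value = 0;
    // Nine 7-bit groups cover 63 bits, the most a non-negative long can hold.
    for (int i = 0; i < 9; i++) {
      if (position == limit) {
        throw new IllegalStateException("Attempting to read a byte over the limit.");
      }
      long currentByte = data[position++] & 0xFF;
      value |= (currentByte & 0x7F) << (i * 7);
      if ((currentByte & 0x80) == 0) {
        break; // Continuation bit clear: this was the last byte.
      }
    }
    return value;
  }
}
```

The bytes `E5 8E 26` decode to 624485.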

readUnsignedLongToLong

public long readUnsignedLongToLong()

Reads the next eight bytes as an unsigned long into a long, if the top bit is a zero.

Throws
`java.lang.IllegalStateException`	Thrown if the top bit of the input data is set.

readUnsignedShort

public int readUnsignedShort()

Reads the next two bytes as an unsigned value.

readUtf8EncodedLong

public long readUtf8EncodedLong()

Reads a long value that is encoded using UTF-8 encoding.

Returns
`long`	Decoded long value

Throws
`java.lang.NumberFormatException`	if there is a problem with decoding

readUtfCharsetFromBom

public @Nullable Charset readUtfCharsetFromBom()

Reads a UTF byte order mark (BOM) and returns the UTF Charset it represents. Returns null without advancing position if no BOM is found.
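BOM detection amounts to comparing the leading bytes against the well-known marks for UTF-8 (`EF BB BF`), UTF-16BE (`FE FF`), and UTF-16LE (`FF FE`). The plain-Java sketch below shows that comparison only; `BomSketch` and `detect` are hypothetical names, and unlike the method documented here the sketch does not advance any position.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class BomSketch {
  private BomSketch() {}

  /** Returns the charset indicated by a BOM at position, or null if none is found. */
  static Charset detect(byte[] data, int position, int limit) {
    int bytesLeft = limit - position;
    if (bytesLeft >= 3
        && (data[position] & 0xFF) == 0xEF
        && (data[position + 1] & 0xFF) == 0xBB
        && (data[position + 2] & 0xFF) == 0xBF) {
      return StandardCharsets.UTF_8;
    }
    if (bytesLeft >= 2) {
      int first = data[position] & 0xFF;
      int second = data[position + 1] & 0xFF;
      if (first == 0xFE && second == 0xFF) {
        return StandardCharsets.UTF_16BE;
      }
      if (first == 0xFF && second == 0xFE) {
        return StandardCharsets.UTF_16LE;
      }
    }
    return null;
  }
}
```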

reset

public void reset(byte[] data)

Updates the instance to wrap data, and resets the position to zero and the limit to data.length.

Parameters
`byte[] data`	The array to wrap.

reset

public void reset(int limit)

Resets the position to zero and the limit to the specified value. This might replace or wipe the underlying array, potentially invalidating any local references.

Parameters
`int limit`	The limit to set.

reset

public void reset(byte[] data, int limit)

Updates the instance to wrap data, and resets the position to zero.

Parameters
`byte[] data`	The array to wrap.
`int limit`	The limit to set.

setLimit

public void setLimit(int limit)

Sets the limit.

Parameters
`int limit`	The limit to set.

setPosition

public void setPosition(int position)

Sets the reading offset in the array.

Parameters
`int position`	Byte offset in the array from which to read.

Throws
`java.lang.IllegalArgumentException`	Thrown if the new position is neither in nor at the end of the array.

setShouldEnforceLimitOnLegacyMethods

@VisibleForTesting
public static void setShouldEnforceLimitOnLegacyMethods(boolean enforceLimit)

Sets whether all read/peek methods should enforce that getPosition never exceeds limit.

Setting this to true in tests can help catch cases of accidentally reading beyond limit but still within the bounds of the underlying getData.

Some (newer) methods will always enforce the invariant, even when this is set to false.

Defaults to false (this may change in a later release).

skipBytes

public void skipBytes(int bytes)

Moves the reading offset by bytes.

Parameters
`int bytes`	The number of bytes to skip.

Throws
`java.lang.IllegalArgumentException`	Thrown if the new position is neither in nor at the end of the array.

skipLeb128

public void skipLeb128()

Skips a variable-length LEB128 value.