ParsableByteArray
@UnstableApi
@CheckReturnValue
public final class ParsableByteArray
Wraps a byte array, providing a set of methods for parsing data from it. Numerical values are parsed with the assumption that their constituent bytes are in big endian order.
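As a rough illustration of what big endian parsing means here, the sketch below reads multi-byte values most significant byte first from a plain byte array. The class `BigEndianSketch` and its methods are hypothetical stand-ins written for this example; they are not the Media3 implementation and perform no limit checks.

```java
/** Hypothetical stand-in for illustration only; not the Media3 class. */
final class BigEndianSketch {
    private final byte[] data;
    private int position;

    BigEndianSketch(byte[] data) {
        this.data = data;
    }

    /** Reads four bytes, most significant byte first, as a signed int. */
    int readInt() {
        return (data[position++] & 0xFF) << 24
                | (data[position++] & 0xFF) << 16
                | (data[position++] & 0xFF) << 8
                | (data[position++] & 0xFF);
    }

    /** Reads two bytes, most significant byte first, as an unsigned value. */
    int readUnsignedShort() {
        return (data[position++] & 0xFF) << 8 | (data[position++] & 0xFF);
    }

    /** Number of bytes between the read position and the end of the array. */
    int bytesLeft() {
        return data.length - position;
    }

    public static void main(String[] args) {
        BigEndianSketch s = new BigEndianSketch(
                new byte[] {0x00, 0x00, 0x01, 0x02, (byte) 0xFF, (byte) 0xFE});
        System.out.println(s.readInt());           // prints 258
        System.out.println(s.readUnsignedShort()); // prints 65534
    }
}
```

Masking each byte with `0xFF` before shifting avoids sign extension, which would otherwise corrupt values whose bytes have the top bit set.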
Summary
| Constants | | |
|---|---|---|
| static final int | INVALID_CODE_POINT = 1114112 | A value that is outside the valid range of Unicode code points. |
| Public constructors | |
|---|---|
| ParsableByteArray() | Creates a new instance that initially has no backing data. |
| ParsableByteArray(byte[] data) | Creates a new instance wrapping data, and sets the limit to data.length. |
| ParsableByteArray(int limit) | Creates a new instance with limit bytes and sets the limit. |
| ParsableByteArray(byte[] data, int limit) | Creates a new instance that wraps an existing array. |
| Public methods | | |
|---|---|---|
| int | | Returns the number of bytes yet to be read. |
| int | capacity() | Returns the capacity of the array, which may be larger than the limit. |
| void | ensureCapacity(int requiredCapacity) | Ensures the backing array is at least requiredCapacity long. |
| byte[] | getData() | Returns the underlying array. |
| int | | Returns the current offset in the array, in bytes. |
| int | limit() | Returns the limit. |
| char | peekChar() | Peeks at the next two bytes and interprets them as a big-endian char. |
| char | | This method is deprecated. Either use |
| int | peekCodePoint(Charset charset) | Peeks at the code point starting at getPosition as interpreted by charset. |
| int | peekInt() | Peeks the next four bytes as a signed value. |
| int | | Peeks at the next byte as an unsigned value. |
| int | | Peeks the next three bytes as an unsigned value. |
| void | readBytes(ParsableBitArray bitArray, int length) | Reads the next length bytes into bitArray, and resets the position of bitArray to zero. |
| void | readBytes(ByteBuffer buffer, int length) | Reads the next length bytes into buffer. |
| void | readBytes(byte[] buffer, int offset, int length) | Reads the next length bytes into buffer at offset. |
| @Nullable String | readDelimiterTerminatedString(char delimiter) | Reads up to the next delimiter byte (or the limit) as UTF-8 characters. |
| double | | Reads the next eight bytes as a 64-bit floating point value. |
| float | | Reads the next four bytes as a 32-bit floating point value. |
| int | readInt() | Reads the next four bytes as a signed value. |
| int | | Reads the next three bytes as a signed value. |
| @Nullable String | readLine() | Reads a line of text in UTF-8. |
| @Nullable String | readLine(Charset charset) | Reads a line of text in charset. |
| int | readLittleEndianInt() | Reads the next four bytes as a signed value in little endian order. |
| int | readLittleEndianInt24() | Reads the next three bytes as a signed value in little endian order. |
| long | readLittleEndianLong() | Reads the next eight bytes as a signed value in little endian order. |
| short | readLittleEndianShort() | Reads the next two bytes as a signed value in little endian order. |
| long | readLittleEndianUnsignedInt() | Reads the next four bytes as an unsigned value in little endian order. |
| int | readLittleEndianUnsignedInt24() | Reads the next three bytes as an unsigned value in little endian order. |
| int | readLittleEndianUnsignedIntToInt() | Reads the next four bytes as a little endian unsigned integer into an integer, if the top bit is a zero. |
| int | readLittleEndianUnsignedShort() | Reads the next two bytes as an unsigned value in little endian order. |
| long | readLong() | Reads the next eight bytes as a signed value. |
| @Nullable String | readNullTerminatedString() | Reads up to the next NUL byte (or the limit) as UTF-8 characters. |
| String | readNullTerminatedString(int length) | Reads the next length bytes as UTF-8 characters. |
| short | | Reads the next two bytes as a signed value. |
| String | readString(int length) | Reads the next length bytes as UTF-8 characters. |
| String | readString(int length, Charset charset) | Reads the next length bytes as characters in the specified Charset. |
| int | readSynchSafeInt() | Reads a Synchsafe integer. |
| int | | Reads the next byte as an unsigned value. |
| int | readUnsignedFixedPoint1616() | Reads the next four bytes, returning the integer portion of the fixed point 16.16 integer. |
| long | | Reads the next four bytes as an unsigned value. |
| int | | Reads the next three bytes as an unsigned value. |
| int | readUnsignedIntToInt() | Reads the next four bytes as an unsigned integer into an integer, if the top bit is a zero. |
| int | readUnsignedLeb128ToInt() | Reads an unsigned variable-length LEB128 value into an int. |
| long | readUnsignedLeb128ToLong() | Reads an unsigned variable-length LEB128 value into a long. |
| long | readUnsignedLongToLong() | Reads the next eight bytes as an unsigned long into a long, if the top bit is a zero. |
| int | | Reads the next two bytes as an unsigned value. |
| long | readUtf8EncodedLong() | Reads a long value encoded by UTF-8 encoding. |
| @Nullable Charset | readUtfCharsetFromBom() | Reads a UTF byte order mark (BOM) and returns the UTF Charset it represents. |
| void | reset(byte[] data) | Updates the instance to wrap data, and resets the position to zero and the limit to data.length. |
| void | reset(int limit) | Resets the position to zero and the limit to the specified value. |
| void | reset(byte[] data, int limit) | Updates the instance to wrap data, and resets the position to zero. |
| void | setLimit(int limit) | Sets the limit. |
| void | setPosition(int position) | Sets the reading offset in the array. |
| static void | @VisibleForTesting setShouldEnforceLimitOnLegacyMethods(boolean enforceLimit) | Sets whether all read/peek methods should enforce that getPosition never exceeds limit. |
| void | skipBytes(int bytes) | Moves the reading offset by bytes. |
| void | | Skips a variable-length LEB128 value. |
Constants
INVALID_CODE_POINT
public static final int INVALID_CODE_POINT = 1114112
A value that is outside the valid range of Unicode code points.
Public constructors
ParsableByteArray
public ParsableByteArray()
Creates a new instance that initially has no backing data.
ParsableByteArray
public ParsableByteArray(byte[] data)
Creates a new instance wrapping data, and sets the limit to data.length.
| Parameters | |
|---|---|
| byte[] data | The array to wrap. |
ParsableByteArray
public ParsableByteArray(int limit)
Creates a new instance with limit bytes and sets the limit.
| Parameters | |
|---|---|
| int limit | The limit to set. |
ParsableByteArray
public ParsableByteArray(byte[] data, int limit)
Creates a new instance that wraps an existing array.
| Parameters | |
|---|---|
| byte[] data | The data to wrap. |
| int limit | The limit to set. |
Public methods
capacity
public int capacity()
Returns the capacity of the array, which may be larger than the limit.
ensureCapacity
public void ensureCapacity(int requiredCapacity)
Ensures the backing array is at least requiredCapacity long.
position, limit, and all data in the underlying array (including that beyond limit) are preserved.
This might replace or wipe the underlying array, potentially invalidating any local references.
getData
public byte[] getData()
Returns the underlying array.
Changes to this array are reflected in the results of the read...() methods.
This reference must be assumed to become invalid when reset or ensureCapacity is called (because the array might be reallocated).
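The invalidation warning above can be pictured with a minimal sketch. `StaleReferenceSketch` is a hypothetical class for this example only; it assumes growth happens by copying into a new array, which is one common way such a reallocation might work and is not confirmed to be how the library does it.

```java
import java.util.Arrays;

/** Hypothetical illustration of why a cached array reference can go stale. */
final class StaleReferenceSketch {
    private byte[] data = new byte[4];

    byte[] getData() {
        return data;
    }

    /** Grows the backing array by copying, which replaces the array object. */
    void ensureCapacity(int requiredCapacity) {
        if (requiredCapacity > data.length) {
            data = Arrays.copyOf(data, requiredCapacity);
        }
    }

    public static void main(String[] args) {
        StaleReferenceSketch s = new StaleReferenceSketch();
        byte[] cached = s.getData();
        s.ensureCapacity(16);
        // The cached reference still points at the old, smaller array.
        System.out.println(cached == s.getData()); // prints false
    }
}
```

Under that assumption, writes through the cached reference would no longer be visible to later reads, so it is safer to call `getData()` again after any call that may reallocate.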
peekChar
public char peekChar()
Peeks at the next two bytes and interprets them as a big-endian char.
peekCodePoint
public int peekCodePoint(Charset charset)
Peeks at the code point starting at getPosition as interpreted by charset.
The exact behaviour depends on charset:
- US_ASCII: Returns the byte at getPosition if it's valid ASCII (less than 0x80), otherwise returns INVALID_CODE_POINT.
- UTF-8: If getPosition is the start of a UTF-8 code unit the whole unit is decoded and returned. Otherwise INVALID_CODE_POINT is returned.
- UTF-16 (all endian-nesses):
  - If getPosition is at the start of a high surrogate code unit and the following two bytes are a low surrogate code unit, the combined code point is returned.
  - Otherwise the single code unit starting at getPosition is returned directly.
  - UTF-16 has no support for byte-level synchronization, so if getPosition is not aligned with the start of a UTF-16 code unit then the result is undefined.
| Throws | |
|---|---|
| java.lang.IllegalArgumentException | if charset is not supported. Only US_ASCII, UTF-8, UTF-16, UTF-16BE, and UTF-16LE are supported. |
| java.lang.IndexOutOfBoundsException | if |
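The UTF-16 case described for peekCodePoint can be sketched with the standard `java.lang.Character` helpers. This is a minimal illustration for big-endian UTF-16 only; `Utf16PeekSketch` and `peekCodePointUtf16Be` are hypothetical names, and the code is not the library's implementation.

```java
/** Hypothetical sketch of decoding one code point from big-endian UTF-16 bytes. */
final class Utf16PeekSketch {

    /** Decodes the code point starting at offset without advancing any position. */
    static int peekCodePointUtf16Be(byte[] data, int offset) {
        char first = (char) ((data[offset] & 0xFF) << 8 | (data[offset + 1] & 0xFF));
        if (Character.isHighSurrogate(first) && offset + 4 <= data.length) {
            char second = (char) ((data[offset + 2] & 0xFF) << 8 | (data[offset + 3] & 0xFF));
            if (Character.isLowSurrogate(second)) {
                // A valid surrogate pair combines into one supplementary code point.
                return Character.toCodePoint(first, second);
            }
        }
        // Otherwise the single code unit is returned directly.
        return first;
    }

    public static void main(String[] args) {
        byte[] pair = {(byte) 0xD8, 0x3D, (byte) 0xDE, 0x00};
        System.out.println(Integer.toHexString(peekCodePointUtf16Be(pair, 0))); // prints 1f600
    }
}
```

Calling the method at offset 1 of the same array would read misaligned bytes, which is the undefined-result situation the list above warns about.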
readBytes
public void readBytes(ParsableBitArray bitArray, int length)
Reads the next length bytes into bitArray, and resets the position of bitArray to zero.
| Parameters | |
|---|---|
| ParsableBitArray bitArray | The |
| int length | The number of bytes to write. |
readBytes
public void readBytes(ByteBuffer buffer, int length)
Reads the next length bytes into buffer.
| Parameters | |
|---|---|
| ByteBuffer buffer | The |
| int length | The number of bytes to read. |
| See also | |
|---|---|
| put | |
readBytes
public void readBytes(byte[] buffer, int offset, int length)
Reads the next length bytes into buffer at offset.
| Parameters | |
|---|---|
| byte[] buffer | The array into which the read data should be written. |
| int offset | The offset in |
| int length | The number of bytes to read. |
| See also | |
|---|---|
| arraycopy | |
readDelimiterTerminatedString
public @Nullable String readDelimiterTerminatedString(char delimiter)
Reads up to the next delimiter byte (or the limit) as UTF-8 characters.
readLine
public @Nullable String readLine(Charset charset)
Reads a line of text in charset.
A line is considered to be terminated by any one of a carriage return ('\r'), a line feed ('\n'), or a carriage return followed immediately by a line feed ('\r\n'). This method discards leading UTF byte order marks (BOM), if present.
The position is advanced to the start of the next line (i.e. any line terminators are skipped).
| Returns | |
|---|---|
| @Nullable String | The line not including any line-termination characters, or null if the end of the data has already been reached. |
| Throws | |
|---|---|
| java.lang.IllegalArgumentException | if charset is not supported. Only US_ASCII, UTF-8, UTF-16, UTF-16BE, and UTF-16LE are supported. |
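The line-termination rules for readLine can be sketched as follows. This is a simplified illustration for single-byte terminators in UTF-8 data; `LineReaderSketch` is a hypothetical class, and the sketch deliberately omits the BOM handling and the other charsets mentioned above.

```java
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of the "\r", "\n" and "\r\n" terminator handling. */
final class LineReaderSketch {
    private final byte[] data;
    private int position;

    LineReaderSketch(byte[] data) {
        this.data = data;
    }

    /** Returns the next line without its terminator, or null at end of data. */
    String readLine() {
        if (position >= data.length) {
            return null;
        }
        int end = position;
        while (end < data.length && data[end] != '\r' && data[end] != '\n') {
            end++;
        }
        String line = new String(data, position, end - position, StandardCharsets.UTF_8);
        position = end;
        // Skip one terminator: "\r", "\n", or "\r\n" treated as a single unit.
        if (position < data.length && data[position] == '\r') {
            position++;
        }
        if (position < data.length && data[position] == '\n') {
            position++;
        }
        return line;
    }

    public static void main(String[] args) {
        LineReaderSketch r = new LineReaderSketch("a\r\nb\rc".getBytes(StandardCharsets.UTF_8));
        String line;
        while ((line = r.readLine()) != null) {
            System.out.println(line);
        }
    }
}
```

Note that two consecutive line feeds yield an empty line rather than being collapsed, because only one terminator is consumed per call.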
readLittleEndianInt
public int readLittleEndianInt()
Reads the next four bytes as a signed value in little endian order.
readLittleEndianInt24
public int readLittleEndianInt24()
Reads the next three bytes as a signed value in little endian order.
readLittleEndianLong
public long readLittleEndianLong()
Reads the next eight bytes as a signed value in little endian order.
readLittleEndianShort
public short readLittleEndianShort()
Reads the next two bytes as a signed value in little endian order.
readLittleEndianUnsignedInt
public long readLittleEndianUnsignedInt()
Reads the next four bytes as an unsigned value in little endian order.
readLittleEndianUnsignedInt24
public int readLittleEndianUnsignedInt24()
Reads the next three bytes as an unsigned value in little endian order.
readLittleEndianUnsignedIntToInt
public int readLittleEndianUnsignedIntToInt()
Reads the next four bytes as a little endian unsigned integer into an integer, if the top bit is a zero.
| Throws | |
|---|---|
| java.lang.IllegalStateException | Thrown if the top bit of the input data is set. |
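To show why the unsigned four-byte read returns a long while the "ToInt" variant can throw, here is a minimal sketch over a plain byte array. `LittleEndianSketch` and its static methods are hypothetical and only mirror the documented behaviour; they are not the library's code.

```java
/** Hypothetical sketch of little endian unsigned reads and the top-bit check. */
final class LittleEndianSketch {

    /** Reads four bytes at offset, least significant byte first, as an unsigned value. */
    static long readLittleEndianUnsignedInt(byte[] data, int offset) {
        return (data[offset] & 0xFFL)
                | (data[offset + 1] & 0xFFL) << 8
                | (data[offset + 2] & 0xFFL) << 16
                | (data[offset + 3] & 0xFFL) << 24;
    }

    /** Same read, narrowed to int; fails if the value does not fit in 31 bits. */
    static int readLittleEndianUnsignedIntToInt(byte[] data, int offset) {
        long value = readLittleEndianUnsignedInt(data, offset);
        if (value > Integer.MAX_VALUE) {
            throw new IllegalStateException("Top bit not zero: " + value);
        }
        return (int) value;
    }

    public static void main(String[] args) {
        byte[] bytes = {0x78, 0x56, 0x34, 0x12};
        System.out.println(Integer.toHexString(readLittleEndianUnsignedIntToInt(bytes, 0))); // prints 12345678
    }
}
```

An unsigned 32-bit value can be as large as 4294967295, which exceeds `Integer.MAX_VALUE`, so narrowing is only safe when the top bit is zero.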
readLittleEndianUnsignedShort
public int readLittleEndianUnsignedShort()
Reads the next two bytes as an unsigned value in little endian order.
readNullTerminatedString
public @Nullable String readNullTerminatedString()
Reads up to the next NUL byte (or the limit) as UTF-8 characters.
readNullTerminatedString
public String readNullTerminatedString(int length)
Reads the next length bytes as UTF-8 characters. A terminating NUL byte is discarded, if present.
| Parameters | |
|---|---|
| int length | The number of bytes to read. |
| Returns | |
|---|---|
| String | The string, not including any terminating NUL byte. |
readString
public String readString(int length)
Reads the next length bytes as UTF-8 characters.
| Parameters | |
|---|---|
| int length | The number of bytes to read. |
| Returns | |
|---|---|
| String | The string encoded by the bytes. |
readString
public String readString(int length, Charset charset)
Reads the next length bytes as characters in the specified Charset.
| Parameters | |
|---|---|
| int length | The number of bytes to read. |
| Charset charset | The character set of the encoded characters. |
| Returns | |
|---|---|
| String | The string encoded by the bytes in the specified character set. |
readSynchSafeInt
public int readSynchSafeInt()
Reads a Synchsafe integer.
Synchsafe integers keep the highest bit of every byte zeroed. A 32 bit synchsafe integer can store 28 bits of information.
| Returns | |
|---|---|
| int | The parsed value. |
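The 28-bit figure follows from four bytes each contributing seven usable bits. A minimal decoding sketch, assuming the bytes are stored most significant first (as in ID3v2 tag sizes); `SynchsafeSketch` is a hypothetical helper and masks each byte defensively, which the library may or may not do:

```java
/** Hypothetical sketch of decoding a 32-bit synchsafe integer. */
final class SynchsafeSketch {

    /** Combines the low 7 bits of four bytes starting at offset, most significant first. */
    static int decode(byte[] data, int offset) {
        return (data[offset] & 0x7F) << 21
                | (data[offset + 1] & 0x7F) << 14
                | (data[offset + 2] & 0x7F) << 7
                | (data[offset + 3] & 0x7F);
    }

    public static void main(String[] args) {
        System.out.println(decode(new byte[] {0x00, 0x00, 0x02, 0x01}, 0)); // prints 257
    }
}
```

In the example, the third byte contributes 2 × 128 and the fourth contributes 1, giving 257.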
readUnsignedFixedPoint1616
public int readUnsignedFixedPoint1616()
Reads the next four bytes, returning the integer portion of the fixed point 16.16 integer.
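In a 16.16 fixed point value the upper 16 bits hold the integer portion and the lower 16 bits hold the fraction. The sketch below, using the hypothetical class `FixedPointSketch` and assuming big endian byte order as stated for this class, extracts the integer portion and also shows the full value for comparison.

```java
/** Hypothetical sketch of interpreting four bytes as unsigned 16.16 fixed point. */
final class FixedPointSketch {

    /** Returns the integer portion, which is held in the first two (high) bytes. */
    static int integerPart(byte[] data, int offset) {
        return (data[offset] & 0xFF) << 8 | (data[offset + 1] & 0xFF);
    }

    /** Returns the full value, including the fraction, as a double. */
    static double fullValue(byte[] data, int offset) {
        long raw = (data[offset] & 0xFFL) << 24
                | (data[offset + 1] & 0xFFL) << 16
                | (data[offset + 2] & 0xFFL) << 8
                | (data[offset + 3] & 0xFFL);
        return raw / 65536.0;
    }

    public static void main(String[] args) {
        byte[] onePointFive = {0x00, 0x01, (byte) 0x80, 0x00};
        System.out.println(integerPart(onePointFive, 0)); // prints 1
        System.out.println(fullValue(onePointFive, 0));   // prints 1.5
    }
}
```

Returning only the integer portion discards the fraction, so 1.5 and 1.0 are indistinguishable to a caller of such a method.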
readUnsignedIntToInt
public int readUnsignedIntToInt()
Reads the next four bytes as an unsigned integer into an integer, if the top bit is a zero.
| Throws | |
|---|---|
| java.lang.IllegalStateException | Thrown if the top bit of the input data is set. |
readUnsignedLeb128ToInt
public int readUnsignedLeb128ToInt()
Reads an unsigned variable-length LEB128 value into an int.
| Returns | |
|---|---|
| int | integer value |
| Throws | |
|---|---|
| java.lang.IllegalArgumentException | if the read value is greater than |
readUnsignedLeb128ToLong
public long readUnsignedLeb128ToLong()
Reads an unsigned variable-length LEB128 value into a long.
| Returns | |
|---|---|
| long | long value |
| Throws | |
|---|---|
| java.lang.IllegalStateException | if the byte to be read is over the limit of the parsable byte array |
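For readers unfamiliar with LEB128, the following sketch decodes an unsigned value: each byte carries seven payload bits, least significant group first, and the top bit signals that another byte follows. `Leb128Sketch` is a hypothetical helper; its nine-byte cap and its exception messages are assumptions of this example, not documented library behaviour.

```java
/** Hypothetical sketch of unsigned LEB128 decoding. */
final class Leb128Sketch {

    /** Decodes an unsigned LEB128 value starting at offset. */
    static long decodeUnsigned(byte[] data, int offset) {
        long value = 0;
        // Nine groups of seven bits cover 63 bits, the non-negative range of a long.
        for (int i = 0; i < 9; i++) {
            if (offset + i >= data.length) {
                throw new IllegalStateException("Ran out of data while decoding");
            }
            int b = data[offset + i] & 0xFF;
            value |= (long) (b & 0x7F) << (7 * i);
            if ((b & 0x80) == 0) {
                return value; // Continuation bit clear: this was the last byte.
            }
        }
        throw new IllegalStateException("Value does not fit in 63 bits");
    }

    public static void main(String[] args) {
        byte[] encoded = {(byte) 0xE5, (byte) 0x8E, 0x26};
        System.out.println(decodeUnsigned(encoded, 0)); // prints 624485
    }
}
```

In the example, the three payload groups are 101, 14 and 38, giving 101 + 14 × 128 + 38 × 16384 = 624485.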
readUnsignedLongToLong
public long readUnsignedLongToLong()
Reads the next eight bytes as an unsigned long into a long, if the top bit is a zero.
| Throws | |
|---|---|
| java.lang.IllegalStateException | Thrown if the top bit of the input data is set. |
readUtf8EncodedLong
public long readUtf8EncodedLong()
Reads a long value encoded by UTF-8 encoding.
| Returns | |
|---|---|
| long | Decoded long value |
| Throws | |
|---|---|
| java.lang.NumberFormatException | if there is a problem with decoding |
readUtfCharsetFromBom
public @Nullable Charset readUtfCharsetFromBom()
Reads a UTF byte order mark (BOM) and returns the UTF Charset it represents. Returns null without advancing position if no BOM is found.
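BOM detection can be sketched using the well-known byte sequences EF BB BF (UTF-8), FE FF (UTF-16BE) and FF FE (UTF-16LE). `BomSketch` is a hypothetical helper that inspects only the start of an array; unlike the method documented above, it tracks no read position and does not consider UTF-32.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch of mapping a leading byte order mark to a charset. */
final class BomSketch {

    /** Returns the charset indicated by a BOM at the start of data, or null if none. */
    static Charset charsetFromBom(byte[] data) {
        if (data.length >= 3
                && (data[0] & 0xFF) == 0xEF
                && (data[1] & 0xFF) == 0xBB
                && (data[2] & 0xFF) == 0xBF) {
            return StandardCharsets.UTF_8;
        }
        if (data.length >= 2) {
            int first = data[0] & 0xFF;
            int second = data[1] & 0xFF;
            if (first == 0xFE && second == 0xFF) {
                return StandardCharsets.UTF_16BE;
            }
            if (first == 0xFF && second == 0xFE) {
                return StandardCharsets.UTF_16LE;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(charsetFromBom(utf8WithBom)); // prints UTF-8
    }
}
```

A caller that finds a BOM would typically skip its length (three bytes for UTF-8, two for UTF-16) before decoding text, and leave the position untouched when the result is null.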
reset
public void reset(byte[] data)
Updates the instance to wrap data, and resets the position to zero and the limit to data.length.
| Parameters | |
|---|---|
| byte[] data | The array to wrap. |
reset
public void reset(int limit)
Resets the position to zero and the limit to the specified value. This might replace or wipe the underlying array, potentially invalidating any local references.
| Parameters | |
|---|---|
| int limit | The limit to set. |
reset
public void reset(byte[] data, int limit)
Updates the instance to wrap data, and resets the position to zero.
| Parameters | |
|---|---|
| byte[] data | The array to wrap. |
| int limit | The limit to set. |
setPosition
public void setPosition(int position)
Sets the reading offset in the array.
| Parameters | |
|---|---|
| int position | Byte offset in the array from which to read. |
| Throws | |
|---|---|
| java.lang.IllegalArgumentException | Thrown if the new position is neither in nor at the end of the array. |
setShouldEnforceLimitOnLegacyMethods
@VisibleForTesting
public static void setShouldEnforceLimitOnLegacyMethods(boolean enforceLimit)
Sets whether all read/peek methods should enforce that getPosition never exceeds limit.
Setting this to true in tests can help catch cases of accidentally reading beyond limit but still within the bounds of the underlying getData.
Some (newer) methods will always enforce the invariant, even when this is set to false.
Defaults to false (this may change in a later release).
skipBytes
public void skipBytes(int bytes)
Moves the reading offset by bytes.
| Parameters | |
|---|---|
| int bytes | The number of bytes to skip. |
| Throws | |
|---|---|
| java.lang.IllegalArgumentException | Thrown if the new position is neither in nor at the end of the array. |