A decoder that converts a byte sequence in a specific charset into a sequence
of 16-bit Unicode characters.
Two kinds of decoding error are common. A malformed-input error is reported
when the input byte sequence is illegal for the charset in use. An
unmappable-character error is reported when a legal input byte sequence
cannot be mapped to an equivalent Unicode character.
Both errors can be handled in three ways. The default is to report the
error to the invoker through a CoderResult instance; the
alternatives are to ignore the error or to replace the erroneous input with
the replacement string. The replacement string is "\uFFFD" by default and can
be changed by invoking the replaceWith method. The
invoker of this decoder chooses how each error type is handled by passing a
CodingErrorAction instance to the
onMalformedInput and
onUnmappableCharacter
methods.
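The three error actions can be compared side by side. The following is a minimal sketch, assuming a standard JDK UTF-8 decoder; the class ErrorActionDemo and the helper decodeWith are hypothetical names introduced only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ErrorActionDemo {
    // Hypothetical helper: decodes bytes as UTF-8, applying the same action
    // to both malformed-input and unmappable-character errors.
    static String decodeWith(byte[] bytes, CodingErrorAction action) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(action)
                .onUnmappableCharacter(action);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // With REPORT, the one-argument decode method surfaces the error
            // result as an exception.
            return "error: " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        byte[] bad = {0x61, (byte) 0xFF, 0x62}; // 'a', a byte that is illegal in UTF-8, 'b'
        System.out.println(decodeWith(bad, CodingErrorAction.IGNORE));
        System.out.println(decodeWith(bad, CodingErrorAction.REPLACE));
        System.out.println(decodeWith(bad, CodingErrorAction.REPORT));
    }
}
```

IGNORE drops the illegal byte, REPLACE substitutes the replacement string for it, and REPORT stops decoding at the error.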
This abstract class encapsulates the operations that are common to the
decoding process for all charsets. A decoder for a specific charset should
extend this class and need only implement the
decodeLoop method for the basic
decoding. A subclass that maintains internal state should also override the
implFlush method and the
implReset method.
This class is not thread-safe.
Method from java.nio.charset.CharsetDecoder Detail: |
public final float averageCharsPerByte() {
    return averChars;
}
Gets the average number of characters created by this decoder for a
single input byte. |
public final Charset charset() {
    return cs;
}
Gets the Charset which this decoder uses. |
public final CharBuffer decode(ByteBuffer in) throws CharacterCodingException {
    reset();
    int length = (int) (in.remaining() * averChars);
    CharBuffer output = CharBuffer.allocate(length);
    CoderResult result = null;
    while (true) {
        result = decode(in, output, false);
        checkCoderResult(result);
        if (result.isUnderflow()) {
            break;
        } else if (result.isOverflow()) {
            output = allocateMore(output);
        }
    }
    result = decode(in, output, true);
    checkCoderResult(result);
    while (true) {
        result = flush(output);
        checkCoderResult(result);
        if (result.isOverflow()) {
            output = allocateMore(output);
        } else {
            break;
        }
    }
    output.flip();
    status = FLUSH;
    return output;
}
This is a facade method for the decoding operation.
It decodes the remaining bytes of the given byte buffer into a newly
allocated character buffer, performing a complete decoding operation: it
first resets this decoder, then decodes, and finally flushes.
This method should not be invoked while another decode operation
is ongoing. |
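Because the facade method resets the decoder before decoding, one decoder can be reused for several independent inputs. The following is a minimal sketch, assuming a standard JDK UTF-8 decoder; FacadeDecodeDemo and decodeEach are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class FacadeDecodeDemo {
    // Hypothetical helper: decodes each input independently with one shared decoder.
    static String[] decodeEach(byte[][] inputs) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        String[] results = new String[inputs.length];
        for (int i = 0; i < inputs.length; i++) {
            ByteBuffer in = ByteBuffer.wrap(inputs[i]);
            try {
                // No explicit reset() between calls: the facade resets first.
                CharBuffer out = decoder.decode(in);
                // The returned buffer is already flipped, so it is ready to read.
                results[i] = out.toString();
            } catch (CharacterCodingException e) {
                // Under the default REPORT action, bad input ends up here.
                results[i] = null;
            }
        }
        return results;
    }

    public static void main(String[] args) {
        byte[][] inputs = {{0x68, 0x69}, {(byte) 0xC3, (byte) 0xA9}};
        for (String s : decodeEach(inputs)) {
            System.out.println(s);
        }
    }
}
```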
public final CoderResult decode(ByteBuffer in,
        CharBuffer out,
        boolean endOfInput) {
    /*
     * status check
     */
    if ((status == FLUSH) || (!endOfInput && status == END)) {
        throw new IllegalStateException();
    }
    CoderResult result = null;
    // begin to decode
    while (true) {
        CodingErrorAction action = null;
        try {
            result = decodeLoop(in, out);
        } catch (BufferOverflowException ex) {
            // unexpected exception
            throw new CoderMalfunctionError(ex);
        } catch (BufferUnderflowException ex) {
            // unexpected exception
            throw new CoderMalfunctionError(ex);
        }
        /*
         * result handling
         */
        if (result.isUnderflow()) {
            int remaining = in.remaining();
            status = endOfInput ? END : ONGOING;
            if (endOfInput && remaining > 0) {
                result = CoderResult.malformedForLength(remaining);
            } else {
                return result;
            }
        }
        if (result.isOverflow()) {
            return result;
        }
        // set coding error handle action
        action = malformAction;
        if (result.isUnmappable()) {
            action = unmapAction;
        }
        // If the action is IGNORE or REPLACE, we should continue decoding.
        if (action == CodingErrorAction.REPLACE) {
            if (out.remaining() < replace.length()) {
                return CoderResult.OVERFLOW;
            }
            out.put(replace);
        } else if (action != CodingErrorAction.IGNORE) {
            return result;
        }
        in.position(in.position() + result.length());
    }
}
Decodes bytes starting at the current position of the given input buffer,
and writes the equivalent character sequence into the given output buffer
starting at its current position.
The positions of both buffers advance as bytes are read and characters are
written, but their limits and marks remain unchanged.
A CoderResult instance is returned according to the
following rules:
- CoderResult.OVERFLOW indicates that
the output buffer has reached its capacity even though not all of the
input has been processed. When this result is returned, this method
should be called again with an
out argument that has space remaining.
- CoderResult.UNDERFLOW indicates that
as many bytes as possible in the input buffer have been decoded. If there
is no further input and no bytes remain in the input buffer, the
operation may be regarded as complete. Otherwise, this method should be
called again with additional input.
- A malformed-input result
indicates that malformed input has been encountered. The
erroneous bytes start at the input buffer's position, and their number is
given by the result's length. This kind of
result is returned only if the malformed-input action is
CodingErrorAction.REPORT.
- An unmappable-character
result indicates that an unmappable character has been
encountered. The erroneous bytes start at the input buffer's position,
and their number is given by the result's
length. This kind of result is returned
only if the unmappable-character action is
CodingErrorAction.REPORT.
The endOfInput parameter indicates that the invoker cannot
provide further input. It should be true if and only if the bytes in the
current input buffer are all of the remaining input for this decoding
operation. Passing false and then having no more input to provide is common
and does not cause an error. Passing true in several consecutive
invocations may cause an error, because any bytes left undecoded in the
input buffer are then treated as malformed input.
This method invokes the
decodeLoop method to
implement the basic decode logic for a specific charset. |
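The rules above can be sketched as a streaming loop that handles both OVERFLOW and UNDERFLOW and passes endOfInput as true only once. This is a minimal sketch, assuming a standard JDK UTF-8 decoder; ChunkedDecodeDemo, decodeChunks, decodeStep and drain are hypothetical names, and the four-character output buffer is deliberately tiny so that OVERFLOW actually occurs.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.StandardCharsets;

public class ChunkedDecodeDemo {
    // Hypothetical helper: decodes the chunks as one logical UTF-8 stream.
    static String decodeChunks(byte[][] chunks) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder(); // REPORT by default
        StringBuilder text = new StringBuilder();
        CharBuffer out = CharBuffer.allocate(4); // deliberately tiny, to force OVERFLOW
        ByteBuffer in = ByteBuffer.allocate(0);
        for (byte[] chunk : chunks) {
            // Carry over bytes that the previous call could not decode yet,
            // such as the first half of a multi-byte character.
            ByteBuffer next = ByteBuffer.allocate(in.remaining() + chunk.length);
            next.put(in).put(chunk);
            next.flip();
            in = next;
            decodeStep(decoder, in, out, text, false); // more input may follow
        }
        decodeStep(decoder, in, out, text, true); // no more input will follow
        while (decoder.flush(out).isOverflow()) {
            drain(out, text);
        }
        drain(out, text);
        return text.toString();
    }

    // Calls decode until the available input has been consumed.
    static void decodeStep(CharsetDecoder decoder, ByteBuffer in, CharBuffer out,
                           StringBuilder text, boolean endOfInput) {
        while (true) {
            CoderResult result = decoder.decode(in, out, endOfInput);
            if (result.isUnderflow()) {
                return;
            }
            if (result.isOverflow()) {
                drain(out, text); // make room, then call decode again
            } else {
                throw new IllegalArgumentException("decoding failed: " + result);
            }
        }
    }

    // Moves the decoded characters into the builder and empties the buffer.
    static void drain(CharBuffer out, StringBuilder text) {
        out.flip();
        text.append(out);
        out.clear();
    }

    public static void main(String[] args) {
        // The two-byte encoding of U+00E9 is split across the chunks.
        byte[][] chunks = {{0x68, (byte) 0xC3}, {(byte) 0xA9, 0x6C, 0x6C, 0x6F, 0x21}};
        System.out.println(decodeChunks(chunks));
    }
}
```

If the final chunk ended in the middle of a multi-byte character, the last decodeStep call would receive a malformed-input result and, in this sketch, throw.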
abstract protected CoderResult decodeLoop(ByteBuffer in,
CharBuffer out)
Decodes bytes into characters. This method is called by the
decode method.
This method implements the essential decoding operation. It does not
stop decoding until all the input bytes have been read, the output
buffer is full, or an error is encountered. It then returns a
CoderResult object indicating the result of the current
decoding operation. The rules for constructing the CoderResult
are the same as for
decode. When an
error is encountered during decoding, most implementations
of this method return a corresponding result object to the
decode method, although some
performance-optimized implementations may handle the error and
apply the error action themselves.
The buffers are scanned from their current positions, and their positions
are advanced accordingly, while their marks and limits remain
unchanged. At most in.remaining() bytes
are read, and at most out.remaining() characters
are written.
Note that some implementations may pre-scan the input buffer and return
CoderResult.UNDERFLOW until they receive sufficient input. |
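The contract described above can be sketched with a deliberately simple subclass. This is a minimal sketch and not a real charset: SevenBitDecoder is a hypothetical decoder that maps bytes 0x00 to 0x7F to the same code point and reports every other byte as malformed. As a shortcut it borrows StandardCharsets.US_ASCII instead of defining its own Charset subclass.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class SevenBitDecoderDemo {
    // Hypothetical decoder used only to illustrate the decodeLoop contract.
    static class SevenBitDecoder extends CharsetDecoder {
        SevenBitDecoder() {
            // Assumption for the sketch: one character per byte on average and at most.
            super(StandardCharsets.US_ASCII, 1.0f, 1.0f);
        }

        @Override
        protected CoderResult decodeLoop(ByteBuffer in, CharBuffer out) {
            while (in.hasRemaining()) {
                if (!out.hasRemaining()) {
                    return CoderResult.OVERFLOW;
                }
                byte b = in.get(in.position()); // peek; the position does not move
                if (b < 0) {
                    // Leave the position at the bad byte and report its length.
                    return CoderResult.malformedForLength(1);
                }
                in.get(); // consume the byte
                out.put((char) b);
            }
            return CoderResult.UNDERFLOW;
        }
    }

    // Hypothetical helper: the base class applies REPLACE around decodeLoop.
    static String decodeReplacing(byte[] bytes) {
        CharsetDecoder decoder = new SevenBitDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            // Not expected here: malformed input is replaced, and this
            // sketch never reports an unmappable character.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(decodeReplacing(new byte[]{0x61, (byte) 0x80, 0x62}));
    }
}
```

The subclass only reports the error; skipping the bad byte and writing the replacement string is done by the decode method of the base class.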
public Charset detectedCharset() {
    throw new UnsupportedOperationException();
}
Gets the charset detected by this decoder; this method is optional.
If this decoder implements an auto-detecting charset, this method returns
the detected charset once it is available. The returned
charset stays the same for the rest of the decode operation.
If insufficient bytes have been read to determine the charset, an
IllegalStateException is thrown.
The default implementation always throws
UnsupportedOperationException, so it should be overridden
by a subclass if needed. |
public final CoderResult flush(CharBuffer out) {
    if (status != END && status != INIT) {
        throw new IllegalStateException();
    }
    CoderResult result = implFlush(out);
    if (result == CoderResult.UNDERFLOW) {
        status = FLUSH;
    }
    return result;
}
Flushes this decoder.
This method calls implFlush. Some
decoders may need to write some characters to the output buffer after they
have read all input bytes; subclasses can override
implFlush to perform this writing.
At most out.remaining() characters are written. If a decoder needs to
write more characters than the output buffer's remaining space allows,
CoderResult.OVERFLOW is returned, and this method
must be called again with a character buffer that has more remaining
space. Otherwise this method returns
CoderResult.UNDERFLOW, which means that the decoding process
has completed successfully.
During the flush, the output buffer's position advances
accordingly, while its mark and limit remain unchanged. |
protected CoderResult implFlush(CharBuffer out) {
    return CoderResult.UNDERFLOW;
}
Flushes this decoder. The default implementation does nothing and always
returns CoderResult.UNDERFLOW; this method can be
overridden if needed. |
protected void implOnMalformedInput(CodingErrorAction newAction) {
    // default implementation is empty
}
Notifies this decoder that the CodingErrorAction specified
for malformed-input errors has been changed. The default implementation
does nothing; this method can be overridden if needed. |
protected void implOnUnmappableCharacter(CodingErrorAction newAction) {
    // default implementation is empty
}
Notifies this decoder that the CodingErrorAction specified
for unmappable-character errors has been changed. The default
implementation does nothing; this method can be overridden if needed. |
protected void implReplaceWith(String newReplacement) {
    // default implementation is empty
}
Notifies that this decoder's replacement has been changed. The default
implementation does nothing; this method can be overridden if needed. |
protected void implReset() {
    // default implementation is empty
}
Resets this decoder's charset-related state. The default implementation
does nothing; this method can be overridden if needed. |
public boolean isAutoDetecting() {
    return false;
}
Indicates whether this decoder implements an auto-detecting charset. |
public boolean isCharsetDetected() {
    throw new UnsupportedOperationException();
}
Indicates whether this decoder has detected a charset; this method is
optional.
If this decoder implements an auto-detecting charset, this method
may start to return true during a decoding operation to indicate that a
charset has been detected in the input bytes and can be
retrieved by invoking the detectedCharset
method.
Note that a decoder that implements an auto-detecting charset may still
succeed in decoding a portion of the given input even when it is unable
to detect the charset. A
false return value therefore does not indicate that no decoding took
place.
The default implementation always throws an
UnsupportedOperationException; it should be overridden by
a subclass if needed. |
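Because the two optional methods throw by default, a caller would typically check isAutoDetecting first. The following is a minimal sketch of that guard order; AutoDetectDemo and describe are hypothetical names, and the UTF-8 decoder in main is only an example of a decoder that is not auto-detecting.

```java
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class AutoDetectDemo {
    // Hypothetical helper: reports the detection state without triggering
    // UnsupportedOperationException or IllegalStateException.
    static String describe(CharsetDecoder decoder) {
        if (!decoder.isAutoDetecting()) {
            // The optional methods may throw UnsupportedOperationException here.
            return "not auto-detecting";
        }
        if (decoder.isCharsetDetected()) {
            // Safe to call only once a charset has actually been detected.
            return "detected " + decoder.detectedCharset().name();
        }
        return "not detected yet";
    }

    public static void main(String[] args) {
        System.out.println(describe(StandardCharsets.UTF_8.newDecoder()));
    }
}
```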
public CodingErrorAction malformedInputAction() {
    return malformAction;
}
Gets the CodingErrorAction that this decoder applies when malformed input
is encountered during the decoding process. |
public final float maxCharsPerByte() {
    return maxChars;
}
Gets the maximum number of characters that can be created by this
decoder for one input byte. The value is always positive. |
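These two ratios are mainly useful for sizing an output buffer. The following is a minimal sketch of that arithmetic, assuming a standard JDK UTF-8 decoder; BufferSizingDemo, expectedChars and worstCaseChars are hypothetical names.

```java
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class BufferSizingDemo {
    // Hypothetical helper: a typical-case estimate, which may prove too small.
    static int expectedChars(CharsetDecoder decoder, int byteCount) {
        return (int) Math.ceil(byteCount * (double) decoder.averageCharsPerByte());
    }

    // Hypothetical helper: an upper bound on the characters produced from
    // byteCount input bytes, before any characters written by flush.
    static int worstCaseChars(CharsetDecoder decoder, int byteCount) {
        return (int) Math.ceil(byteCount * (double) decoder.maxCharsPerByte());
    }

    public static void main(String[] args) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        System.out.println(expectedChars(decoder, 10));
        System.out.println(worstCaseChars(decoder, 10));
    }
}
```

A buffer sized from the average may overflow and need to grow; one sized from the maximum trades memory for fewer OVERFLOW results.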
public final CharsetDecoder onMalformedInput(CodingErrorAction newAction) {
    if (null == newAction) {
        throw new IllegalArgumentException();
    }
    malformAction = newAction;
    implOnMalformedInput(newAction);
    return this;
}
Sets this decoder's action on malformed input errors.
This method will call the
implOnMalformedInput
method with the given new action as argument. |
public final CharsetDecoder onUnmappableCharacter(CodingErrorAction newAction) {
    if (null == newAction) {
        throw new IllegalArgumentException();
    }
    unmapAction = newAction;
    implOnUnmappableCharacter(newAction);
    return this;
}
Sets this decoder's action on unmappable character errors.
This method will call the
implOnUnmappableCharacter
method with the given new action as argument. |
public final CharsetDecoder replaceWith(String newReplacement) {
    if (null == newReplacement || newReplacement.length() == 0) {
        // niochar.06=Replacement string cannot be null or empty.
        throw new IllegalArgumentException(Messages.getString("niochar.06")); //$NON-NLS-1$
    }
    if (newReplacement.length() > maxChars) {
        // niochar.07=Replacement string's length cannot be larger than max
        // characters per byte.
        throw new IllegalArgumentException(Messages.getString("niochar.07")); //$NON-NLS-1$
    }
    replace = newReplacement;
    implReplaceWith(newReplacement);
    return this;
}
Sets the new replacement string.
This method first checks the validity of the given replacement, then
changes the replacement value, and finally calls the
implReplaceWith method with the
new replacement as argument. |
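The validity checks can be observed from the outside. The following is a minimal sketch, assuming a standard JDK UTF-8 decoder, whose maximum of one character per byte limits the replacement to a single character; ReplaceWithDemo, accepts and decodeWithReplacement are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

public class ReplaceWithDemo {
    // Hypothetical helper: true if a fresh UTF-8 decoder accepts the replacement.
    static boolean accepts(String replacement) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        try {
            decoder.replaceWith(replacement);
            return true;
        } catch (IllegalArgumentException e) {
            return false; // null, empty, or longer than maxCharsPerByte()
        }
    }

    // Hypothetical helper: decodes with REPLACE and a custom replacement string.
    static String decodeWithReplacement(byte[] bytes, String replacement) {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPLACE)
                .onUnmappableCharacter(CodingErrorAction.REPLACE)
                .replaceWith(replacement);
        try {
            return decoder.decode(ByteBuffer.wrap(bytes)).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalStateException(e); // not expected when both actions are REPLACE
        }
    }

    public static void main(String[] args) {
        System.out.println(accepts("?"));
        System.out.println(accepts(""));
        System.out.println(accepts("??"));
        System.out.println(decodeWithReplacement(new byte[]{0x61, (byte) 0xFF}, "?"));
    }
}
```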
public final String replacement() {
    return replace;
}
Gets the replacement string, which is never null or empty. |
public final CharsetDecoder reset() {
    status = INIT;
    implReset();
    return this;
}
Resets this decoder. This method resets the internal status and then
calls implReset() to reset any state that is specific to the
charset. |
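The status checks in decode and flush mean that a flushed decoder must be reset before it is used again with the three-argument decode method. The following is a minimal sketch of that lifecycle, assuming a standard JDK UTF-8 decoder; ResetDemo and lifecycle are hypothetical names.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.StandardCharsets;

public class ResetDemo {
    // Hypothetical helper: runs two decoding operations on one decoder and
    // records whether the second one needed a reset.
    static String lifecycle() {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder();
        CharBuffer out = CharBuffer.allocate(8);

        // First operation: decode with endOfInput set, then flush.
        decoder.decode(ByteBuffer.wrap(new byte[]{0x6F, 0x6B}), out, true);
        decoder.flush(out);

        String note;
        try {
            // The decoder has been flushed, so this call is rejected.
            decoder.decode(ByteBuffer.wrap(new byte[]{0x21}), out, false);
            note = "reused without reset";
        } catch (IllegalStateException e) {
            note = "reset required";
        }

        // Second operation: reset first, then decode and flush again.
        decoder.reset();
        decoder.decode(ByteBuffer.wrap(new byte[]{0x21}), out, true);
        decoder.flush(out);
        out.flip();
        return note + ": " + out;
    }

    public static void main(String[] args) {
        System.out.println(lifecycle());
    }
}
```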
public CodingErrorAction unmappableCharacterAction() {
    return unmapAction;
}
Gets the CodingErrorAction that this decoder applies when an unmappable
character is encountered during the decoding process. |