

org.htmlparser.util
Class Translate

java.lang.Object
  extended by org.htmlparser.util.Translate

public class Translate
extends java.lang.Object

Translates numeric character references and character entity references to Unicode characters. The lookup tables are based on those found at http://www.w3.org/TR/REC-html40/sgml/entities.html

Note: Do not edit! This class is created by the Generate class.

Typical usage:

 String s = Translate.decode(getTextFromHtmlPage());
 


Field Summary
protected static java.util.Map charRefTable
          Table mapping character to entity reference kernel.
protected static java.util.Map refChar
          Table mapping entity reference kernel to character.
 
Constructor Summary
private Translate()
          Private constructor.
 
Method Summary
static char convertToChar(java.lang.String string)
          Convert a reference to a Unicode character.
static java.lang.String convertToString(java.lang.Character character)
          Convert a character to a character entity reference.
static java.lang.String convertToString(int character)
          Convert a character to a numeric character reference.
static java.lang.String decode(java.lang.String string)
          Decode a string containing references.
static java.lang.String encode(java.lang.String string)
          Encode a string to use references.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

refChar

protected static java.util.Map refChar
Table mapping entity reference kernel to character.

String->Character


charRefTable

protected static java.util.Map charRefTable
Table mapping character to entity reference kernel.

Character->String
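
The two tables are inverses of each other. A minimal sketch of that arrangement follows, assuming "kernel" means the bare entity name between & and ; (for example amp). The class name, field names, and the five entries are hypothetical stand-ins, not the generated class or its full table.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the two lookup tables; not the generated Translate class.
class EntityTablesSketch {
    // kernel -> character, e.g. "amp" -> '&' (plays the role of refChar)
    static final Map<String, Character> REF_TO_CHAR = new HashMap<>();
    // character -> kernel, e.g. '&' -> "amp" (plays the role of charRefTable)
    static final Map<Character, String> CHAR_TO_REF = new HashMap<>();

    // Register one pair in both directions so the tables stay in sync.
    static void register(String kernel, char c) {
        REF_TO_CHAR.put(kernel, c);
        CHAR_TO_REF.put(c, kernel);
    }

    static {
        // A tiny illustrative subset of the HTML 4 entity list.
        register("amp", '&');
        register("lt", '<');
        register("gt", '>');
        register("nbsp", '\u00a0');
        register("eacute", '\u00e9');
    }
}
```

Filling both maps through one helper keeps each entry consistent in both directions.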

Constructor Detail

Translate

private Translate()
Private constructor. This class is fully static and thread-safe.

Method Detail

convertToChar

public static char convertToChar(java.lang.String string)
Convert a reference to a Unicode character. Converts a single numeric character reference or character entity reference to a Unicode character.
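
To illustrate what converting a single reference involves, here is a rough sketch, not the library's implementation. It assumes the input includes the leading & and trailing ;, accepts decimal and hexadecimal numeric forms, and returns '\0' for anything it does not recognize. The class name, the four-entry table, and the '\0' convention are all assumptions made for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of resolving one reference such as "&amp;" or "&#233;".
class SingleReferenceSketch {
    static final char NOT_FOUND = '\0'; // assumed "unrecognized" marker
    static final Map<String, Character> NAMED = new HashMap<>();

    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("quot", '"');
    }

    static char toChar(String ref) {
        if (ref == null || ref.length() < 3
                || ref.charAt(0) != '&' || ref.charAt(ref.length() - 1) != ';') {
            return NOT_FOUND;
        }
        String body = ref.substring(1, ref.length() - 1); // text between & and ;
        if (body.charAt(0) == '#') {
            // Numeric form: "#233" (decimal) or "#xE9" (hexadecimal).
            boolean hex = body.length() > 1
                    && (body.charAt(1) == 'x' || body.charAt(1) == 'X');
            try {
                int code = Integer.parseInt(body.substring(hex ? 2 : 1), hex ? 16 : 10);
                // A char holds only 16 bits, so larger code points are rejected here.
                return (code >= 0 && code <= 0xFFFF) ? (char) code : NOT_FOUND;
            } catch (NumberFormatException e) {
                return NOT_FOUND;
            }
        }
        Character c = NAMED.get(body); // named form: look up the kernel
        return c == null ? NOT_FOUND : c.charValue();
    }
}
```

Because the return type is char, this shape cannot represent characters outside the Basic Multilingual Plane; the sketch simply rejects them.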


decode

public static java.lang.String decode(java.lang.String string)
Decode a string containing references. Changes all numeric character references and character entity references to Unicode characters.
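
A whole-string decode can be sketched as a single left-to-right pass that replaces each reference it finds. The version below is an illustration under stated assumptions, not the library's code: it uses a regular expression to locate references, a hypothetical four-entry table, and leaves unrecognized references unchanged.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of decoding every reference in a string in one pass.
class DecodeSketch {
    static final Map<String, Character> NAMED = new HashMap<>();

    static {
        NAMED.put("amp", '&');
        NAMED.put("lt", '<');
        NAMED.put("gt", '>');
        NAMED.put("eacute", '\u00e9');
    }

    // Matches &#xHEX; or &#DEC; or &name; and captures the text between & and ;
    static final Pattern REF =
            Pattern.compile("&(#[xX][0-9a-fA-F]+|#[0-9]+|[A-Za-z][A-Za-z0-9]*);");

    static String decode(String s) {
        Matcher m = REF.matcher(s);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String resolved = resolve(m.group(1));
            // Keep the original text when the reference is not recognized.
            String replacement = resolved == null ? m.group() : resolved;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Returns the decoded text for one reference body, or null if unknown.
    static String resolve(String body) {
        if (body.charAt(0) == '#') {
            boolean hex = body.charAt(1) == 'x' || body.charAt(1) == 'X';
            try {
                int code = Integer.parseInt(body.substring(hex ? 2 : 1), hex ? 16 : 10);
                return Character.isValidCodePoint(code)
                        ? new String(Character.toChars(code)) : null;
            } catch (NumberFormatException e) {
                return null; // digits overflowed an int
            }
        }
        Character c = NAMED.get(body);
        return c == null ? null : c.toString();
    }
}
```

Because the scan never revisits its own output, an input such as `&amp;lt;` decodes to `&lt;` rather than being decoded twice.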


convertToString

public static java.lang.String convertToString(java.lang.Character character)
Convert a character to a character entity reference. Converts a Unicode character to a character entity reference of the form &xxxx;.


convertToString

public static java.lang.String convertToString(int character)
Convert a character to a numeric character reference. Converts a Unicode character to a numeric character reference of the form &#xxxx;.
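
The two reference forms produced by the convertToString overloads differ only in what sits between & and ;. A small sketch of both shapes follows. The helper names are hypothetical, and decimal digits are assumed for the numeric form, since the page does not state the radix.

```java
// Hypothetical helpers showing the two reference shapes; not the library's code.
class ReferenceFormatSketch {
    // Named form: "&" + kernel + ";", e.g. "eacute" -> "&eacute;"
    static String namedReference(String kernel) {
        return "&" + kernel + ";";
    }

    // Numeric form, assuming decimal digits: 233 -> "&#233;"
    static String numericReference(int codePoint) {
        return "&#" + codePoint + ";";
    }
}
```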


encode

public static java.lang.String encode(java.lang.String string)
Encode a string to use references. Changes all non-ASCII characters to their numeric character reference or character entity reference. This implementation is inefficient, allocating a new Character for each character in the string, but the class is primarily intended for decoding, so efficiency and speed in encoding were not a priority.
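
The encoding rule described above can be sketched as follows. This is an illustration that follows the description literally, not the library's implementation: ASCII characters pass through unchanged, non-ASCII characters use a named reference when the hypothetical two-entry table has one, and everything else falls back to a decimal numeric reference. Iterating by code point, so that surrogate pairs yield one reference, is a choice made for this sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of encoding non-ASCII characters as references.
class EncodeSketch {
    static final Map<Character, String> CHAR_TO_KERNEL = new HashMap<>();

    static {
        CHAR_TO_KERNEL.put('\u00e9', "eacute");
        CHAR_TO_KERNEL.put('\u00a0', "nbsp");
    }

    static String encode(String s) {
        StringBuilder out = new StringBuilder(s.length() * 2);
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            i += Character.charCount(cp); // advance past one or two chars
            if (cp < 0x80) {
                out.append((char) cp); // ASCII is copied as-is
                continue;
            }
            // Only BMP characters can be keys in this Character-keyed table.
            String kernel = cp <= 0xFFFF ? CHAR_TO_KERNEL.get((char) cp) : null;
            if (kernel != null) {
                out.append('&').append(kernel).append(';');
            } else {
                out.append("&#").append(cp).append(';');
            }
        }
        return out.toString();
    }
}
```

The table lookup boxes each non-ASCII char into a Character, which is the same kind of per-character allocation the description mentions.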