com.isogen.i18nsupport
Class I18nUtil

java.lang.Object
  extended by com.isogen.i18nsupport.I18nUtil

public class I18nUtil
extends java.lang.Object

Version:
$Revision: 1.5 $

Provides utility functions needed by the other i18n support classes. If the documents to be processed use an attribute other than xml:lang to indicate their national language, set the System property "com.isogen.i18n.langAttName" to the name of that attribute. This approach assumes that all elements in a given run use the same attribute name, which is almost always the case.
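
A minimal sketch of how such a property lookup could work. The class and method names are hypothetical, and falling back to xml:lang when the property is unset is an assumption drawn from the description, not confirmed library behavior.

```java
// Hypothetical sketch; not the library's actual source.
class LangAttNameSketch {
    // Property name as given in the class description.
    static final String PROP = "com.isogen.i18n.langAttName";

    // Assumption: xml:lang is the fallback when the property is not set.
    static String langAttName() {
        return System.getProperty(PROP, "xml:lang");
    }

    public static void main(String[] args) {
        System.out.println(langAttName());
        System.setProperty(PROP, "lang");
        System.out.println(langAttName());
    }
}
```

The property can also be supplied when the JVM starts, for example with -Dcom.isogen.i18n.langAttName=lang on the java command line.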

Constructor Summary
I18nUtil()
           
 
Method Summary
static java.lang.String byteToHex(byte b)
          Converts a byte to the string representation of its hex value.
static java.lang.String charToHex(char c)
          Returns the hex String representation of char c, that is, the hex digits of the character's Unicode code point.
static java.lang.String echoEndTag(org.w3c.dom.Element elem)
          Given an element, returns the end tag as a string.
static java.lang.String echoStartTag(org.w3c.dom.Element elem)
          Given an element, returns the start tag as a string, including any attributes.
static java.lang.String escapeUnicodeString(java.lang.String inString)
          Given a string containing non-ASCII Unicode characters, returns the same string with all non-ASCII characters replaced with "\\uxxxx" reflecting their Unicode code points.
static org.w3c.dom.Element getAttHolder(org.w3c.dom.Element startNode, java.lang.String attName)
          Returns the element that exhibits the specified attribute, walking up the element hierarchy.
static org.w3c.dom.Element getElement(org.w3c.dom.Element parentElem, java.lang.String tagName)
          Returns the element with the specified tag name.
static java.lang.String getElementContent(org.w3c.dom.Element elem)
          Returns the string content of an element (comparable to the result of xsl:value-of in XSLT).
static java.lang.String getElementContentNormalized(org.w3c.dom.Element elem)
          Returns the string content of an element with newlines normalized to single space characters.
static java.lang.String getElementLanguage(org.w3c.dom.Element elemNode, java.lang.String defaultLangCode)
          Returns the language code associated with the specified element.
static org.w3c.dom.Element getFirstElementChild(org.w3c.dom.Element elemNode)
          Returns the first element node within the children of the specified element.
static int getIntForHexChar(char hexChar)
          Returns the int value of a character that is a hex digit.
static java.lang.String getLangAttName()
          Returns the value of the langAttName property.
static java.util.Locale getLocaleFromLangCode(java.lang.String langCode)
          Given a "language" code consisting of an ISO 639 two-character language code and, optionally, an ISO 3166 country code, separated by a hyphen (e.g., "ar", "zh-CN"), returns the built-in (to Java) Locale with the matching language and country code.
static boolean hasElementChildren(org.w3c.dom.Element elemNode)
          Returns true if the input element has element children.
static byte[] hexToBytes(java.lang.String hexString)
          Given a hex string ("A012EBCD"), returns the bytes it represents.
static void main(java.lang.String[] args)
           
static java.lang.String readUnicodeFile(java.io.File file, java.lang.String encoding)
          Reads the specified file as a Unicode string in the specified encoding.
static java.lang.String readUnicodeFile(java.lang.String filePath, java.lang.String encoding)
          Reads the file at the specified path as a Unicode string in the specified encoding.
static java.lang.String readUnicodeFile(java.net.URL fileUrl, java.lang.String encoding)
          Given the URL to a file in the specified encoding, returns a single string with the contents of that file.
static java.lang.String readUnicodeStream(java.io.InputStream is, java.lang.String encoding)
          Reads an InputStream as a Unicode string in the specified encoding.
static java.lang.String stripAngleBrackets(java.lang.String toStrip)
          Removes leading and trailing angle brackets from a string.
static void writeCollationRulesForLocale(java.util.Locale locale, java.lang.String outFilePath)
          Given a Java Locale object, constructs a RuleBasedCollator for the Locale, gets the collation rules, and writes them to a file.
static void writeIcuCollationRulesForLocale(java.util.Locale locale, java.lang.String outFilePath)
          Given a Java Locale object, constructs a RuleBasedCollator for the Locale (judging by the method name, the ICU implementation rather than the java.text one), gets the collation rules, and writes them to a file.
static void writeUnicodeFile(java.lang.String outString, java.lang.String filePath, java.lang.String encoding)
          Writes a string to a file in the specified encoding.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

I18nUtil

public I18nUtil()
Method Detail

stripAngleBrackets

public static java.lang.String stripAngleBrackets(java.lang.String toStrip)
Removes leading and trailing angle brackets from a string.

Parameters:
toStrip - The string to be stripped.
Returns:
The string with the angle brackets removed.
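
As an illustration, the stripping could be sketched as follows. The class name is hypothetical, and the sketch assumes that at most one leading '<' and one trailing '>' are removed; the description does not say how repeated brackets are treated.

```java
// Hypothetical sketch; not the library's actual source.
class StripSketch {
    // Removes one leading '<' and one trailing '>' if they are present.
    static String stripAngleBrackets(String toStrip) {
        int start = toStrip.startsWith("<") ? 1 : 0;
        int end = toStrip.length();
        if (end > start && toStrip.endsWith(">")) {
            end--;
        }
        return toStrip.substring(start, end);
    }

    public static void main(String[] args) {
        System.out.println(stripAngleBrackets("<para>")); // prints "para"
    }
}
```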

hexToBytes

public static byte[] hexToBytes(java.lang.String hexString)
Given a hex string ("A012EBCD"), returns the bytes it represents.

Parameters:
hexString - A sequence of hex digit pairs.
Returns:
An array of the bytes specified by the hex string.
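
The conversion from hex digit pairs to bytes might look like the sketch below. The class name and the hexDigitValue helper are hypothetical, and rejecting odd-length or non-hex input with an IllegalArgumentException is an assumption; the documentation does not state how invalid input is handled.

```java
import java.util.Arrays;

// Hypothetical sketch; not the library's actual source.
class HexToBytesSketch {
    // Value of a single hex digit (0-9, A-F, a-f).
    static int hexDigitValue(char c) {
        int value = Character.digit(c, 16);
        if (value < 0) {
            throw new IllegalArgumentException("Not a hex digit: " + c);
        }
        return value;
    }

    // Each pair of hex digits becomes one byte: high nibble first.
    static byte[] hexToBytes(String hexString) {
        if (hexString.length() % 2 != 0) {
            throw new IllegalArgumentException("Odd number of hex digits: " + hexString);
        }
        byte[] out = new byte[hexString.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = hexDigitValue(hexString.charAt(2 * i));
            int lo = hexDigitValue(hexString.charAt(2 * i + 1));
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    public static void main(String[] args) {
        // Java bytes are signed, so 0xA0 prints as -96.
        System.out.println(Arrays.toString(hexToBytes("A012EBCD"))); // [-96, 18, -21, -51]
    }
}
```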

echoStartTag

public static java.lang.String echoStartTag(org.w3c.dom.Element elem)
Given an element, returns the start tag as a string, including any attributes. Used to generate markup strings from elements. NOTE: This implementation does not account for '"' characters within attribute values.

Parameters:
elem - The element node to be echoed.
Returns:
XML start tag.

echoEndTag

public static java.lang.String echoEndTag(org.w3c.dom.Element elem)
Given an element, returns the end tag as a string. Used to generate markup strings from elements.

Parameters:
elem - The element node to be echoed.
Returns:
XML end tag.
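
To make the tag echoing concrete, here is a rough sketch built only on the standard org.w3c.dom API. The class name and the sampleElement helper are hypothetical. As in the note on echoStartTag, attribute values are copied verbatim, so a '"' inside a value would yield malformed markup.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Attr;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NamedNodeMap;

// Hypothetical sketch; not the library's actual source.
class EchoTagSketch {
    static String echoStartTag(Element elem) {
        StringBuilder sb = new StringBuilder("<").append(elem.getTagName());
        NamedNodeMap atts = elem.getAttributes();
        for (int i = 0; i < atts.getLength(); i++) {
            Attr att = (Attr) atts.item(i);
            // Values are not escaped, mirroring the documented limitation.
            sb.append(' ').append(att.getName())
              .append("=\"").append(att.getValue()).append('"');
        }
        return sb.append('>').toString();
    }

    static String echoEndTag(Element elem) {
        return "</" + elem.getTagName() + ">";
    }

    // Builds a small element with a single attribute for the demo.
    static Element sampleElement() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            p.setAttribute("xml:lang", "fr");
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = sampleElement();
        System.out.println(echoStartTag(p) + echoEndTag(p)); // <p xml:lang="fr"></p>
    }
}
```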

getAttHolder

public static org.w3c.dom.Element getAttHolder(org.w3c.dom.Element startNode,
                                               java.lang.String attName)
                                        throws I18nServiceError
Returns the element that exhibits the specified attribute, walking up the element hierarchy.

Parameters:
startNode - The node to check first. Its ancestors will be interrogated until the attribute is found or the root is reached.
attName - The name of the attribute to find.
Returns:
The element, or null if not found.
Throws:
I18nServiceError
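
The ancestor walk described above could be sketched like this with the standard DOM API. The class name and the firstByTag helper are hypothetical, and the sketch does not model the conditions under which the real method throws I18nServiceError, since those are not documented here.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical sketch; not the library's actual source.
class AttHolderSketch {
    // Checks startNode first, then each ancestor element in turn.
    static Element getAttHolder(Element startNode, String attName) {
        Node node = startNode;
        while (node instanceof Element) {
            Element elem = (Element) node;
            if (elem.hasAttribute(attName)) {
                return elem;
            }
            // The root's parent is the Document, which ends the loop.
            node = elem.getParentNode();
        }
        return null;
    }

    // Demo helper: parses the XML and returns the first element with tagName.
    static Element firstByTag(String xml, String tagName) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return (Element) doc.getElementsByTagName(tagName).item(0);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = firstByTag("<doc xml:lang=\"de\"><sec><p>Hallo</p></sec></doc>", "p");
        System.out.println(getAttHolder(p, "xml:lang").getTagName()); // doc
    }
}
```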

getFirstElementChild

public static org.w3c.dom.Element getFirstElementChild(org.w3c.dom.Element elemNode)
Returns the first element node within the children of the specified element.

Parameters:
elemNode - The element whose first element child is to be returned.

hasElementChildren

public static boolean hasElementChildren(org.w3c.dom.Element elemNode)
Returns true if the input element has element children.

Parameters:
elemNode - The element to check for element children.

getElementContent

public static java.lang.String getElementContent(org.w3c.dom.Element elem)
Returns the string content of an element (comparable to the result of xsl:value-of in XSLT).

Parameters:
elem - Element to get the value of.

getElementContentNormalized

public static java.lang.String getElementContentNormalized(org.w3c.dom.Element elem)
Returns the string content of an element with newlines normalized to single space characters.

Parameters:
elem - Element to get the value of.
Returns:
The normalized string value of the element.
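
One plausible reading of the two content methods, sketched with the DOM Level 3 getTextContent() call. The class name and the sample helper are hypothetical. Treating a run of consecutive line breaks (including "\r\n") as a single space is an assumption; the description only says newlines become single spaces.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical sketch; not the library's actual source.
class ContentSketch {
    // Concatenated text of the element and its descendants.
    static String getElementContent(Element elem) {
        return elem.getTextContent();
    }

    // Assumption: each run of CR/LF characters collapses to one space.
    static String getElementContentNormalized(Element elem) {
        return elem.getTextContent().replaceAll("[\\r\\n]+", " ");
    }

    // Demo helper: a <p> element holding the given text.
    static Element sample(String text) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element p = doc.createElement("p");
            p.setTextContent(text);
            return p;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element p = sample("one\ntwo\r\nthree");
        System.out.println(getElementContentNormalized(p)); // one two three
    }
}
```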

getElementLanguage

public static java.lang.String getElementLanguage(org.w3c.dom.Element elemNode,
                                                  java.lang.String defaultLangCode)
Returns the language code associated with the specified element.

Parameters:
elemNode - The element whose language value is to be returned.
defaultLangCode - The default language code to return if there is no explicit language code.

getElement

public static org.w3c.dom.Element getElement(org.w3c.dom.Element parentElem,
                                             java.lang.String tagName)
                                      throws I18nUtilError
Returns the element with the specified tag name. Throws an I18nUtilError if no such element is found or if more than one is found.

Throws:
I18nUtilError

getLocaleFromLangCode

public static java.util.Locale getLocaleFromLangCode(java.lang.String langCode)
                                              throws MissingLocaleException
Given a "language" code consisting of an ISO 639 two-character language code and, optionally, an ISO 3166 country code, separated by a hyphen (e.g., "ar", "zh-CN"), returns the built-in (to Java) Locale with the matching language and country code. If there is no such Locale, throws an exception. [The "langCode" is more accurately a locale code, but the name "langCode" is used throughout this library.] This method ensures that the Locale returned is one that is known to your Java installation.

Parameters:
langCode - The language and, optionally, country code for the desired locale.
Returns:
The Locale object for the specified language code.
Throws:
MissingLocaleException - If no matching Locale is available. Note that the set of available locales is a function of how your Java installation is configured.
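
A sketch of this kind of lookup using only java.util.Locale. The class and method names are hypothetical, Locale.forLanguageTag stands in for whatever parsing the library performs, and IllegalArgumentException stands in for MissingLocaleException.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical sketch; not the library's actual source.
class LocaleLookupSketch {
    static Locale localeFor(String langCode) {
        // "zh-CN" yields language "zh" and country "CN"; "ar" has no country.
        Locale candidate = Locale.forLanguageTag(langCode);
        // Accept the candidate only if this Java installation knows it.
        if (!Arrays.asList(Locale.getAvailableLocales()).contains(candidate)) {
            // Stand-in for the library's MissingLocaleException.
            throw new IllegalArgumentException("No installed Locale for: " + langCode);
        }
        return candidate;
    }

    public static void main(String[] args) {
        Locale us = localeFor("en-US");
        System.out.println(us.getLanguage() + " / " + us.getCountry()); // en / US
    }
}
```

Because the result depends on the installed locale data, a code that resolves on one machine may fail on another.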

getIntForHexChar

public static int getIntForHexChar(char hexChar)
Returns the int value of a character that is a hex digit.

Parameters:
hexChar - The character to be processed, one of 0-9, A-F.
Returns:
The int value. E.g., for 'A', returns 10.

escapeUnicodeString

public static java.lang.String escapeUnicodeString(java.lang.String inString)
Given a string containing non-ASCII Unicode characters, returns the same string with all non-ASCII characters replaced with "\\uxxxx" reflecting their Unicode code points. This method is useful for echoing arbitrary Unicode strings to ASCII-only environments or environments where not all characters may be accounted for by the font(s) in use.

Parameters:
inString - String to be processed.
Returns:
Escaped string.
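
The escaping might be sketched as below. The class and method names are hypothetical. The sketch assumes each escape is a single backslash, the letter u, and four uppercase hex digits, and that it works per UTF-16 char, so a character outside the Basic Multilingual Plane comes out as two escapes (one per surrogate). The library's exact output format may differ.

```java
// Hypothetical sketch; not the library's actual source.
class EscapeSketch {
    static String escapeNonAscii(String inString) {
        StringBuilder out = new StringBuilder(inString.length());
        for (int i = 0; i < inString.length(); i++) {
            char c = inString.charAt(i);
            if (c > 0x7F) {
                // Non-ASCII: emit backslash, 'u', and four hex digits.
                out.append(String.format("\\u%04X", (int) c));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // 0xE9 is LATIN SMALL LETTER E WITH ACUTE.
        String word = "caf" + (char) 0xE9;
        System.out.println(escapeNonAscii(word));
    }
}
```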

byteToHex

public static java.lang.String byteToHex(byte b)
Converts a byte to the string representation of its hex value.

Parameters:
b - The byte to process.
Returns:
A string consisting of hex digits.

charToHex

public static java.lang.String charToHex(char c)
Returns the hex String representation of char c, that is, the hex digits of the character's Unicode code point.

Parameters:
c - Character to process.
Returns:
Hex string of the character's code point.
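
For byteToHex and charToHex, a compact sketch based on String.format. The class name is hypothetical, and the zero padding (two digits per byte, four per char) and uppercase digits are assumptions; the documentation does not specify either.

```java
// Hypothetical sketch; not the library's actual source.
class HexFormatSketch {
    // Masking with 0xFF treats the signed byte as an unsigned value 0-255.
    static String byteToHex(byte b) {
        return String.format("%02X", b & 0xFF);
    }

    // A Java char is one UTF-16 code unit, so four hex digits always suffice.
    static String charToHex(char c) {
        return String.format("%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(byteToHex((byte) 0xA0)); // A0
        System.out.println(charToHex('A'));         // 0041
    }
}
```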

readUnicodeFile

public static java.lang.String readUnicodeFile(java.net.URL fileUrl,
                                               java.lang.String encoding)
                                        throws I18nUtilError
Given the URL to a file in the specified encoding, returns a single string with the contents of that file.

Parameters:
fileUrl - The URL of the file.
encoding - The encoding name (e.g., "UTF-8", "UTF-16").
Throws:
I18nUtilError

readUnicodeFile

public static java.lang.String readUnicodeFile(java.lang.String filePath,
                                               java.lang.String encoding)
                                        throws I18nUtilError
Reads the file at the specified path as a Unicode string in the specified encoding.

Parameters:
filePath - Path to file to read.
encoding - Encoding name (e.g., "UTF-16")
Returns:
String containing the file's contents.
Throws:
I18nUtilError

readUnicodeFile

public static java.lang.String readUnicodeFile(java.io.File file,
                                               java.lang.String encoding)
                                        throws I18nUtilError
Reads the specified file as a Unicode string in the specified encoding.

Parameters:
file - File to be read.
encoding - Encoding name (e.g., "UTF-16")
Returns:
String containing the file's contents.
Throws:
I18nUtilError

readUnicodeStream

public static java.lang.String readUnicodeStream(java.io.InputStream is,
                                                 java.lang.String encoding)
                                          throws I18nUtilError
Reads an InputStream as a Unicode string in the specified encoding.

Parameters:
is - InputStream to be read.
encoding - Encoding name (e.g., "UTF-16")
Returns:
String containing the stream's contents.
Throws:
I18nUtilError
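
Reading a stream into a string with an explicit encoding could look like the following sketch. The class name is hypothetical, and UncheckedIOException stands in for I18nUtilError. The file and URL variants could presumably delegate to the same loop after opening a stream, though the documentation does not say so.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; not the library's actual source.
class ReadStreamSketch {
    static String readUnicodeStream(InputStream is, String encoding) {
        // The reader decodes bytes to chars using the named encoding.
        try (Reader reader = new InputStreamReader(is, encoding)) {
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            // Covers unsupported encoding names as well as read failures.
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = "hello".getBytes(StandardCharsets.UTF_16);
        System.out.println(readUnicodeStream(new ByteArrayInputStream(bytes), "UTF-16")); // hello
    }
}
```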

writeUnicodeFile

public static void writeUnicodeFile(java.lang.String outString,
                                    java.lang.String filePath,
                                    java.lang.String encoding)
                             throws I18nUtilError
Writes a string to a file in the specified encoding.

Parameters:
outString - String to be written.
filePath - Path of file to write to.
encoding - Encoding name (e.g., "UTF-16")
Throws:
I18nUtilError
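
The write side can be sketched the same way. The class name and the roundTrip helper are hypothetical, and UncheckedIOException again stands in for I18nUtilError. The helper writes to a temporary file and reads it back to show that the text survives the chosen encoding.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch; not the library's actual source.
class WriteFileSketch {
    static void writeUnicodeFile(String outString, String filePath, String encoding) {
        try (FileOutputStream fos = new FileOutputStream(filePath);
             Writer writer = new OutputStreamWriter(fos, encoding)) {
            writer.write(outString);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Demo helper: writes text to a temp file and decodes it again.
    static String roundTrip(String text, String encoding) {
        try {
            Path tmp = Files.createTempFile("i18n-sketch", ".txt");
            try {
                writeUnicodeFile(text, tmp.toString(), encoding);
                return new String(Files.readAllBytes(tmp), Charset.forName(encoding));
            } finally {
                Files.deleteIfExists(tmp);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello", "UTF-16")); // hello
    }
}
```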

writeCollationRulesForLocale

public static void writeCollationRulesForLocale(java.util.Locale locale,
                                                java.lang.String outFilePath)
                                         throws I18nUtilError
Given a Java Locale object, constructs a RuleBasedCollator for the Locale, gets the collation rules, and writes them to a file.

Parameters:
locale - The Locale to get the rules for.
outFilePath - File to write the rules to.
Throws:
I18nUtilError
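
Obtaining the rules with the JDK's own java.text classes might look like this sketch. The class and method names are hypothetical, and writing the rules as UTF-8 is an assumption, since the documentation does not name an output encoding.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.Locale;

// Hypothetical sketch; not the library's actual source.
class CollationRulesSketch {
    // Returns the rule string of the Locale's collator.
    static String collationRules(Locale locale) {
        Collator collator = Collator.getInstance(locale);
        if (!(collator instanceof RuleBasedCollator)) {
            throw new IllegalStateException("Collator is not rule based: " + locale);
        }
        return ((RuleBasedCollator) collator).getRules();
    }

    // Assumption: the rules are written as UTF-8 text.
    static void writeRules(Locale locale, String outFilePath) {
        try {
            Files.write(Paths.get(outFilePath),
                    collationRules(locale).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(collationRules(Locale.ENGLISH).length() + " chars of rules");
    }
}
```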

writeIcuCollationRulesForLocale

public static void writeIcuCollationRulesForLocale(java.util.Locale locale,
                                                   java.lang.String outFilePath)
                                            throws I18nUtilError
Given a Java Locale object, constructs a RuleBasedCollator for the Locale (judging by the method name, the ICU implementation rather than the java.text one), gets the collation rules, and writes them to a file.

Parameters:
locale - The Locale to get the rules for.
outFilePath - File to write the rules to.
Throws:
I18nUtilError

getLangAttName

public static java.lang.String getLangAttName()
Returns the value of the langAttName property.

Returns:
language attribute name.

main

public static void main(java.lang.String[] args)