Class XMLUtils

java.lang.Object
org.apache.uima.internal.util.XMLUtils

public abstract class XMLUtils extends Object
Some utilities for working with XML. The class is abstract only to prevent instantiation; all methods are static.
  • Constructor Details

    • XMLUtils

      public XMLUtils()
  • Method Details

    • normalize

      public static void normalize(String aStr, StringBuffer aResultBuf)
      Normalizes the given string for output to XML. This converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. The normalized string is appended to the specified StringBuffer.
      Parameters:
      aStr - input string
      aResultBuf - the StringBuffer to which the normalized string will be appended
    • normalize

      public static void normalize(String aStr, StringBuffer aResultBuf, boolean aNewlinesToSpaces)
      Normalizes the given string for output to XML. This converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. It may also convert newlines to spaces, depending on the aNewlinesToSpaces parameter. The normalized string is appended to the specified StringBuffer.
      Parameters:
      aStr - input string
      aResultBuf - the StringBuffer to which the normalized string will be appended
      aNewlinesToSpaces - iff true, newlines (\r and \n) will be converted to spaces
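The normalization described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name NormalizeSketch is hypothetical, only the three characters named in the description are escaped, and the handling of null input is an assumption.

```java
final class NormalizeSketch {
    /**
     * Appends an XML-escaped copy of aStr to aResultBuf.
     * Hypothetical stand-in for normalize(String, StringBuffer, boolean).
     */
    static void normalize(String aStr, StringBuffer aResultBuf, boolean aNewlinesToSpaces) {
        if (aStr == null) {
            return; // assumption: null input appends nothing
        }
        for (int i = 0; i < aStr.length(); i++) {
            char c = aStr.charAt(i);
            switch (c) {
                case '<':
                    aResultBuf.append("&lt;");
                    break;
                case '>':
                    aResultBuf.append("&gt;");
                    break;
                case '&':
                    aResultBuf.append("&amp;");
                    break;
                case '\r':
                case '\n':
                    // Each of \r and \n becomes one space when requested, else is kept as-is
                    aResultBuf.append(aNewlinesToSpaces ? ' ' : c);
                    break;
                default:
                    aResultBuf.append(c);
            }
        }
    }

    public static void main(String[] args) {
        // Existing buffer content is preserved; the escaped text is appended to it
        StringBuffer buf = new StringBuffer("value=");
        normalize("a < b && c > d\n", buf, true);
        System.out.println(buf);
    }
}
```

Appending to a caller-supplied buffer lets several values be normalized into one output string without intermediate copies.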
    • writeNormalizedString

      public static void writeNormalizedString(String aStr, Writer aWriter, boolean aNewlinesToSpaces) throws IOException
      Normalizes the given string for output to XML, and writes the normalized string to the given Writer. Normalization converts all special characters, e.g. <, >, &, to their XML representations, e.g. &lt;, &gt;, &amp;. It may also convert newlines to spaces, depending on the aNewlinesToSpaces parameter.
      Parameters:
      aStr - input string
      aWriter - a Writer to which the normalized string will be written
      aNewlinesToSpaces - iff true, newlines (\r and \n) will be converted to spaces
      Throws:
      IOException - if an I/O failure occurs when writing to aWriter
    • writePrimitiveValue

      public static void writePrimitiveValue(Object aObj, Writer aWriter) throws IOException
      Writes a standard XML representation of the specified Object, in the form:
      <className>string value</className>

      where className is the object's Java class name without the package and made lowercase, e.g. "string", "integer", "boolean", and string value is the result of Object.toString().

      This is intended to be used for Java Strings and wrappers for primitive value classes (e.g. Integer, Boolean).

      Parameters:
      aObj - the object to write
      aWriter - a Writer to which the XML will be written
      Throws:
      IOException - if an I/O failure occurs when writing to aWriter
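As an illustration of the output form described above, the following sketch derives the tag from the simple class name. The class WritePrimitiveSketch and the helper toXml are hypothetical, and escaping the string value is an assumption that the description does not confirm.

```java
import java.io.IOException;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.util.Locale;

final class WritePrimitiveSketch {
    /** Writes <className>string value</className> for the given object. */
    static void writePrimitiveValue(Object aObj, Writer aWriter) throws IOException {
        // Class name without the package, made lowercase: Integer -> "integer"
        String tag = aObj.getClass().getSimpleName().toLowerCase(Locale.ROOT);
        // Assumption: the value is escaped so the result stays well-formed XML
        String text = aObj.toString()
                .replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;");
        aWriter.write("<" + tag + ">" + text + "</" + tag + ">");
    }

    /** Hypothetical convenience wrapper that returns the XML as a String. */
    static String toXml(Object aObj) {
        StringWriter writer = new StringWriter();
        try {
            writePrimitiveValue(aObj, writer);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return writer.toString();
    }

    public static void main(String[] args) {
        System.out.println(toXml(Integer.valueOf(42)));
        System.out.println(toXml(Boolean.TRUE));
        System.out.println(toXml("a<b"));
    }
}
```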
    • getChildByTagName

      public static Element getChildByTagName(Element aElem, String aName)
      Gets the first child of the given Element with the given tag name.
      Parameters:
      aElem - the parent element
      aName - tag name of the child to retrieve
      Returns:
      the first child of aElem with tag name aName, null if there is no such child.
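A lookup of this kind can be sketched with the standard DOM API. The class ChildLookupSketch and its parse helper are hypothetical, and restricting the search to direct children (rather than all descendants) is an assumption based on the word "child" in the description.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class ChildLookupSketch {
    /** Returns the first direct child element named aName, or null if none exists. */
    static Element getChildByTagName(Element aElem, String aName) {
        NodeList children = aElem.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            // Skip text, comment, and other non-element nodes
            if (node instanceof Element && aName.equals(((Element) node).getTagName())) {
                return (Element) node;
            }
        }
        return null;
    }

    /** Hypothetical helper: parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element root = parse("<a><b>1</b><c/><b>2</b></a>");
        System.out.println(getChildByTagName(root, "b").getTextContent());
        System.out.println(getChildByTagName(root, "z"));
    }
}
```

Unlike Element.getElementsByTagName, which searches all descendants, this sketch never matches a grandchild.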
    • getFirstChildElement

      public static Element getFirstChildElement(Element aElem)
      Gets the first child of the given Element.
      Parameters:
      aElem - the parent element
      Returns:
      the first child of aElem, null if it has no children.
    • readPrimitiveValue

      public static Object readPrimitiveValue(Element aElem)
      Reads a primitive value from its standard DOM representation. (This is the representation produced by writePrimitiveValue(Object, Writer).)

      This is intended to be used for Java Strings and wrappers for primitive value classes (e.g. Integer, Boolean).

      Parameters:
      aElem - the element representing the value
      Returns:
      the value that was read, null if a primitive value could not be constructed from the element
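The reverse mapping might look like the sketch below. The class ReadPrimitiveSketch and its parse helper are hypothetical, and the set of recognized tag names is an assumption limited to the three examples ("string", "integer", "boolean") given for writePrimitiveValue; the actual method may support more types.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

final class ReadPrimitiveSketch {
    /** Builds a value from an element such as <integer>42</integer>; null if not possible. */
    static Object readPrimitiveValue(Element aElem) {
        String text = aElem.getTextContent();
        switch (aElem.getTagName()) {
            case "string":
                return text;
            case "boolean":
                return Boolean.valueOf(text.trim());
            case "integer":
                try {
                    return Integer.valueOf(text.trim());
                } catch (NumberFormatException e) {
                    return null; // text is not a valid integer
                }
            default:
                return null; // tag name not recognized by this sketch
        }
    }

    /** Hypothetical helper: parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readPrimitiveValue(parse("<integer>42</integer>")));
        System.out.println(readPrimitiveValue(parse("<date>today</date>")));
    }
}
```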
    • getText

      public static String getText(Element aElem)
      Gets the text of this Element. Leading and trailing whitespace is removed.
      Parameters:
      aElem - the element
      Returns:
      the text of aElem
    • getText

      public static String getText(Element aElem, boolean aExpandEnvVarRefs)
      Gets the text of this Element. Leading and trailing whitespace is removed. Environment variable references of the form <envVarRef>PARAM_NAME</envVarRef> may be expanded.
      Parameters:
      aElem - the element
      aExpandEnvVarRefs - whether to expand environment variable references. The single-argument getText(Element) behaves as if this were false.
      Returns:
      the text of aElem
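The expansion of envVarRef elements can be illustrated with the sketch below. Several details are assumptions rather than documented behavior: the class GetTextSketch, the extra vars parameter used to look up values (the actual method takes no such argument), the choice to skip reference elements when expansion is off, and the empty string substituted for unknown names.

```java
import java.io.StringReader;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.w3c.dom.Text;
import org.xml.sax.InputSource;

final class GetTextSketch {
    /** Concatenates the text children of aElem, optionally expanding <envVarRef> children. */
    static String getText(Element aElem, boolean aExpandEnvVarRefs, Map<String, String> vars) {
        StringBuilder sb = new StringBuilder();
        NodeList children = aElem.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node node = children.item(i);
            if (node instanceof Text) {
                sb.append(((Text) node).getData());
            } else if (aExpandEnvVarRefs && node instanceof Element
                    && "envVarRef".equals(((Element) node).getTagName())) {
                String name = node.getTextContent().trim();
                // Assumption: an unknown name expands to the empty string
                sb.append(vars.getOrDefault(name, ""));
            }
        }
        // Leading and trailing whitespace is removed, as the description states
        return sb.toString().trim();
    }

    /** Hypothetical helper: parses an XML string and returns its root element. */
    static Element parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)))
                    .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element path = parse("<p> <envVarRef>HOME_DIR</envVarRef>/conf </p>");
        System.out.println(getText(path, true, Map.of("HOME_DIR", "/opt/app")));
    }
}
```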
    • checkForNonXmlCharacters

      public static final int checkForNonXmlCharacters(String s)
      Checks the input string for non-XML 1.0 characters. If non-XML characters are found, returns the position of the first offending character; otherwise returns -1.

      From the XML 1.0 spec:

         Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] // any Unicode
          character, excluding the surrogate blocks, FFFE, and FFFF.
       

      And from the UTF-16 spec:

      Characters with values between 0x10000 and 0x10FFFF are represented by a 16-bit integer with a value between 0xD800 and 0xDBFF (within the so-called high-half zone or high surrogate area) followed by a 16-bit integer with a value between 0xDC00 and 0xDFFF (within the so-called low-half zone or low surrogate area).

      Parameters:
      s - Input string
      Returns:
      The position of the first invalid XML character encountered. -1 if no invalid XML characters found.
    • checkForNonXmlCharacters

      public static final int checkForNonXmlCharacters(String s, boolean xml11)
      Checks the input string for non-XML characters. If non-XML characters are found, returns the position of the first offending character; otherwise returns -1.

      The definition of an XML character is different for XML 1.0 and 1.1. This method will check either version, depending on the value of the xml11 argument.

      From the XML 1.0 spec:

         Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF] // any Unicode
          character, excluding the surrogate blocks, FFFE, and FFFF.
       

      From the XML 1.1 spec:

        Char     ::=    [#x1-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
       

      And from the UTF-16 spec:

      Characters with values between 0x10000 and 0x10FFFF are represented by a 16-bit integer with a value between 0xD800 and 0xDBFF (within the so-called high-half zone or high surrogate area) followed by a 16-bit integer with a value between 0xDC00 and 0xDFFF (within the so-called low-half zone or low surrogate area).

      Parameters:
      s - Input string
      xml11 - true to check for invalid XML 1.1 characters, false to check for invalid XML 1.0 characters. The default is false.
      Returns:
      The position of the first invalid XML character encountered. -1 if no invalid XML characters found.
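The two Char productions and the surrogate rule quoted above translate into a check like the following. This is a sketch under stated assumptions, not the library's code: the class XmlCharCheckSketch and the helper isValidBmpChar are hypothetical, and reporting the index of the high surrogate for a broken pair is an assumption.

```java
final class XmlCharCheckSketch {
    /** Returns the index of the first invalid XML character in s, or -1 if there is none. */
    static int checkForNonXmlCharacters(String s, boolean xml11) {
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (Character.isHighSurrogate(c)) {
                // A high surrogate is valid only when a low surrogate follows;
                // the pair then encodes a character in #x10000-#x10FFFF
                if (i + 1 < s.length() && Character.isLowSurrogate(s.charAt(i + 1))) {
                    i += 2;
                    continue;
                }
                return i;
            }
            if (!isValidBmpChar(c, xml11)) {
                return i;
            }
            i++;
        }
        return -1;
    }

    /** Checks a single UTF-16 code unit that is not part of a valid surrogate pair. */
    static boolean isValidBmpChar(char c, boolean xml11) {
        if (c >= 0xD800 && c <= 0xDFFF) {
            return false; // unpaired surrogate
        }
        if (c == 0xFFFE || c == 0xFFFF) {
            return false; // excluded by both productions
        }
        if (xml11) {
            return c >= 0x1; // XML 1.1: [#x1-#xD7FF] | [#xE000-#xFFFD]
        }
        // XML 1.0: #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD]
        return c == 0x9 || c == 0xA || c == 0xD || c >= 0x20;
    }

    public static void main(String[] args) {
        String text = "a\u0001b";
        System.out.println(checkForNonXmlCharacters(text, false)); // U+0001 is invalid in XML 1.0
        System.out.println(checkForNonXmlCharacters(text, true));  // but allowed in XML 1.1
    }
}
```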
    • checkForNonXmlCharacters

      public static final int checkForNonXmlCharacters(char[] ch, int start, int length, boolean xml11)
      Checks the input character array for non-XML characters. If non-XML characters are found, returns the position of the first offending character; otherwise returns -1.
      Parameters:
      ch - Input character array
      start - offset of first char to check
      length - number of chars to check
      xml11 - true to check for invalid XML 1.1 characters, false to check for invalid XML 1.0 characters. The default is false.
      Returns:
      The position of the first invalid XML character encountered. -1 if no invalid XML characters found.
    • createSAXParserFactory

      public static SAXParserFactory createSAXParserFactory()
    • createXMLReader

      public static XMLReader createXMLReader() throws SAXException
      Throws:
      SAXException
    • createSaxTransformerFactory

      public static SAXTransformerFactory createSaxTransformerFactory()
    • createTransformerFactory

      public static TransformerFactory createTransformerFactory()
    • createDocumentBuilderFactory

      public static DocumentBuilderFactory createDocumentBuilderFactory()