Class CharacterUtils

java.lang.Object
org.apache.uima.internal.util.CharacterUtils

public class CharacterUtils extends Object
Collection of utilities for character handling. Contains utilities for semi-automatically creating lexer rules.
  • Constructor Details

    • CharacterUtils

      public CharacterUtils()
      Constructor for CharacterUtils.
  • Method Details

    • toUnicodeChar

      public static String toUnicodeChar(char c)
      Create a hex representation of the UTF-16 encoding of a Java char. This is the representation that's understood by Java when reading source code.
      Parameters:
      c - The char to be encoded.
      Returns:
      String Hex representation of character. For example, the result of encoding 'A' would be "\u0041".
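      The escape format described above can be illustrated with a minimal standalone sketch. The class and method names below (UnicodeEscapeSketch, toUnicodeEscape) are hypothetical and are not part of UIMA; the actual implementation may differ, for example in the case of the hex digits.

```java
final class UnicodeEscapeSketch {
    // Hypothetical stand-in for the documented behavior; not the UIMA implementation.
    static String toUnicodeEscape(char c) {
        // %04x zero-pads the UTF-16 code unit to four lowercase hex digits.
        return String.format("\\u%04x", (int) c);
    }

    public static void main(String[] args) {
        // Prints the escape for the letter A.
        System.out.println(toUnicodeEscape('A'));
    }
}
```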
    • toHexString

      public static String toHexString(char c)
      Create a hex representation of the UTF-16 encoding of a Java char. This is the representation that's understood by the JavaCC lexer.
      Parameters:
      c - The char to be encoded.
      Returns:
      String Hex representation of character. For example, the result of encoding 'A' would be "0x0041".
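      As a rough illustration of the "0x0041" format, the following standalone sketch produces a zero-padded, four-digit hex literal. HexLiteralSketch and toHexLiteral are hypothetical names, and the lowercase hex digits are an assumption; this is not the UIMA source.

```java
final class HexLiteralSketch {
    // Hypothetical stand-in for the documented behavior; not the UIMA implementation.
    static String toHexLiteral(char c) {
        // Prefix with 0x and zero-pad the UTF-16 code unit to four hex digits.
        return String.format("0x%04x", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(toHexLiteral('A'));
    }
}
```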
    • getLetterRange

      public static ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> getLetterRange()
      Generate an ArrayList of CharRanges covering the characters that Java considers to be letters. The result is intended as input to Unicode-agnostic lexers such as ANTLR.
      Returns:
      ArrayList A list of character ranges.
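      One plausible way to build such a list is to scan every char value and merge consecutive letters into inclusive ranges. The sketch below assumes that "what Java considers to be a letter" means Character.isLetter(char); the Range class is a hypothetical stand-in for the nested CharRange type, whose fields are not documented here.

```java
import java.util.ArrayList;
import java.util.List;

final class LetterRangeSketch {
    // Hypothetical stand-in for CharRange: an inclusive [start, end] pair.
    static final class Range {
        final char start;
        final char end;

        Range(char start, char end) {
            this.start = start;
            this.end = end;
        }
    }

    // Scan all UTF-16 code units and merge consecutive letters into ranges.
    static List<Range> letterRanges() {
        List<Range> ranges = new ArrayList<>();
        int runStart = -1; // -1 means we are not inside a run of letters
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            boolean letter = Character.isLetter((char) c);
            if (letter && runStart < 0) {
                runStart = c; // a new run begins here
            } else if (!letter && runStart >= 0) {
                ranges.add(new Range((char) runStart, (char) (c - 1)));
                runStart = -1;
            }
        }
        if (runStart >= 0) { // close a run that reaches the last code unit
            ranges.add(new Range((char) runStart, Character.MAX_VALUE));
        }
        return ranges;
    }

    public static void main(String[] args) {
        List<Range> ranges = letterRanges();
        System.out.println(ranges.size() + " ranges; first: "
            + ranges.get(0).start + ".." + ranges.get(0).end);
    }
}
```

      The loop counter is an int rather than a char so that it can step past Character.MAX_VALUE and terminate instead of wrapping around.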
    • getDigitRange

      public static ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> getDigitRange()
      Generate an ArrayList of CharRanges covering the characters that Java considers to be digits. The result is intended as input to Unicode-agnostic lexers such as ANTLR.
      Returns:
      ArrayList A list of character ranges.
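      To show how digit ranges might feed a lexer grammar, the sketch below collects ranges using Character.isDigit(char) (an assumption about what "digit" means here) and joins them into a rule using the 'x'..'y' range notation found in ANTLR lexer grammars. All names are hypothetical, and the rule layout is illustrative only; the output format of printAntlrLexRule is not documented on this page.

```java
import java.util.ArrayList;
import java.util.List;

final class DigitRuleSketch {
    // Returns inclusive {start, end} pairs for every run of chars accepted by Character.isDigit.
    static List<char[]> digitRanges() {
        List<char[]> ranges = new ArrayList<>();
        int runStart = -1; // -1 means we are not inside a run of digits
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            boolean digit = Character.isDigit((char) c);
            if (digit && runStart < 0) {
                runStart = c;
            } else if (!digit && runStart >= 0) {
                ranges.add(new char[] {(char) runStart, (char) (c - 1)});
                runStart = -1;
            }
        }
        if (runStart >= 0) { // close a run that reaches the last code unit
            ranges.add(new char[] {(char) runStart, Character.MAX_VALUE});
        }
        return ranges;
    }

    // Format one range as two escaped char literals separated by the range operator.
    static String toRange(char[] range) {
        return String.format("'\\u%04x'..'\\u%04x'", (int) range[0], (int) range[1]);
    }

    // Join all ranges into a single rule of alternatives (illustrative layout).
    static String toRule(String name, List<char[]> ranges) {
        List<String> parts = new ArrayList<>();
        for (char[] range : ranges) {
            parts.add(toRange(range));
        }
        return name + " : " + String.join(" | ", parts) + " ;";
    }

    public static void main(String[] args) {
        System.out.println(toRule("DIGIT", digitRanges()));
    }
}
```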
    • printAntlrLexRule

      public static void printAntlrLexRule(String name, ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> charRanges)
    • printJavaCCLexRule

      public static void printJavaCCLexRule(String name, ArrayList<org.apache.uima.internal.util.CharacterUtils.CharRange> charRanges)
    • main

      public static void main(String[] args)