java.lang.Object

org.apache.uima.fit.testing.factory.TokenBuilder<TOKEN_TYPE,SENTENCE_TYPE>

Type Parameters:

TOKEN_TYPE - the type system token type (e.g. org.apache.uima.fit.examples.type.Token)

SENTENCE_TYPE - the type system sentence type (e.g. org.apache.uima.fit.examples.type.Sentence)

public class TokenBuilder<TOKEN_TYPE extends Annotation,SENTENCE_TYPE extends Annotation> extends Object

This class provides convenience methods for creating tokens and sentences and adding them to a JCas.

Constructor Summary

Constructors

Constructor

Description

TokenBuilder(Class<TOKEN_TYPE> aTokenClass, Class<SENTENCE_TYPE> aSentenceClass)

Calls TokenBuilder(Class, Class, String, String) with the last two arguments as null.

TokenBuilder(Class<TOKEN_TYPE> aTokenClass, Class<SENTENCE_TYPE> aSentenceClass, String aPosFeatureName, String aStemFeatureName)

Instantiates a TokenBuilder with the type system information that the builder needs to build tokens.

Method Summary

Modifier and Type

Method

Description

void

buildTokens(JCas aJCas, String aText)

Builds white-space delimited tokens from the input text.

void

buildTokens(JCas aJCas, String aText, String aTokensString)

Builds tokens for the given text and tokens string.

void

buildTokens(JCas aJCas, String aText, String aTokensString, String aPosTagsString)

Builds tokens for the given text, tokens string, and part-of-speech tags.

void

buildTokens(JCas aJCas, String aText, String aTokensString, String aPosTagsString, String aStemsString)

Build tokens for the given text, tokens, part-of-speech tags, and word stems.

static <T extends Annotation, S extends Annotation> TokenBuilder<T,S>

create(Class<T> aTokenClass, Class<S> aSentenceClass)

Instantiates a TokenBuilder with the type system information that the builder needs to build tokens.

void

setPosFeatureName(String aPosFeatureName)

Set the feature name for the part-of-speech tag for your token type.

void

setStemFeatureName(String aStemFeatureName)

Set the feature name for the stem for your token type.

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Constructor Details
- TokenBuilder
  
  public TokenBuilder(Class<TOKEN_TYPE> aTokenClass, Class<SENTENCE_TYPE> aSentenceClass)
  
  Calls TokenBuilder(Class, Class, String, String) with the last two arguments as null.
  
  Parameters:
  
  aTokenClass - the class of your token type from your type system (e.g. org.apache.uima.fit.type.Token.class)
  
  aSentenceClass - the class of your sentence type from your type system (e.g. org.apache.uima.fit.type.Sentence.class)
- TokenBuilder
  
  public TokenBuilder(Class<TOKEN_TYPE> aTokenClass, Class<SENTENCE_TYPE> aSentenceClass, String aPosFeatureName, String aStemFeatureName)
  
  Instantiates a TokenBuilder with the type system information that the builder needs to build tokens.
  
  Parameters:
  
  aTokenClass - the class of your token type from your type system (e.g. org.apache.uima.fit.type.Token.class)
  
  aSentenceClass - the class of your sentence type from your type system (e.g. org.apache.uima.fit.type.Sentence.class)
  
  aPosFeatureName - the name of the feature that holds the part-of-speech tag on your token type. This assumes that the token type has a single string feature in which to store the part-of-speech tag. null is allowed.
  
  aStemFeatureName - the name of the feature that holds the stem on your token type. This assumes that the token type has a single string feature in which to store the stem. null is allowed.
Method Details
- create
  
  public static <T extends Annotation, S extends Annotation> TokenBuilder<T,S> create(Class<T> aTokenClass, Class<S> aSentenceClass)
  
  Instantiates a TokenBuilder with the type system information that the builder needs to build tokens.
  
  Type Parameters:
  
  T - the type system token type (e.g. org.apache.uima.fit.examples.type.Token)
  
  S - the type system sentence type (e.g. org.apache.uima.fit.examples.type.Sentence)
  
  Parameters:
  
  aTokenClass - the class of your token type from your type system (e.g. org.apache.uima.fit.type.Token)
  
  aSentenceClass - the class of your sentence type from your type system (e.g. org.apache.uima.fit.type.Sentence)
  
  Returns:
  
  the builder.
- setPosFeatureName
  
  public void setPosFeatureName(String aPosFeatureName)
  
  Set the feature name for the part-of-speech tag for your token type. This assumes that the token type has a single string feature in which to store the part-of-speech tag. null is allowed.
  
  Parameters:
  
  aPosFeatureName - the part-of-speech feature name.
- setStemFeatureName
  
  public void setStemFeatureName(String aStemFeatureName)
  
  Set the feature name for the stem for your token type. This assumes that the token type has a single string feature in which to store the stem. null is allowed.
  
  Parameters:
  
  aStemFeatureName - the stem feature name.
- buildTokens
  
  public void buildTokens(JCas aJCas, String aText)
  
  Builds white-space delimited tokens from the input text.
  
  Parameters:
  
  aJCas - the JCas to add the Token annotations to
  
  aText - the text to initialize the JCas with
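As a rough illustration of what "white-space delimited tokens" means in terms of character offsets, the following sketch computes begin/end offsets for every run of non-whitespace characters. It uses only the Java standard library; the class name WhitespaceSpans and the method spans are hypothetical helpers made up for this example and are not part of uimaFIT, and the library's own implementation may differ.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of uimaFIT: shows the offsets that
// whitespace-delimited tokens would cover in a given text.
final class WhitespaceSpans {
    private static final Pattern NON_WHITESPACE = Pattern.compile("\\S+");

    /** Returns {begin, end} offsets (end exclusive) for each run of non-whitespace characters. */
    static List<int[]> spans(String text) {
        List<int[]> result = new ArrayList<>();
        Matcher matcher = NON_WHITESPACE.matcher(text);
        while (matcher.find()) {
            result.add(new int[] {matcher.start(), matcher.end()});
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "She ran.";
        // Prints "0-3 She" and then "4-8 ran."
        for (int[] span : spans(text)) {
            System.out.println(span[0] + "-" + span[1] + " " + text.substring(span[0], span[1]));
        }
    }
}
```

Note that with whitespace splitting alone, the final period stays attached to "ran."; separating it requires a tokens string, as described for the other overloads.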
- buildTokens
  
  public void buildTokens(JCas aJCas, String aText, String aTokensString)
  
  Builds tokens for the given text and tokens string.
  
  Parameters:
  
  aJCas - the JCas to add the Token annotations to
  
  aText - the text to initialize the JCas with
  
  aTokensString - the tokensString must have the same non-white space characters as the text. The tokensString is used to identify token boundaries using white space - i.e. the only difference between the 'text' parameter and the 'tokensString' parameter is that the latter may have more whitespace characters. For example, if the text is "She ran." then the tokensString might be "She ran ."
  
  See Also:
  
  buildTokens(JCas, String, String, String, String)
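The relationship between aText and aTokensString described above can be sketched as an alignment: each whitespace-separated piece of the tokens string is located in the text, and whitespace in the text is skipped between pieces. The code below is a standard-library-only sketch under that reading of the contract; TokenAlignment and align are hypothetical names for this example, and this is not the library's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of uimaFIT: maps each token in the
// tokens string to its {begin, end} offsets in the original text.
final class TokenAlignment {
    static List<int[]> align(String text, String tokensString) {
        List<int[]> spans = new ArrayList<>();
        String trimmed = tokensString.trim();
        if (trimmed.isEmpty()) {
            return spans;
        }
        int pos = 0;
        for (String token : trimmed.split("\\s+")) {
            pos = skipWhitespace(text, pos);
            // The tokens string must contain the same non-whitespace characters as the text.
            if (!text.startsWith(token, pos)) {
                throw new IllegalArgumentException(
                        "Token '" + token + "' does not match text at offset " + pos);
            }
            spans.add(new int[] {pos, pos + token.length()});
            pos += token.length();
        }
        if (skipWhitespace(text, pos) != text.length()) {
            throw new IllegalArgumentException("Text has characters not covered by the tokens string");
        }
        return spans;
    }

    private static int skipWhitespace(String text, int pos) {
        while (pos < text.length() && Character.isWhitespace(text.charAt(pos))) {
            pos++;
        }
        return pos;
    }

    public static void main(String[] args) {
        String text = "She ran.";
        // Prints "0-3 She", "4-7 ran", "7-8 ."
        for (int[] span : align(text, "She ran .")) {
            System.out.println(span[0] + "-" + span[1] + " " + text.substring(span[0], span[1]));
        }
    }
}
```

For the example from the parameter description, the extra space before the period in "She ran ." yields a separate token at offsets 7-8 while the text itself stays "She ran.".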
- buildTokens
  
  public void buildTokens(JCas aJCas, String aText, String aTokensString, String aPosTagsString)
  
  Builds tokens for the given text, tokens string, and part-of-speech tags.
  
  Parameters:
  
  aJCas - the JCas to add the Token annotations to
  
  aText - the text to initialize the JCas with
  
  aTokensString - the tokensString must have the same non-white space characters as the text. The tokensString is used to identify token boundaries using white space - i.e. the only difference between the 'text' parameter and the 'tokensString' parameter is that the latter may have more whitespace characters. For example, if the text is "She ran." then the tokensString might be "She ran ."
  
  aPosTagsString - the posTagsString should be a space delimited string of part-of-speech tags - one for each token
  
  See Also:
  
  buildTokens(JCas, String, String, String, String)
- buildTokens
  
  public void buildTokens(JCas aJCas, String aText, String aTokensString, String aPosTagsString, String aStemsString)
  
  Build tokens for the given text, tokens, part-of-speech tags, and word stems.
  
  Parameters:
  
  aJCas - the JCas to add the Token annotations to
  
  aText - the text to initialize the JCas with
  
  aTokensString - the tokensString must have the same non-white space characters as the text. The tokensString is used to identify token boundaries using white space - i.e. the only difference between the 'text' parameter and the 'tokensString' parameter is that the latter may have more whitespace characters. For example, if the text is "She ran." then the tokensString might be "She ran ."
  
  aPosTagsString - the posTagsString should be a space delimited string of part-of-speech tags - one for each token
  
  aStemsString - the stemsString should be a space delimited string of stems - one for each token
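The "one for each token" requirement on aPosTagsString and aStemsString can be illustrated by pairing the space-delimited values positionally with the tokens. The sketch below uses only the standard library; TokenAttributes and pair are hypothetical names, treating a null string as "no values supplied" is an assumption of this example, and the count check shown here is illustrative rather than a statement of how the library reports mismatches.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of uimaFIT: pairs each token with the
// part-of-speech tag and stem at the same position.
final class TokenAttributes {
    static List<String> pair(String tokensString, String posTagsString, String stemsString) {
        String[] tokens = tokensString.trim().split("\\s+");
        String[] tags = splitOrNull(posTagsString);
        String[] stems = splitOrNull(stemsString);
        checkCount("part-of-speech tags", tags, tokens.length);
        checkCount("stems", stems, tokens.length);

        List<String> rows = new ArrayList<>();
        for (int i = 0; i < tokens.length; i++) {
            String tag = tags == null ? "-" : tags[i];
            String stem = stems == null ? "-" : stems[i];
            rows.add(tokens[i] + "/" + tag + "/" + stem);
        }
        return rows;
    }

    // Assumption of this sketch: null means the caller supplied no values.
    private static String[] splitOrNull(String values) {
        return values == null ? null : values.trim().split("\\s+");
    }

    private static void checkCount(String what, String[] values, int expected) {
        if (values != null && values.length != expected) {
            throw new IllegalArgumentException(
                    "Expected " + expected + " " + what + " but got " + values.length);
        }
    }

    public static void main(String[] args) {
        // Prints "She/PRP/she", "ran/VBD/run", "././."
        for (String row : pair("She ran .", "PRP VBD .", "she run .")) {
            System.out.println(row);
        }
    }
}
```

The tag and stem values in the usage example (PRP, VBD, she, run) are illustrative inputs chosen for this sketch.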
