Interface AnalysisComponent

All Known Implementing Classes:
AnalysisComponent_ImplBase, Annotator_ImplBase, CasAnnotator_ImplBase, CasMultiplier_ImplBase, JCasAnnotator_ImplBase, JCasMultiplier_ImplBase, UimacppAnalysisComponent

public interface AnalysisComponent
Analysis Components are the primitive "building blocks" from which UIMA solutions are built. This is the common superinterface for all user-developed components that take a CAS as input and may produce CASes as output.

Typically, developers do not implement this interface directly. There are several abstract classes that you can inherit from depending on the function that your component performs and which CAS interface it uses:

  • Annotator: Receives an input CAS and updates it
  • CasConsumer_ImplBase: Receives an input CAS but does not update it. May update a data structure based on information in the CASes it receives.
  • CasMultiplier: Receives an input CAS and, in addition to updating it, may output new CASes. One common use of this is to split a CAS into pieces, emitting each piece as a separate output CAS.

The framework interacts with AnalysisComponents as follows:

  1. The framework calls the AnalysisComponent's process(AbstractCas) method with an input CAS.
  2. The framework then calls the AnalysisComponent's hasNext() method, which should return true if the AnalysisComponent intends to produce new output CASes, or false if the AnalysisComponent will not produce new output CASes.
  3. If hasNext() returns true, the framework then calls the next() method.
  4. The AnalysisComponent, in its next method, can create a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended). It then populates the empty CAS and returns it.
  5. Steps 2 through 4 repeat for each subsequent output CAS, until hasNext() returns false.

From the time when process is called until the time when hasNext returns false, the AnalysisComponent "owns" the CAS that was passed to process. The AnalysisComponent is permitted to make changes to this CAS. Once hasNext returns false, the AnalysisComponent releases control of the initial CAS. This means that the AnalysisComponent must finish all updates to the initial CAS prior to returning false from hasNext.

However, if the process method is called a second time, before hasNext has returned false, this is a signal to the AnalysisComponent to cancel all processing of the previous CAS and begin processing the new CAS instead.
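
The calling sequence above can be sketched in plain Java. This is a minimal illustration, not UIMA code: SketchCas, SketchComponent, TokenSplitter, and FrameworkLoopSketch are hypothetical stand-ins invented here so that the example compiles without UIMA on the classpath, and the real framework does more than this loop shows.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a CAS: just a text payload.
class SketchCas {
    final String text;
    SketchCas(String text) { this.text = text; }
}

// Hypothetical stand-in mirroring the three methods discussed above.
interface SketchComponent {
    void process(SketchCas input);
    boolean hasNext();
    SketchCas next();
}

// Splits the input text on whitespace and emits each token as its own output CAS.
class TokenSplitter implements SketchComponent {
    private String[] tokens = new String[0];
    private int position = 0;

    @Override
    public void process(SketchCas input) {
        // A new call to process discards any unfinished work on the previous CAS.
        String trimmed = input.text.trim();
        tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        position = 0;
    }

    @Override
    public boolean hasNext() {
        return position < tokens.length;
    }

    @Override
    public SketchCas next() {
        // Only called after hasNext() has returned true.
        return new SketchCas(tokens[position++]);
    }
}

class FrameworkLoopSketch {
    // Plays the framework's role for a single input CAS.
    static List<String> drive(SketchComponent component, SketchCas input) {
        List<String> outputs = new ArrayList<>();
        component.process(input);                 // step 1
        while (component.hasNext()) {             // steps 2 and 5
            outputs.add(component.next().text);   // steps 3 and 4
        }
        return outputs;
    }

    public static void main(String[] args) {
        // prints [one, two, three]
        System.out.println(drive(new TokenSplitter(), new SketchCas("one two three")));
    }
}
```

Because process resets the splitter's state, calling it again before hasNext has returned false abandons the remaining tokens of the earlier input, which mirrors the cancellation rule described above.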

  • Method Details

    • initialize

      void initialize(UimaContext aContext) throws ResourceInitializationException
      Performs any startup tasks required by this component. The framework calls this method only once, just after the AnalysisComponent has been instantiated.

      The framework supplies this AnalysisComponent with a reference to the UimaContext that it will use, for example to access configuration settings or resources. This AnalysisComponent should store a reference to its UimaContext for later use.

      Parameters:
      aContext - Provides access to services and resources managed by the framework. This includes configuration parameters, logging, and access to external resources.
      Throws:
      ResourceInitializationException - if this AnalysisComponent cannot initialize successfully.
    • reconfigure

      void reconfigure() throws ResourceConfigurationException, ResourceInitializationException
      Alerts this AnalysisComponent that the values of its configuration parameters or external resources have changed. This AnalysisComponent should re-read its configuration from the UimaContext and take appropriate action to reconfigure itself.

      In the abstract base classes provided by the framework, this is generally implemented by calling destroy followed by initialize and typeSystemChanged. If a more efficient implementation is needed, you can override that implementation.

      Throws:
      ResourceConfigurationException - if the configuration specified for this component is invalid.
      ResourceInitializationException - if this component fails to reinitialize itself based on the new configuration.
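
      The destroy-then-initialize strategy can be sketched as follows. This is a simplified illustration under stated assumptions: SketchContext and PrefixComponent are hypothetical stand-ins rather than UIMA classes, the "prefix" parameter is invented for the example, and the typeSystemChanged step mentioned above is left out.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the context: a mutable map of configuration parameters.
class SketchContext {
    final Map<String, String> params = new HashMap<>();
}

class PrefixComponent {
    private SketchContext context;
    private String prefix;
    int initializeCalls = 0;
    int destroyCalls = 0;

    void initialize(SketchContext aContext) {
        context = aContext;  // keep the reference for later use
        prefix = aContext.params.getOrDefault("prefix", "");
        initializeCalls++;
    }

    void destroy() {
        prefix = null;       // release whatever initialize acquired
        destroyCalls++;
    }

    // Simplest strategy: tear down, then start over with the stored context.
    void reconfigure() {
        destroy();
        initialize(context);
    }

    String label(String value) {
        return prefix + value;
    }
}

class ReconfigureSketch {
    public static void main(String[] args) {
        SketchContext ctx = new SketchContext();
        ctx.params.put("prefix", "A:");
        PrefixComponent component = new PrefixComponent();
        component.initialize(ctx);
        System.out.println(component.label("x")); // prints A:x

        ctx.params.put("prefix", "B:");
        component.reconfigure();
        System.out.println(component.label("x")); // prints B:x
    }
}
```

      A component with expensive startup work might instead re-read only the parameters that changed, which is the kind of more efficient override the text allows for.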
    • batchProcessComplete

      void batchProcessComplete() throws AnalysisEngineProcessException
      Completes the processing of a batch of CASes. The size of a batch is determined based on configuration provided by the application that is using this component. The purpose of batchProcessComplete is to give this AnalysisComponent the chance to flush information from memory to persistent storage. In the event of an error, this allows the processing to be restarted from the end of the last completed batch.

      If this component's descriptor declares that it is recoverable, then this component is required to be restartable from the end of the last completed batch.

      Throws:
      AnalysisEngineProcessException - if this component encounters a problem in flushing its state to persistent storage
    • collectionProcessComplete

      void collectionProcessComplete() throws AnalysisEngineProcessException
      Notifies this AnalysisComponent that processing of an entire collection has been completed. In this method, this component should finish writing any output relating to the current collection.
      Throws:
      AnalysisEngineProcessException - if this component encounters a problem in its end-of-collection processing
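
      The batch and collection callbacks can be pictured with a small buffering component. This is a sketch under stated assumptions: BufferingWriter is a hypothetical class, records are plain strings rather than CASes, and the list passed to the constructor merely stands in for persistent storage.

```java
import java.util.ArrayList;
import java.util.List;

class BufferingWriter {
    private final List<String> buffer = new ArrayList<>(); // held in memory only
    private final List<String> store;                      // stand-in for persistent storage

    BufferingWriter(List<String> store) {
        this.store = store;
    }

    void process(String record) {
        buffer.add(record);
    }

    // End of a batch: move everything buffered so far into the store.
    void batchProcessComplete() {
        store.addAll(buffer);
        buffer.clear();
    }

    // End of the collection: finish writing whatever is still pending.
    void collectionProcessComplete() {
        batchProcessComplete();
    }

    int pending() {
        return buffer.size();
    }
}

class BatchFlushSketch {
    public static void main(String[] args) {
        List<String> store = new ArrayList<>();
        BufferingWriter writer = new BufferingWriter(store);
        writer.process("a");
        writer.process("b");
        writer.batchProcessComplete();
        writer.process("c");
        writer.collectionProcessComplete();
        System.out.println(store); // prints [a, b, c]
    }
}
```

      After each batchProcessComplete call the store holds every record of the completed batches, so a restart only needs to replay records from the unfinished batch.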
    • destroy

      void destroy()
      Frees all resources held by this AnalysisComponent. The framework calls this method only once, when it is finished using this component.
    • process

      void process(AbstractCas aCAS) throws AnalysisEngineProcessException
      Inputs a CAS to the AnalysisComponent. The AnalysisComponent "owns" this CAS until such time as hasNext() is called and returns false or until process is called again (see class description).
      Parameters:
      aCAS - A CAS that this AnalysisComponent should process. The framework will ensure that aCAS implements the specific CAS interface specified by the getRequiredCasInterface() method.
      Throws:
      AnalysisEngineProcessException - if a problem occurs during processing
    • hasNext

      boolean hasNext() throws AnalysisEngineProcessException
      Asks if this AnalysisComponent has another CAS to output. If this method returns true, then a call to next() should retrieve the next output CAS. When this method returns false, the AnalysisComponent gives up control of the initial CAS that was passed to its process(AbstractCas) method.
      Returns:
      true if this AnalysisComponent has another CAS to output, false if not.
      Throws:
      AnalysisEngineProcessException - if a problem occurs during processing
    • next

      AbstractCas next() throws AnalysisEngineProcessException
      Gets the next output CAS. The framework will only call this method after first calling hasNext() and checking that it returns true.

      The AnalysisComponent can obtain a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended).

      Returns:
      the next output CAS.
      Throws:
      AnalysisEngineProcessException - if a problem occurs during processing
    • getRequiredCasInterface

      Class<? extends AbstractCas> getRequiredCasInterface()
      Returns the specific CAS interface that this AnalysisComponent requires the framework to pass to its process(AbstractCas) method.
      Returns:
      the required CAS interface. This must specify a subtype of AbstractCas.
    • getCasInstancesRequired

      int getCasInstancesRequired()
      Returns the maximum number of CAS instances that this AnalysisComponent expects to use at the same time. This only applies to CasMultipliers. Most CasMultipliers will only need one CAS at a time. Only if there is a clear need should this be overridden to return something greater than 1.
      Returns:
      the number of CAS instances required by this AnalysisComponent.
    • setResultSpecification

      void setResultSpecification(ResultSpecification aResultSpec)
      Sets the ResultSpecification that this AnalysisComponent should use. The ResultSpecification is a set of types and features that this AnalysisComponent is asked to produce. An Analysis Component may (but is not required to) optimize its processing by omitting the generation of any types or features that are not part of the ResultSpecification.
      Parameters:
      aResultSpec - the ResultSpecification for this Analysis Component to use.
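
      The optional optimization described above can be sketched as follows. This is an illustration only: SpecAwareCounter is a hypothetical class, the result specification is modeled as a plain set of type names instead of a ResultSpecification object, and the "Token" and "Sentence" type names and counting rules are invented for the example.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

class SpecAwareCounter {
    // Until told otherwise, produce every type this component knows about.
    private Set<String> requested = new HashSet<>(Arrays.asList("Token", "Sentence"));

    // Stand-in for setResultSpecification: remember which types were asked for.
    void setResultSpecification(Set<String> requestedTypes) {
        requested = new HashSet<>(requestedTypes);
    }

    // Computes only the requested results and skips the rest.
    Map<String, Integer> process(String text) {
        Map<String, Integer> result = new LinkedHashMap<>();
        if (requested.contains("Token")) {
            String trimmed = text.trim();
            result.put("Token", trimmed.isEmpty() ? 0 : trimmed.split("\\s+").length);
        }
        if (requested.contains("Sentence")) {
            int periods = 0;
            for (char ch : text.toCharArray()) {
                if (ch == '.') {
                    periods++;
                }
            }
            result.put("Sentence", periods);
        }
        return result;
    }
}

class ResultSpecSketch {
    public static void main(String[] args) {
        SpecAwareCounter counter = new SpecAwareCounter();
        System.out.println(counter.process("a b. c d.")); // prints {Token=4, Sentence=2}

        counter.setResultSpecification(new HashSet<>(Arrays.asList("Token")));
        System.out.println(counter.process("a b. c d.")); // prints {Token=4}
    }
}
```

      Skipping unrequested work is optional, so a component that ignores the specification and always produces everything would still be correct.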