org.apache.uima.analysis_component
Interface AnalysisComponent

All Known Implementing Classes:
AnalysisComponent_ImplBase, Annotator_ImplBase, CasAnnotator_ImplBase, CasMultiplier_ImplBase, JCasAnnotator_ImplBase, JCasMultiplier_ImplBase, UimacppAnalysisComponent

public interface AnalysisComponent

Analysis Components are the primitive "building blocks" from which UIMA solutions are built. This is the common superinterface for all user-developed components that take a CAS as input and may produce CASes as output.

Typically, developers do not implement this interface directly. There are several abstract classes that you can inherit from, depending on the function that your component performs and which CAS interface it uses: CasAnnotator_ImplBase, JCasAnnotator_ImplBase, CasMultiplier_ImplBase, and JCasMultiplier_ImplBase.

The framework interacts with AnalysisComponents as follows:

  1. The framework calls the AnalysisComponent's process(AbstractCas) method with an input CAS.
  2. The framework then calls the AnalysisComponent's hasNext() method, which should return true if the AnalysisComponent intends to produce new output CASes, or false if the AnalysisComponent will not produce new output CASes.
  3. If hasNext() returns true, the framework then calls the next() method.
  4. The AnalysisComponent, in its next method, can create a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended). It then populates the empty CAS and returns it.
  5. Steps 2 through 4 repeat for each subsequent output CAS, until hasNext() returns false.

From the time process is called until hasNext returns false, the AnalysisComponent "owns" the CAS that was passed to process and is permitted to make changes to it. Once hasNext returns false, the AnalysisComponent releases control of the initial CAS. This means that the AnalysisComponent must finish all updates to the initial CAS before returning false from hasNext.

However, if the process method is called a second time, before hasNext has returned false, this is a signal to the AnalysisComponent to cancel all processing of the previous CAS and begin processing the new CAS instead.
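
The call sequence described above can be sketched in plain Java. This is only an illustration of the protocol, not UIMA code: SketchCas, SketchComponent, TokenSplitter, and SketchDriver are hypothetical stand-ins invented for this example, and creating a SketchCas directly stands in for obtaining an empty CAS from the framework.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a CAS: just a mutable text holder.
final class SketchCas {
    String text = "";
}

// Hypothetical stand-in mirroring the three methods of the protocol.
interface SketchComponent {
    void process(SketchCas input);
    boolean hasNext();
    SketchCas next();
}

// Emits one output CAS per whitespace-separated token of the input.
final class TokenSplitter implements SketchComponent {
    private String[] tokens = new String[0];
    private int position = 0;

    @Override
    public void process(SketchCas input) {
        // Overwriting the state here also covers the case where process is
        // called again before hasNext() returned false: the old work is dropped.
        String trimmed = input.text.trim();
        tokens = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        position = 0;
    }

    @Override
    public boolean hasNext() {
        return position < tokens.length;
    }

    @Override
    public SketchCas next() {
        SketchCas out = new SketchCas(); // stands in for requesting an empty CAS
        out.text = tokens[position++];
        return out;
    }
}

// Stand-in for the framework side of the protocol.
final class SketchDriver {
    static List<String> run(SketchComponent component, String inputText) {
        SketchCas input = new SketchCas();
        input.text = inputText;
        component.process(input);               // step 1
        List<String> outputs = new ArrayList<>();
        while (component.hasNext()) {           // step 2
            outputs.add(component.next().text); // steps 3 and 4
        }
        return outputs; // hasNext() returned false: the input CAS is released
    }
}
```

In this sketch, SketchDriver.run(new TokenSplitter(), "a b c") collects three outputs, one per token. An input with no tokens yields none, because hasNext() returns false immediately.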


Method Summary
 void batchProcessComplete()
          Completes the processing of a batch of CASes.
 void collectionProcessComplete()
          Notifies this AnalysisComponent that processing of an entire collection has been completed.
 void destroy()
          Frees all resources held by this AnalysisComponent.
 int getCasInstancesRequired()
          Returns the maximum number of CAS instances that this AnalysisComponent expects to use at the same time.
 Class<? extends AbstractCas> getRequiredCasInterface()
          Returns the specific CAS interface that this AnalysisComponent requires the framework to pass to its process(AbstractCas) method.
 boolean hasNext()
          Asks if this AnalysisComponent has another CAS to output.
 void initialize(UimaContext aContext)
          Performs any startup tasks required by this component.
 AbstractCas next()
          Gets the next output CAS.
 void process(AbstractCas aCAS)
          Inputs a CAS to the AnalysisComponent.
 void reconfigure()
          Alerts this AnalysisComponent that the values of its configuration parameters or external resources have changed.
 void setResultSpecification(ResultSpecification aResultSpec)
          Sets the ResultSpecification that this AnalysisComponent should use.
 

Method Detail

initialize

void initialize(UimaContext aContext)
                throws ResourceInitializationException
Performs any startup tasks required by this component. The framework calls this method only once, just after the AnalysisComponent has been instantiated.

The framework supplies this AnalysisComponent with a reference to the UimaContext that it will use, for example to access configuration settings or resources. This AnalysisComponent should store a reference to its UimaContext for later use.

Parameters:
aContext - Provides access to services and resources managed by the framework. This includes configuration parameters, logging, and access to external resources.
Throws:
ResourceInitializationException - if this AnalysisComponent cannot initialize successfully.

reconfigure

void reconfigure()
                 throws ResourceInitializationException,
                        ResourceConfigurationException
Alerts this AnalysisComponent that the values of its configuration parameters or external resources have changed. This AnalysisComponent should re-read its configuration from the UimaContext and take appropriate action to reconfigure itself.

In the abstract base classes provided by the framework, this is generally implemented by calling destroy followed by initialize and typeSystemChanged. If a more efficient implementation is needed, you can override that implementation.

Throws:
ResourceConfigurationException - if the configuration specified for this component is invalid.
ResourceInitializationException - if this component fails to reinitialize itself based on the new configuration.
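
As a rough illustration of the "destroy followed by initialize" strategy mentioned above, the following plain-Java sketch uses hypothetical stand-ins (ConfigContext, ReconfigurableBase, ThresholdComponent) rather than the real UIMA classes. It leaves out the typeSystemChanged step and all exception handling.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for UimaContext: a mutable map of configuration parameters.
final class ConfigContext {
    final Map<String, Object> params = new HashMap<>();
}

// Hypothetical base class showing the default reconfigure strategy.
abstract class ReconfigurableBase {
    private ConfigContext context;

    void initialize(ConfigContext aContext) {
        this.context = aContext; // keep the context for later use
    }

    void destroy() {
        // nothing to release in this sketch
    }

    // Default strategy: tear down, then initialize again from the stored context.
    void reconfigure() {
        ConfigContext saved = context;
        destroy();
        initialize(saved);
    }
}

// Reads one hypothetical parameter, "threshold", at initialization time.
final class ThresholdComponent extends ReconfigurableBase {
    int threshold;
    int initializeCalls = 0;

    @Override
    void initialize(ConfigContext aContext) {
        super.initialize(aContext);
        initializeCalls++;
        threshold = (Integer) aContext.params.getOrDefault("threshold", 10);
    }
}
```

A component whose initialization is expensive (for example, one that loads a large model) could override reconfigure in such a design to re-read only the parameters that changed.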

batchProcessComplete

void batchProcessComplete()
                          throws AnalysisEngineProcessException
Completes the processing of a batch of CASes. The size of a batch is determined by configuration provided by the application that is using this component. The purpose of batchProcessComplete is to give this AnalysisComponent the chance to flush information from memory to persistent storage. In the event of an error, this allows processing to be restarted from the end of the last completed batch.

If this component's descriptor declares that it is recoverable, then this component is required to be restartable from the end of the last completed batch.

Throws:
AnalysisEngineProcessException - if this component encounters a problem in flushing its state to persistent storage
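
The flush-per-batch idea can be sketched as follows. BufferingWriter is a hypothetical plain-Java class, not a UIMA component, and an in-memory list stands in for persistent storage; the point is only that work done after the last completed batch is what gets lost on failure.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical component that buffers results in memory and flushes them per batch.
final class BufferingWriter {
    private final List<String> buffer = new ArrayList<>();
    private final List<String> persisted = new ArrayList<>(); // stand-in for persistent storage
    private int completedBatches = 0;

    // Records a result in memory only.
    void process(String documentId) {
        buffer.add(documentId);
    }

    // Moves everything buffered so far to "persistent storage".
    void batchProcessComplete() {
        persisted.addAll(buffer);
        buffer.clear();
        completedBatches++;
    }

    // Simulates a crash: everything not yet flushed is lost.
    void simulateFailure() {
        buffer.clear();
    }

    List<String> persisted() {
        return new ArrayList<>(persisted);
    }

    int completedBatches() {
        return completedBatches;
    }
}
```

After a simulated failure, only the documents processed since the last batchProcessComplete call need to be reprocessed.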

collectionProcessComplete

void collectionProcessComplete()
                               throws AnalysisEngineProcessException
Notifies this AnalysisComponent that processing of an entire collection has been completed. In this method, this component should finish writing any output relating to the current collection.

Throws:
AnalysisEngineProcessException - if this component encounters a problem in its end-of-collection processing

destroy

void destroy()
Frees all resources held by this AnalysisComponent. The framework calls this method only once, when it is finished using this component.


process

void process(AbstractCas aCAS)
             throws AnalysisEngineProcessException
Inputs a CAS to the AnalysisComponent. The AnalysisComponent "owns" this CAS until such time as hasNext() is called and returns false or until process is called again (see class description).

Parameters:
aCAS - A CAS that this AnalysisComponent should process. The framework will ensure that aCAS implements the specific CAS interface specified by the getRequiredCasInterface() method.
Throws:
AnalysisEngineProcessException - if a problem occurs during processing

hasNext

boolean hasNext()
                throws AnalysisEngineProcessException
Asks if this AnalysisComponent has another CAS to output. If this method returns true, then a call to next() should retrieve the next output CAS. When this method returns false, the AnalysisComponent gives up control of the initial CAS that was passed to its process(AbstractCas) method.

Returns:
true if this AnalysisComponent has another CAS to output, false if not.
Throws:
AnalysisEngineProcessException - if a problem occurs during processing

next

AbstractCas next()
                 throws AnalysisEngineProcessException
Gets the next output CAS. The framework will only call this method after first calling hasNext() and checking that it returns true.

The AnalysisComponent can obtain a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended).

Returns:
the next output CAS.
Throws:
AnalysisEngineProcessException - if a problem occurs during processing

getRequiredCasInterface

Class<? extends AbstractCas> getRequiredCasInterface()
Returns the specific CAS interface that this AnalysisComponent requires the framework to pass to its process(AbstractCas) method.

Returns:
the required CAS interface. This must specify a subtype of AbstractCas.

getCasInstancesRequired

int getCasInstancesRequired()
Returns the maximum number of CAS instances that this AnalysisComponent expects to use at the same time. This only applies to CasMultipliers. Most CasMultipliers will only need one CAS at a time. Only if there is a clear need should this be overridden to return something greater than 1.

Returns:
the number of CAS instances required by this AnalysisComponent.

setResultSpecification

void setResultSpecification(ResultSpecification aResultSpec)
Sets the ResultSpecification that this AnalysisComponent should use. The ResultSpecification is a set of types and features that this AnalysisComponent is asked to produce. An Analysis Component may (but is not required to) optimize its processing by omitting the generation of any types or features that are not part of the ResultSpecification.

Parameters:
aResultSpec - the ResultSpecification for this Analysis Component to use.
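
The optional optimization described above might look like the following sketch. It models the ResultSpecification as a plain set of type names, which is an assumption made for illustration only. SpecAwareAnnotator and the type names "example.Token" and "example.Sentence" are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical annotator that skips work for types nobody asked for.
final class SpecAwareAnnotator {
    static final String TOKEN = "example.Token";
    static final String SENTENCE = "example.Sentence";

    // Until told otherwise, produce everything this component can produce.
    private Set<String> requested = new HashSet<>(Arrays.asList(TOKEN, SENTENCE));

    // Stand-in for setResultSpecification: a set of requested type names.
    void setResultSpecification(Set<String> aResultSpec) {
        requested = new HashSet<>(aResultSpec);
    }

    // Returns "type:coveredText" strings as stand-ins for annotations.
    List<String> process(String text) {
        List<String> annotations = new ArrayList<>();
        if (requested.contains(SENTENCE)) {
            annotations.add(SENTENCE + ":" + text);
        }
        if (requested.contains(TOKEN)) { // tokenization is skipped when not requested
            for (String token : text.trim().split("\\s+")) {
                annotations.add(TOKEN + ":" + token);
            }
        }
        return annotations;
    }
}
```

Because the optimization is optional, a component that ignores the specification and always produces every type is still correct, just potentially slower.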


Copyright © 2010 The Apache Software Foundation. All Rights Reserved.