Interface AnalysisComponent
- All Known Implementing Classes:
AnalysisComponent_ImplBase, Annotator_ImplBase, CasAnnotator_ImplBase, CasMultiplier_ImplBase, JCasAnnotator_ImplBase, JCasMultiplier_ImplBase, UimacppAnalysisComponent
Typically, developers do not implement this interface directly. There are several abstract classes that you can inherit from depending on the function that your component performs and which CAS interface it uses:
- Annotator: Receives an input CAS and updates it.
  - JCasAnnotator_ImplBase: Uses JCas interface
  - CasAnnotator_ImplBase: Uses CAS interface
- CasConsumer_ImplBase: Receives an input CAS but does not update it. May update a data structure based on information in the CASes it receives.
- CasMultiplier: Receives an input CAS and, in addition to updating it, may output new CASes. One common use of this is to split a CAS into pieces, emitting each piece as a separate output CAS.
  - JCasMultiplier_ImplBase: Uses JCas interface
  - CasMultiplier_ImplBase: Uses CAS interface
- CollectionReader_ImplBase: A special type of CasMultiplier that, for historical reasons, does not take an input CAS.
The framework interacts with AnalysisComponents as follows:
1. The framework calls the AnalysisComponent's process(AbstractCas) method with an input CAS.
2. The framework then calls the AnalysisComponent's hasNext() method, which should return true if the AnalysisComponent intends to produce new output CASes, or false if the AnalysisComponent will not produce new output CASes.
3. If the AnalysisComponent returns true, the framework will then call the next() method.
4. The AnalysisComponent, in its next method, can create a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended). It then populates the empty CAS and returns it.
5. Steps 2 & 3 continue for each subsequent output CAS, until hasNext() returns false.
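The interaction steps above can be sketched as a small driver loop. This is only an illustration under stated assumptions: MiniComponent, TokenSplitter and MiniFramework are hypothetical stand-ins written for this sketch (they are not UIMA classes), and plain strings play the role of CASes so that the example runs without the framework on the classpath.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

// Hypothetical stand-in for AnalysisComponent, reduced to three methods.
// A String plays the role of a CAS.
interface MiniComponent {
    void process(String inputCas);
    boolean hasNext();
    String next();
}

// Toy multiplier: splits the input on whitespace, one output per token.
class TokenSplitter implements MiniComponent {
    private final Queue<String> pending = new ArrayDeque<>();

    @Override
    public void process(String inputCas) {
        pending.clear();
        for (String token : inputCas.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                pending.add(token);
            }
        }
    }

    @Override
    public boolean hasNext() {
        return !pending.isEmpty();
    }

    @Override
    public String next() {
        return pending.remove();
    }
}

// Hypothetical driver that mimics the calling order described above.
class MiniFramework {
    static List<String> run(MiniComponent component, String inputCas) {
        List<String> outputs = new ArrayList<>();
        component.process(inputCas);        // step 1
        while (component.hasNext()) {       // step 2, repeated per step 5
            outputs.add(component.next());  // steps 3 and 4
        }
        return outputs;
    }
}
```

Calling MiniFramework.run(new TokenSplitter(), "a b c") would collect one output per token. A component that never produces output CASes simply returns false from hasNext() on the first call.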
From the time when process is called until the time when hasNext returns false, the AnalysisComponent "owns" the CAS that was passed to process. The AnalysisComponent is permitted to make changes to this CAS. Once hasNext returns false, the AnalysisComponent releases control of the initial CAS. This means that the AnalysisComponent must finish all updates to the initial CAS prior to returning false from hasNext.
However, if the process method is called a second time, before hasNext has returned false, this is a signal to the AnalysisComponent to cancel all processing of the previous CAS and begin processing the new CAS instead.
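One way to honor the ownership and cancellation rules is to track whether the component still owns its input. The sketch below assumes a hypothetical CancellingSplitter class (not part of UIMA) in which strings stand in for CASes; the cancelledInputs counter exists only to make the behavior observable.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical component, not a UIMA class: strings stand in for CASes.
class CancellingSplitter {
    private final Queue<String> pending = new ArrayDeque<>();
    private boolean ownsInput = false;
    private int cancelledInputs = 0;

    // If the previous input is still owned (hasNext() has not yet returned
    // false), a new call cancels it: pending outputs are discarded.
    void process(String inputCas) {
        if (ownsInput) {
            cancelledInputs++;
            pending.clear();
        }
        ownsInput = true;
        for (String line : inputCas.split("\n")) {
            if (!line.isEmpty()) {
                pending.add(line);
            }
        }
    }

    // Returning false releases control of the input, so every update to
    // the input must already be finished at that point.
    boolean hasNext() {
        boolean more = !pending.isEmpty();
        if (!more) {
            ownsInput = false;
        }
        return more;
    }

    String next() {
        return pending.remove();
    }

    int cancelledInputs() {
        return cancelledInputs;
    }
}
```

The explicit ownsInput flag matters for inputs that yield no outputs at all: the component still owns such an input until hasNext() has been called and has returned false.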
-
Method Summary
Modifier and Type | Method | Description
void | batchProcessComplete() | Completes the processing of a batch of CASes.
void | collectionProcessComplete() | Notifies this AnalysisComponent that processing of an entire collection has been completed.
void | destroy() | Frees all resources held by this AnalysisComponent.
int | getCasInstancesRequired() | Returns the maximum number of CAS instances that this AnalysisComponent expects to use at the same time.
Class<? extends AbstractCas> | getRequiredCasInterface() | Returns the specific CAS interface that this AnalysisComponent requires the framework to pass to its process(AbstractCas) method.
boolean | hasNext() | Asks if this AnalysisComponent has another CAS to output.
void | initialize(UimaContext aContext) | Performs any startup tasks required by this component.
AbstractCas | next() | Gets the next output CAS.
void | process(AbstractCas aCAS) | Inputs a CAS to the AnalysisComponent.
void | reconfigure() | Alerts this AnalysisComponent that the values of its configuration parameters or external resources have changed.
void | setResultSpecification(ResultSpecification aResultSpec) | Sets the ResultSpecification that this AnalysisComponent should use.
-
Method Details
-
initialize
Performs any startup tasks required by this component. The framework calls this method only once, just after the AnalysisComponent has been instantiated.
The framework supplies this AnalysisComponent with a reference to the UimaContext that it will use, for example to access configuration settings or resources. This AnalysisComponent should store a reference to its UimaContext for later use.
- Parameters:
aContext - Provides access to services and resources managed by the framework. This includes configuration parameters, logging, and access to external resources.
- Throws:
ResourceInitializationException - if this AnalysisComponent cannot initialize successfully.
-
reconfigure
Alerts this AnalysisComponent that the values of its configuration parameters or external resources have changed. This AnalysisComponent should re-read its configuration from the UimaContext and take appropriate action to reconfigure itself.
In the abstract base classes provided by the framework, this is generally implemented by calling destroy followed by initialize and typeSystemChanged. If a more efficient implementation is needed, you can override that implementation.
- Throws:
ResourceConfigurationException - if the configuration specified for this component is invalid.
ResourceInitializationException - if this component fails to reinitialize itself based on the new configuration.
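The default strategy described above (destroy, then initialize, then typeSystemChanged) can be sketched as follows. Everything here is a hypothetical stand-in: MiniComponentBase and ThresholdFilter are not UIMA classes, a Map plays the role of the UimaContext, and the "threshold" parameter name is invented for the example.

```java
import java.util.Map;

// Hypothetical stand-in base class; a Map plays the role of the UimaContext.
abstract class MiniComponentBase {
    private Map<String, String> context;

    // Stores the context so that reconfigure() can reuse it later.
    void initialize(Map<String, String> aContext) {
        this.context = aContext;
    }

    void destroy() {
        // Nothing to release in the base class.
    }

    void typeSystemChanged() {
        // Hook for subclasses that cache type information.
    }

    // Simple but potentially expensive default: tear down and start over.
    void reconfigure() {
        destroy();
        initialize(context);
        typeSystemChanged();
    }
}

// Hypothetical component that reads one configuration parameter.
class ThresholdFilter extends MiniComponentBase {
    private int threshold;

    @Override
    void initialize(Map<String, String> aContext) {
        super.initialize(aContext);
        threshold = Integer.parseInt(aContext.getOrDefault("threshold", "0"));
    }

    int threshold() {
        return threshold;
    }
}
```

A component whose initialization is costly (for example, one that loads a large model) could override reconfigure() in such a design to re-read only the parameters that can actually change.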
-
batchProcessComplete
Completes the processing of a batch of CASes. The size of a batch is determined based on configuration provided by the application that is using this component. The purpose of batchProcessComplete is to give this AnalysisComponent the chance to flush information from memory to persistent storage. In the event of an error, this allows the processing to be restarted from the end of the last completed batch.
If this component's descriptor declares that it is recoverable, then this component is required to be restartable from the end of the last completed batch.
- Throws:
AnalysisEngineProcessException - if this component encounters a problem in flushing its state to persistent storage
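A minimal sketch of the flush-on-batch idea, assuming a hypothetical BufferedWriterComponent (not a UIMA class) that keeps one line per processed input in memory and appends the lines to a file when the batch completes. The real method reports failures with AnalysisEngineProcessException; this stand-in uses UncheckedIOException so that it compiles with the JDK alone.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical consumer-style component; strings stand in for CASes.
class BufferedWriterComponent {
    private final Path output;
    private final List<String> buffer = new ArrayList<>();

    BufferedWriterComponent(Path output) {
        this.output = output;
    }

    // Accumulates results in memory only.
    void process(String inputCas) {
        buffer.add(inputCas);
    }

    // Appends the buffered lines to the output file, then clears the buffer.
    // After this returns, a restart can resume from the end of this batch.
    void batchProcessComplete() {
        try {
            Files.write(output, buffer, StandardCharsets.UTF_8,
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        buffer.clear();
    }

    int bufferedCount() {
        return buffer.size();
    }
}
```

Appending rather than overwriting keeps the output of earlier batches intact, which is what makes resuming after the last completed batch possible in this sketch.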
-
collectionProcessComplete
Notifies this AnalysisComponent that processing of an entire collection has been completed. In this method, this component should finish writing any output relating to the current collection.
- Throws:
AnalysisEngineProcessException - if this component encounters a problem in its end-of-collection processing
-
destroy
void destroy()
Frees all resources held by this AnalysisComponent. The framework calls this method only once, when it is finished using this component.
-
process
Inputs a CAS to the AnalysisComponent. The AnalysisComponent "owns" this CAS until such time as hasNext() is called and returns false or until process is called again (see class description).
- Parameters:
aCAS - A CAS that this AnalysisComponent should process. The framework will ensure that aCAS implements the specific CAS interface specified by the getRequiredCasInterface() method.
- Throws:
AnalysisEngineProcessException - if a problem occurs during processing
-
hasNext
Asks if this AnalysisComponent has another CAS to output. If this method returns true, then a call to next() should retrieve the next output CAS. When this method returns false, the AnalysisComponent gives up control of the initial CAS that was passed to its process(AbstractCas) method.
- Returns:
true if this AnalysisComponent has another CAS to output, false if not.
- Throws:
AnalysisEngineProcessException - if a problem occurs during processing
-
next
Gets the next output CAS. The framework will only call this method after first calling hasNext() and checking that it returns true.
The AnalysisComponent can obtain a new CAS by calling UimaContext.getEmptyCas(Class) (or instead, one of the helper methods in the ImplBase class that it extended).
- Returns:
the next output CAS.
- Throws:
AnalysisEngineProcessException - if a problem occurs during processing
-
getRequiredCasInterface
Class<? extends AbstractCas> getRequiredCasInterface()
Returns the specific CAS interface that this AnalysisComponent requires the framework to pass to its process(AbstractCas) method.
- Returns:
the required CAS interface. This must specify a subtype of AbstractCas.
-
getCasInstancesRequired
int getCasInstancesRequired()
Returns the maximum number of CAS instances that this AnalysisComponent expects to use at the same time. This only applies to CasMultipliers. Most CasMultipliers will only need one CAS at a time. Only if there is a clear need should this be overridden to return something greater than 1.
- Returns:
the number of CAS instances required by this AnalysisComponent.
-
setResultSpecification
Sets the ResultSpecification that this AnalysisComponent should use. The ResultSpecification is a set of types and features that this AnalysisComponent is asked to produce. An AnalysisComponent may (but is not required to) optimize its processing by omitting the generation of any types or features that are not part of the ResultSpecification.
- Parameters:
aResultSpec - the ResultSpecification for this AnalysisComponent to use.
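The optional optimization described above can be sketched as follows. This is an illustration only: SpecAwareAnnotator is a hypothetical class (not part of UIMA), a Set of type names stands in for the ResultSpecification, and the "Token" and "Number" type names are invented for the example.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical annotator; a Set<String> stands in for the ResultSpecification.
class SpecAwareAnnotator {
    private Set<String> requestedTypes = new HashSet<>();

    // Keeps a private copy so later changes by the caller have no effect.
    void setResultSpecification(Set<String> aResultSpec) {
        requestedTypes = new HashSet<>(aResultSpec);
    }

    // Produces only the annotation types that were requested and skips
    // the work for every other type.
    List<String> process(String text) {
        List<String> annotations = new ArrayList<>();
        String[] tokens = text.trim().split("\\s+");
        if (requestedTypes.contains("Token")) {
            for (String token : tokens) {
                if (!token.isEmpty()) {
                    annotations.add("Token:" + token);
                }
            }
        }
        if (requestedTypes.contains("Number")) {
            for (String token : tokens) {
                if (token.matches("\\d+")) {
                    annotations.add("Number:" + token);
                }
            }
        }
        return annotations;
    }
}
```

Because the optimization is optional, a component that ignores the specification and always produces every type it can would still satisfy the contract.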
-