org.apache.uima.collection
Interface CollectionProcessingEngine


public interface CollectionProcessingEngine

A CollectionProcessingEngine (CPE) processes a collection of artifacts (for text analysis applications, this will be a collection of documents) and produces collection-level results.

A CPE consists of a CollectionReader, zero or more AnalysisEngines and zero or more CasConsumers. The Collection Reader is responsible for reading artifacts from a collection and setting up the CAS. The AnalysisEngines analyze each CAS and the results are passed on to the CAS Consumers. CAS Consumers perform analysis over multiple CASes and generally produce collection-level results in some application-specific data structure.

Processing is started by calling the process() method. Processing can be controlled via the pause(), resume(), and stop() methods.

Listeners can register with the CPE by calling the addStatusCallbackListener(StatusCallbackListener) method. These listeners receive status callbacks during the processing. At any time, performance and progress reports are available from the getPerformanceReport() and getProgress() methods.

A CPE implementation may choose to implement parallelization of the processing, but this is not a requirement of the architecture.

Note that a CPE only supports processing one collection at a time. Attempting to start a new processing job while a previous processing job is running will result in an exception. Processing multiple collections simultaneously is done by instantiating and configuring multiple instances of the CPE.
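
The state rules described above (one job at a time, with pause, resume, and stop only valid in certain states) can be sketched with a small stand-in class. This is not the UIMA implementation: CpeLifecycleSketch and its State enum are hypothetical names, java.lang.IllegalStateException stands in for UIMA_IllegalStateException, and treating a second pause() as a no-op is an assumption, since the interface does not say what happens in that case.

```java
// Illustrative stand-in only: NOT the UIMA implementation.
// It uses java.lang.IllegalStateException where the interface documents
// UIMA_IllegalStateException, and it does not model asynchronous completion.
class CpeLifecycleSketch {
    enum State { IDLE, RUNNING, PAUSED }

    private State state = State.IDLE;

    // Mirrors process(): rejected while a previous job is still active.
    void process() {
        if (state != State.IDLE) {
            throw new IllegalStateException("a processing job is already active");
        }
        state = State.RUNNING;
    }

    // Mirrors pause(): only valid while processing is occurring.
    void pause() {
        if (state == State.IDLE) {
            throw new IllegalStateException("no processing is occurring");
        }
        state = State.PAUSED; // assumption: pausing an already paused job is a no-op
    }

    // Mirrors resume(): only valid while paused.
    void resume() {
        if (state != State.PAUSED) {
            throw new IllegalStateException("processing is not paused");
        }
        state = State.RUNNING;
    }

    // Mirrors stop(): only valid while processing is occurring (running or paused).
    void stop() {
        if (state == State.IDLE) {
            throw new IllegalStateException("no processing is occurring");
        }
        state = State.IDLE;
    }

    // A paused job still counts as processing.
    boolean isProcessing() { return state != State.IDLE; }

    boolean isPaused() { return state == State.PAUSED; }

    public static void main(String[] args) {
        CpeLifecycleSketch cpe = new CpeLifecycleSketch();
        cpe.process();
        cpe.pause();
        System.out.println(cpe.isProcessing() + " " + cpe.isPaused());
        try {
            cpe.process(); // second job while the first is still active
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        cpe.resume();
        cpe.stop();
        System.out.println(cpe.isProcessing());
    }
}
```

With a real CPE, a job also ends on its own when the collection is exhausted; the sketch leaves that transition out.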

A CollectionProcessingEngine instance can be obtained by calling UIMAFramework.produceCollectionProcessingEngine(CpeDescription).


Method Summary
 void addStatusCallbackListener(StatusCallbackListener aListener)
          Registers a listener to receive status callbacks.
 CasProcessor[] getCasProcessors()
          Gets the CasProcessors in this CPE, in the order in which they will be executed.
 BaseCollectionReader getCollectionReader()
          Gets the Collection Reader for this CPE.
 ProcessTrace getPerformanceReport()
          Gets a performance report for the processing that is currently occurring or has just completed.
 Progress[] getProgress()
          Gets a progress report for the processing that is currently occurring or has just completed.
 void initialize(CpeDescription aCpeDescription, Map<String,Object> aAdditionalParams)
          Initializes this CPE from a cpeDescription. Applications do not need to call this method.
 boolean isPaused()
          Determines whether this CPE's processing is currently paused.
 boolean isProcessing()
          Determines whether this CPE is currently processing.
 void kill()
          Forcibly kills the CPM.
 void pause()
          Pauses processing.
 void process()
          Initiates processing of a collection.
 void removeStatusCallbackListener(StatusCallbackListener aListener)
          Unregisters a status callback listener.
 void resume()
          Resumes processing that has been paused.
 void stop()
          Stops processing.
 

Method Detail

initialize

void initialize(CpeDescription aCpeDescription,
                Map<String,Object> aAdditionalParams)
                throws ResourceInitializationException
Initializes this CPE from a cpeDescription. Applications do not need to call this method. It is called automatically by the framework and cannot be called a second time.

Parameters:
aCpeDescription - CPE description, generally parsed from an XML file
aAdditionalParams - a Map containing additional parameters. May be null if there are no parameters. Each class that implements this interface can decide what additional parameters it supports.
Throws:
ResourceInitializationException - if a failure occurs during initialization.
UIMA_IllegalStateException - if this method is called more than once on a single instance.

addStatusCallbackListener

void addStatusCallbackListener(StatusCallbackListener aListener)
Registers a listener to receive status callbacks.

Parameters:
aListener - the listener to add

removeStatusCallbackListener

void removeStatusCallbackListener(StatusCallbackListener aListener)
Unregisters a status callback listener.

Parameters:
aListener - the listener to remove

process

void process()
             throws ResourceInitializationException
Initiates processing of a collection. This method starts the processing in another thread and returns immediately. Status of the processing can be obtained by registering a listener with the addStatusCallbackListener(StatusCallbackListener) method.

A CPE can only process one collection at a time. If this method is called while a previous processing request has not yet completed, a UIMA_IllegalStateException will result. To find out whether a CPE is free to begin another processing request, call the isProcessing() method.

Throws:
ResourceInitializationException - if an error occurs during initialization
UIMA_IllegalStateException - if this CPE is currently processing
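
Because process() returns immediately, a caller that needs to wait for the job to finish typically blocks on something that a listener releases. The following is a minimal sketch of that pattern using only the Java standard library. AsyncEngineSketch and CompletionListener are hypothetical stand-ins, not UIMA types; the callback name collectionProcessComplete is modeled on UIMA's StatusCallbackListener, whose full set of callbacks should be checked in its own Javadoc.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical, cut-down listener with a single completion callback.
interface CompletionListener {
    void collectionProcessComplete();
}

// Stand-in engine: starts work on another thread and returns immediately.
class AsyncEngineSketch {
    private final List<CompletionListener> listeners = new CopyOnWriteArrayList<>();
    private volatile boolean processing;

    void addListener(CompletionListener listener) {
        listeners.add(listener);
    }

    synchronized void process() {
        if (processing) {
            throw new IllegalStateException("already processing");
        }
        processing = true;
        Thread worker = new Thread(() -> {
            // A real engine would read and analyze artifacts here.
            processing = false;
            for (CompletionListener listener : listeners) {
                listener.collectionProcessComplete();
            }
        });
        worker.start();
    }

    boolean isProcessing() {
        return processing;
    }
}

class AsyncProcessDemo {
    // Returns true if the completion callback arrived within the timeout.
    static boolean runAndWait() {
        AsyncEngineSketch engine = new AsyncEngineSketch();
        CountDownLatch done = new CountDownLatch(1);
        engine.addListener(done::countDown); // register before starting the job
        engine.process();                    // returns without waiting for the worker
        try {
            return done.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runAndWait());
    }
}
```

Registering the listener before calling process() avoids missing a completion callback from a job that finishes quickly.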

isProcessing

boolean isProcessing()
Determines whether this CPE is currently processing. This means that a processing request has been submitted and has not yet completed or been stopped by a call to stop(). If processing is paused, this method will still return true.

Returns:
true if and only if this CPE is currently processing.

pause

void pause()
Pauses processing. Processing can later be resumed by calling the resume() method.

Throws:
UIMA_IllegalStateException - if no processing is currently occurring

isPaused

boolean isPaused()
Determines whether this CPE's processing is currently paused.

Returns:
true if and only if this CPE's processing is currently paused.

resume

void resume()
Resumes processing that has been paused.

Throws:
UIMA_IllegalStateException - if processing is not currently paused

stop

void stop()
Stops processing.

Throws:
UIMA_IllegalStateException - if no processing is currently occurring

getPerformanceReport

ProcessTrace getPerformanceReport()
Gets a performance report for the processing that is currently occurring or has just completed.

Returns:
an object containing performance statistics

getProgress

Progress[] getProgress()
Gets a progress report for the processing that is currently occurring or has just completed.

Returns:
an array of Progress objects, each of which represents the progress in a different set of units (for example number of entities or bytes)
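
As a rough illustration of reporting progress in several units at once, the sketch below formats an array of progress entries into one status line. ProgressSketch is a hypothetical stand-in with completed, total, and unit fields; it is not the UIMA Progress interface, whose actual accessors should be taken from its own Javadoc.

```java
// Hypothetical stand-in for one progress entry; NOT the UIMA Progress interface.
class ProgressSketch {
    final long completed;
    final long total;
    final String unit;

    ProgressSketch(long completed, long total, String unit) {
        this.completed = completed;
        this.total = total;
        this.unit = unit;
    }

    String describe() {
        return completed + " of " + total + " " + unit;
    }
}

class ProgressReportDemo {
    // Joins one description per unit into a single status line.
    static String report(ProgressSketch[] progress) {
        StringBuilder line = new StringBuilder();
        for (ProgressSketch entry : progress) {
            if (line.length() > 0) {
                line.append("; ");
            }
            line.append(entry.describe());
        }
        return line.toString();
    }

    public static void main(String[] args) {
        ProgressSketch[] progress = {
            new ProgressSketch(3, 10, "entities"),
            new ProgressSketch(2048, 8192, "bytes")
        };
        System.out.println(report(progress));
    }
}
```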

getCollectionReader

BaseCollectionReader getCollectionReader()
Gets the Collection Reader for this CPE.

Returns:
the collection reader

getCasProcessors

CasProcessor[] getCasProcessors()
Gets the CasProcessors in this CPE, in the order in which they will be executed.

Returns:
an array of CasProcessors

kill

void kill()
Forcibly kills the CPM.



Copyright © 2010 The Apache Software Foundation. All Rights Reserved.