Interface CollectionProcessingEngine
A CollectionProcessingEngine (CPE) processes a collection of artifacts (for text analysis applications, this will be a collection of documents) and produces collection-level results.
A CPE consists of a CollectionReader, zero or more AnalysisEngines, and zero or more CasConsumers. The Collection Reader is responsible for reading artifacts from a collection and setting up the CAS. The AnalysisEngines analyze each CAS, and the results are passed on to the CAS Consumers. CAS Consumers perform analysis over multiple CASes and generally produce collection-level results in some application-specific data structure.
Processing is started by calling the process() method. Processing can be controlled via the pause(), resume(), and stop() methods.
Listeners can register with the CPE by calling the addStatusCallbackListener(StatusCallbackListener) method. These listeners receive status callbacks during processing. At any time, performance and progress reports are available from the getPerformanceReport() and getProgress() methods.
A CPE implementation may choose to implement parallelization of the processing, but this is not a requirement of the architecture.
Note that a CPE only supports processing one collection at a time. Attempting to start a new processing job while a previous processing job is running will result in an exception. Processing multiple collections simultaneously is done by instantiating and configuring multiple instances of the CPE.
A CollectionProcessingEngine instance can be obtained by calling UIMAFramework.produceCollectionProcessingEngine(CpeDescription).
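The lifecycle rules described above (one job at a time, pause and resume, a paused job still counting as processing) can be sketched as a small state machine. This is a simplified, single-threaded stand-in and not the UIMA implementation: the class LifecycleSketch and its finish() hook are hypothetical, and the JDK's IllegalStateException stands in for UIMA_IllegalStateException.

```java
// Simplified stand-in for the CPE lifecycle; NOT the UIMA implementation.
// LifecycleSketch and finish() are hypothetical names used for illustration,
// and IllegalStateException stands in for UIMA_IllegalStateException.
class LifecycleSketch {
    private boolean processing; // a job was started and has not completed or been stopped
    private boolean paused;

    void process() {
        if (processing) {
            throw new IllegalStateException("already processing a collection");
        }
        processing = true;
        paused = false;
    }

    void pause() {
        if (!processing) {
            throw new IllegalStateException("no processing is currently occurring");
        }
        paused = true;
    }

    void resume() {
        if (!paused) {
            throw new IllegalStateException("processing is not currently paused");
        }
        paused = false;
    }

    void stop() {
        if (!processing) {
            throw new IllegalStateException("no processing is currently occurring");
        }
        processing = false;
        paused = false;
    }

    // Hypothetical hook simulating the job reaching the end of the collection.
    void finish() {
        processing = false;
        paused = false;
    }

    boolean isProcessing() { return processing; }

    boolean isPaused() { return paused; }
}

class LifecycleSketchDemo {
    public static void main(String[] args) {
        LifecycleSketch cpe = new LifecycleSketch();
        cpe.process();
        cpe.pause();
        // A paused job still counts as processing.
        System.out.println("processing=" + cpe.isProcessing() + ", paused=" + cpe.isPaused());
        cpe.resume();
        try {
            cpe.process(); // a second job while the first is running is rejected
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
        cpe.stop();
        cpe.process(); // allowed again once the previous job has stopped
    }
}
```

To process several collections at once, an application would create one engine instance per collection rather than reusing a busy one.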
-
Method Summary
| Modifier and Type | Method | Description |
|---|---|---|
| void | addStatusCallbackListener(StatusCallbackListener aListener) | Registers a listener to receive status callbacks. |
| CasProcessor[] | getCasProcessors() | Gets the CasProcessors in this CPE, in the order in which they will be executed. |
| BaseCollectionReader | getCollectionReader() | Gets the Collection Reader for this CPE. |
| ProcessTrace | getPerformanceReport() | Gets a performance report for the processing that is currently occurring or has just completed. |
| Progress[] | getProgress() | Gets a progress report for the processing that is currently occurring or has just completed. |
| void | initialize(CpeDescription aCpeDescription, Map<String, Object> aAdditionalParams) | Initializes this CPE from a cpeDescription. Applications do not need to call this method. |
| boolean | isPaused() | Determines whether this CPE's processing is currently paused. |
| boolean | isProcessing() | Determines whether this CPE is currently processing. |
| void | kill() | Kill CPM hard. |
| void | pause() | Pauses processing. |
| void | process() | Initiates processing of a collection. |
| void | removeStatusCallbackListener(StatusCallbackListener aListener) | Unregisters a status callback listener. |
| void | resume() | Resumes processing that has been paused. |
| void | stop() | Stops processing. |
-
Method Details
-
initialize
void initialize(CpeDescription aCpeDescription, Map<String, Object> aAdditionalParams) throws ResourceInitializationException
Initializes this CPE from a cpeDescription. Applications do not need to call this method. It is called automatically by the framework and cannot be called a second time.
Parameters:
aCpeDescription - CPE description, generally parsed from an XML file
aAdditionalParams - a Map containing additional parameters. May be null if there are no parameters. Each class that implements this interface can decide what additional parameters it supports.
Throws:
ResourceInitializationException - if a failure occurs during initialization.
UIMA_IllegalStateException - if this method is called more than once on a single instance.
-
addStatusCallbackListener
void addStatusCallbackListener(StatusCallbackListener aListener)
Registers a listener to receive status callbacks.
Parameters:
aListener - the listener to add
-
removeStatusCallbackListener
void removeStatusCallbackListener(StatusCallbackListener aListener)
Unregisters a status callback listener.
Parameters:
aListener - the listener to remove
-
process
void process() throws ResourceInitializationException
Initiates processing of a collection. This method starts the processing in another thread and returns immediately. Status of the processing can be obtained by registering a listener with the addStatusCallbackListener(StatusCallbackListener) method.
A CPE can only process one collection at a time. If this method is called while a previous processing request has not yet completed, a UIMA_IllegalStateException will result. To find out whether a CPE is free to begin another processing request, call the isProcessing() method.
Throws:
ResourceInitializationException - if an error occurs during initialization
UIMA_IllegalStateException - if this CPE is currently processing
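Because process() returns immediately, a caller that needs to block until the collection is finished typically waits on a completion callback. The following is a minimal, self-contained sketch of that pattern and not UIMA code: AsyncJobSketch and CompletionListener are hypothetical stand-ins, with CompletionListener reduced to a single collectionProcessComplete() callback, whereas the real StatusCallbackListener defines more callbacks than are modeled here.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical, cut-down listener used only for this sketch.
interface CompletionListener {
    void collectionProcessComplete();
}

// Hypothetical stand-in that models only the
// "start in another thread and return immediately" contract.
class AsyncJobSketch {
    private final List<CompletionListener> listeners = new CopyOnWriteArrayList<>();
    private volatile boolean processing;

    void addStatusCallbackListener(CompletionListener listener) {
        listeners.add(listener);
    }

    synchronized void process() {
        if (processing) {
            throw new IllegalStateException("already processing a collection");
        }
        processing = true;
        Thread worker = new Thread(() -> {
            // A real engine would read and analyze the collection here.
            processing = false;
            for (CompletionListener l : listeners) {
                l.collectionProcessComplete();
            }
        });
        worker.start();
    }

    boolean isProcessing() {
        return processing;
    }
}

class AsyncJobSketchDemo {
    // Starts the job, then blocks until the completion callback fires
    // or the timeout elapses. Returns true if the job completed in time.
    static boolean runAndWait(AsyncJobSketch job, long timeoutMillis) {
        CountDownLatch done = new CountDownLatch(1);
        job.addStatusCallbackListener(done::countDown);
        job.process(); // returns immediately; work continues on the worker thread
        try {
            return done.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("completed: " + runAndWait(new AsyncJobSketch(), 5000));
    }
}
```

Registering the listener before calling process() avoids missing a completion callback from a job that finishes very quickly.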
-
isProcessing
boolean isProcessing()
Determines whether this CPE is currently processing. This means that a processing request has been submitted and has not yet completed or been stopped by stop(). If processing is paused, this method will still return true.
Returns:
true if and only if this CPE is currently processing.
-
pause
void pause()
Pauses processing. Processing can later be resumed by calling the resume() method.
Throws:
UIMA_IllegalStateException - if no processing is currently occurring
-
isPaused
boolean isPaused()
Determines whether this CPE's processing is currently paused.
Returns:
true if and only if this CPE's processing is currently paused.
-
resume
void resume()
Resumes processing that has been paused.
Throws:
UIMA_IllegalStateException - if processing is not currently paused
-
stop
void stop()
Stops processing.
Throws:
UIMA_IllegalStateException - if no processing is currently occurring
-
getPerformanceReport
ProcessTrace getPerformanceReport()
Gets a performance report for the processing that is currently occurring or has just completed.
Returns:
an object containing performance statistics
-
getProgress
Progress[] getProgress()
Gets a progress report for the processing that is currently occurring or has just completed.
Returns:
an array of Progress objects, each of which represents the progress in a different set of units (for example number of entities or bytes)
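Since the report is an array with one entry per unit, a caller usually iterates over it and formats each entry. The sketch below illustrates this with a hypothetical ProgressEntry class standing in for Progress; its field names and the sample values are assumptions made for the example, not part of the documented API.

```java
// Hypothetical stand-in for a single progress entry; the fields are
// illustrative and do not claim to mirror the real Progress interface.
class ProgressEntry {
    final long completed;
    final long total;
    final String unit;

    ProgressEntry(long completed, long total, String unit) {
        this.completed = completed;
        this.total = total;
        this.unit = unit;
    }
}

class ProgressReportSketch {
    // Renders one "completed of total unit" fragment per entry, comma-separated.
    static String format(ProgressEntry[] entries) {
        StringBuilder sb = new StringBuilder();
        for (ProgressEntry e : entries) {
            if (sb.length() > 0) {
                sb.append(", ");
            }
            sb.append(e.completed).append(" of ").append(e.total).append(' ').append(e.unit);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Sample values invented for the example: the same job measured in two units.
        ProgressEntry[] report = {
            new ProgressEntry(40, 100, "entities"),
            new ProgressEntry(2048, 8192, "bytes")
        };
        System.out.println(format(report)); // prints "40 of 100 entities, 2048 of 8192 bytes"
    }
}
```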
-
getCollectionReader
BaseCollectionReader getCollectionReader()
Gets the Collection Reader for this CPE.
Returns:
the collection reader
-
getCasProcessors
CasProcessor[] getCasProcessors()
Gets the CasProcessors in this CPE, in the order in which they will be executed.
Returns:
an array of CasProcessors
-
kill
void kill()
Kill CPM hard.
-