Interface BaseCPM

All Known Implementing Classes:
BaseCPMImpl, CPMImpl

public interface BaseCPM
The Base CPM interface is a lower-level interface to the Collection Processing Manager. It is recommended that developers use the CollectionProcessingEngine and CpeDescription interfaces instead.

The CPM is configured with a list of CasProcessors by calling its addCasProcessor(CasProcessor) method. A single BaseCollectionReader must be provided, via the setCollectionReader(BaseCollectionReader) method. Collection processing is then initiated by calling the process() method.

Listeners can register with the CPM by calling the addStatusCallbackListener(BaseStatusCallbackListener) method. These listeners receive status callbacks during the processing. At any time, performance and progress reports are available from the getPerformanceReport() and getProgress() methods.

A CPM implementation may choose to implement parallelization of the processing, but this is not a requirement of the architecture.

Note that a CPM supports processing only one collection at a time. Attempting to reconfigure a CPM or start a new processing job while a previous job is still running results in a UIMA_IllegalStateException. To process multiple collections simultaneously, instantiate and configure multiple CPM instances.
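The one-collection-at-a-time rule can be illustrated with a small toy model. This is only a sketch: ToyCpm and ToyCpmDemo are hypothetical stand-ins that use no UIMA types, strings stand in for the collection reader and CasProcessors, finish() is an invented hook that simulates job completion, and java.lang.IllegalStateException stands in for every busy-state error (the real interface documents UIMA_IllegalStateException and, for addCasProcessor, ResourceConfigurationException).

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for a CPM. NOT the UIMA implementation. */
class ToyCpm {
    private final List<String> processors = new ArrayList<>();
    private String reader;          // stands in for a BaseCollectionReader
    private boolean processing;

    void setCollectionReader(String readerName) {
        requireIdle();              // reconfiguring a busy CPM is rejected
        reader = readerName;
    }

    void addCasProcessor(String processorName) {
        requireIdle();
        processors.add(processorName);
    }

    void process() {
        requireIdle();              // only one job at a time
        processing = true;
    }

    /** Hypothetical hook that simulates the job completing. */
    void finish() {
        processing = false;
    }

    boolean isProcessing() {
        return processing;
    }

    private void requireIdle() {
        if (processing) {
            throw new IllegalStateException("CPM is already processing a collection");
        }
    }
}

class ToyCpmDemo {
    /** Returns true if a second process() call on a busy instance is rejected. */
    static boolean secondStartRejected() {
        ToyCpm cpm = new ToyCpm();
        cpm.setCollectionReader("readerA");
        cpm.addCasProcessor("tokenizer");
        cpm.process();
        try {
            cpm.process();
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    /** Two collections at once: use two independent instances. */
    static boolean twoInstancesRunTogether() {
        ToyCpm first = new ToyCpm();
        ToyCpm second = new ToyCpm();
        first.process();
        second.process();
        return first.isProcessing() && second.isProcessing();
    }

    public static void main(String[] args) {
        System.out.println(secondStartRejected());
        System.out.println(twoInstancesRunTogether());
    }
}
```

In real code the same guard is usually expressed by checking isProcessing() before reconfiguring or restarting a CPM.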

  • Field Details

    • DOCUMENT_TEXT_TYPE

      static final String DOCUMENT_TEXT_TYPE
      Only used for alternate CasData forms of the CAS (not used in this UIMA SDK release). Name of CasData CAS type that holds document text. When creating CasData forms of the CAS, a feature structure of this type must be created by the collection reader.
    • DOCUMENT_TEXT_FEATURE

      static final String DOCUMENT_TEXT_FEATURE
      Only used for alternate CasData forms of the CAS (not used in this UIMA SDK release). Name of CAS feature (on DOCUMENT_TEXT_TYPE feature structure) that holds document text. When creating CasData forms of the CAS, this feature must be set by the collection reader.
  • Method Details

    • getCollectionReader

      BaseCollectionReader getCollectionReader()
      Gets the Collection Reader for this CPM.
      Returns:
      the collection reader
    • setCollectionReader

      void setCollectionReader(BaseCollectionReader aCollectionReader)
      Sets the Collection Reader for this CPM.
      Parameters:
      aCollectionReader - the collection reader
    • getCasProcessors

      CasProcessor[] getCasProcessors()
      Gets the CasProcessors assigned to this CPM, in the order in which they will be called by the CPM.
      Returns:
      an array of CasProcessors
    • addCasProcessor

      void addCasProcessor(CasProcessor aCasProcessor) throws ResourceConfigurationException
      Adds a CasProcessor to this CPM's list of CasProcessors. The new CasProcessor is appended to the end of the list.
      Parameters:
      aCasProcessor - a CasProcessor to add
      Throws:
      ResourceConfigurationException - if this CPM is currently processing
    • addCasProcessor

      void addCasProcessor(CasProcessor aCasProcessor, int aIndex) throws ResourceConfigurationException
      Adds a CasProcessor to this CPM's list of CasProcessors. The new CasProcessor is inserted at the specified index.
      Parameters:
      aCasProcessor - the CasProcessor to add
      aIndex - the index at which to add the CasProcessor
      Throws:
      ResourceConfigurationException - if this CPM is currently processing
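The difference between the two addCasProcessor overloads can be sketched with a plain java.util.List. This is an illustration only: ProcessorOrderSketch is a hypothetical class, strings stand in for CasProcessor instances, and it assumes that an indexed add shifts later entries one position to the right (as List.add(int, E) does), which the description above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

class ProcessorOrderSketch {
    /** Builds a pipeline with two appends and one indexed insert. */
    static List<String> buildOrder() {
        List<String> pipeline = new ArrayList<>();
        pipeline.add("tokenizer");   // like addCasProcessor(p): appended at the end
        pipeline.add("indexer");     // appended after "tokenizer"
        pipeline.add(1, "tagger");   // like addCasProcessor(p, 1): placed at index 1
        return pipeline;
    }

    public static void main(String[] args) {
        // The resulting order is the order in which the processors would be called.
        System.out.println(buildOrder()); // prints [tokenizer, tagger, indexer]
    }
}
```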
    • removeCasProcessor

      void removeCasProcessor(CasProcessor aCasProcessor)
      Removes a CasProcessor from this CPM's list of CasProcessors.
      Parameters:
      aCasProcessor - the CasProcessor to remove
    • disableCasProcessor

      void disableCasProcessor(String aCasProcessorName)
      Disables a CasProcessor in this CPM's list of CasProcessors.
      Parameters:
      aCasProcessorName - the name of the CasProcessor to disable
    • isSerialProcessingRequired

      boolean isSerialProcessingRequired()
      Gets whether this CPM is required to process the collection's elements serially (as opposed to performing parallelization). Note that a value of false does not guarantee that parallelization is performed; this is left up to the CPM implementation.
      Returns:
      true if and only if serial processing is required
    • setSerialProcessingRequired

      void setSerialProcessingRequired(boolean aRequired)
      Sets whether this CPM is required to process the collection's elements serially (as opposed to performing parallelization). If this method is not called, the default is false. Note that a value of false does not guarantee that parallelization is performed; this is left up to the CPM implementation.
      Parameters:
      aRequired - true if and only if serial processing is required
    • isPauseOnException

      boolean isPauseOnException()
      Gets whether this CPM will automatically pause processing if an exception occurs. If processing is paused it can be resumed by calling the resume(boolean) method.
      Returns:
      true if and only if this CPM will pause on exception
    • setPauseOnException

      void setPauseOnException(boolean aPause)
      Sets whether this CPM will automatically pause processing if an exception occurs. If processing is paused it can be resumed by calling the resume(boolean) method.
      Parameters:
      aPause - true if and only if this CPM should pause on exception
    • addStatusCallbackListener

      void addStatusCallbackListener(BaseStatusCallbackListener aListener)
      Registers a listener to receive status callbacks.
      Parameters:
      aListener - the listener to add
    • removeStatusCallbackListener

      void removeStatusCallbackListener(BaseStatusCallbackListener aListener)
      Unregisters a status callback listener.
      Parameters:
      aListener - the listener to remove
    • process

      void process() throws ResourceInitializationException
      Initiates processing of a collection. This method starts the processing in another thread and returns immediately. Status of the processing can be obtained by registering a listener with the addStatusCallbackListener(BaseStatusCallbackListener) method.

      A CPM can only process one collection at a time. If this method is called while a previous processing request has not yet completed, a UIMA_IllegalStateException will result. To find out whether a CPM is free to begin another processing request, call the isProcessing() method.

      Throws:
      ResourceInitializationException - if an error occurs during initialization
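Because process() returns immediately, a caller that needs to wait for completion typically blocks on a signal raised from a status callback. The following is a minimal sketch of that pattern using toy stand-ins rather than UIMA types: ToyStatusListener, AsyncToyCpm and AsyncProcessDemo are hypothetical, and the single callback method collectionProcessComplete() is an assumption made for this illustration.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical stand-in for a status callback listener. */
interface ToyStatusListener {
    void collectionProcessComplete();
}

/** Toy stand-in whose process() starts a worker thread and returns at once. */
class AsyncToyCpm {
    private final List<ToyStatusListener> listeners = new CopyOnWriteArrayList<>();
    private final AtomicBoolean processing = new AtomicBoolean(false);

    void addStatusCallbackListener(ToyStatusListener listener) {
        listeners.add(listener);
    }

    boolean isProcessing() {
        return processing.get();
    }

    void process() {
        // Atomically claim the "busy" flag; a second concurrent job is rejected.
        if (!processing.compareAndSet(false, true)) {
            throw new IllegalStateException("already processing");
        }
        Thread worker = new Thread(() -> {
            // Real work on the collection would happen here.
            processing.set(false);
            for (ToyStatusListener listener : listeners) {
                listener.collectionProcessComplete();
            }
        });
        worker.start();
    }
}

class AsyncProcessDemo {
    /** Starts a job and waits (up to 5 seconds) for the completion callback. */
    static boolean runAndWait() {
        AsyncToyCpm cpm = new AsyncToyCpm();
        CountDownLatch done = new CountDownLatch(1);
        cpm.addStatusCallbackListener(done::countDown);
        cpm.process(); // returns immediately
        try {
            boolean completed = done.await(5, TimeUnit.SECONDS);
            return completed && !cpm.isProcessing();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(runAndWait());
    }
}
```

A latch keeps the waiting thread idle instead of polling isProcessing() in a loop.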
    • isProcessing

      boolean isProcessing()
      Determines whether this CPM is currently processing. This means that a processing request has been submitted and has neither completed nor been stopped by a call to stop(). If processing is paused, this method will still return true.
      Returns:
      true if and only if this CPM is currently processing.
    • pause

      void pause()
      Pauses processing. Processing can later be resumed by calling the resume(boolean) method.
    • isPaused

      boolean isPaused()
      Determines whether this CPM's processing is currently paused.
      Returns:
      true if and only if this CPM's processing is currently paused.
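The relationship between isProcessing() and isPaused() can be summarised as a three-state model. The sketch below is a toy: CpmStateSketch and its State enum are hypothetical names, and the transitions shown are an assumption consistent with the descriptions above, not the UIMA implementation.

```java
class CpmStateSketch {
    enum State { IDLE, RUNNING, PAUSED }

    private State state = State.IDLE;

    void process() { state = State.RUNNING; }
    void pause()   { if (state == State.RUNNING) state = State.PAUSED; }
    void resume()  { if (state == State.PAUSED) state = State.RUNNING; }
    void stop()    { state = State.IDLE; }

    /** A paused job still counts as processing. */
    boolean isProcessing() { return state != State.IDLE; }

    boolean isPaused() { return state == State.PAUSED; }

    public static void main(String[] args) {
        CpmStateSketch cpm = new CpmStateSketch();
        cpm.process();
        cpm.pause();
        System.out.println(cpm.isProcessing() + " " + cpm.isPaused()); // prints true true
        cpm.stop();
        System.out.println(cpm.isProcessing() + " " + cpm.isPaused()); // prints false false
    }
}
```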
    • resume

      void resume(boolean aRetryFailed)
      Resumes processing that has been paused.
      Parameters:
      aRetryFailed - if processing was paused because an exception occurred (see setPauseOnException(boolean)), setting a value of true for this parameter will cause the failed entity to be retried. A value of false (the default) will cause processing to continue with the next entity after the failure.
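The effect of aRetryFailed can be sketched with a synchronous toy model. Everything here is hypothetical: ResumeSketch uses no UIMA types, strings stand in for entities, the entity named "bad" fails only on its first attempt, and pause-on-exception is assumed to be enabled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ResumeSketch {
    private final List<String> entities;
    private final List<String> attempts = new ArrayList<>(); // log of every attempt
    private int next;            // index of the next entity to process
    private boolean paused;
    private boolean failedOnce;  // the toy failure happens only once

    ResumeSketch(List<String> entities) {
        this.entities = entities;
    }

    void process() {
        run();
    }

    void resume(boolean retryFailed) {
        if (!paused) {
            return;
        }
        paused = false;
        if (!retryFailed) {
            next++;              // skip the failed entity, continue with the next one
        }
        run();
    }

    private void run() {
        while (next < entities.size()) {
            String entity = entities.get(next);
            attempts.add(entity);
            if (entity.equals("bad") && !failedOnce) {
                failedOnce = true;
                paused = true;   // pause at the failed entity
                return;
            }
            next++;
        }
    }

    List<String> attempts() {
        return attempts;
    }

    /** Processes a, bad, c; pauses at "bad"; then resumes with the given flag. */
    static List<String> demo(boolean retryFailed) {
        ResumeSketch cpm = new ResumeSketch(Arrays.asList("a", "bad", "c"));
        cpm.process();
        cpm.resume(retryFailed);
        return cpm.attempts();
    }

    public static void main(String[] args) {
        System.out.println(demo(true));  // prints [a, bad, bad, c]
        System.out.println(demo(false)); // prints [a, bad, c]
    }
}
```

With true the failed entity appears twice in the attempt log; with false it is attempted once and then skipped.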
    • resume

      void resume()
      Resumes processing that has been paused.
    • stop

      void stop()
      Stops processing.
    • getPerformanceReport

      ProcessTrace getPerformanceReport()
      Gets a performance report for the processing that is currently occurring or has just completed.
      Returns:
      an object containing performance statistics
    • getProgress

      Progress[] getProgress()
      Gets a progress report for the processing that is currently occurring or has just completed.
      Returns:
      an array of Progress objects, each of which represents the progress in a different set of units (for example number of entities or bytes)