Class SimplePipeline

java.lang.Object
org.apache.uima.fit.pipeline.SimplePipeline

public final class SimplePipeline extends Object
  • Method Summary

    Modifier and Type
    Method
    Description
    static JCasIterable
    iteratePipeline(org.apache.uima.collection.CollectionReaderDescription aReader, org.apache.uima.analysis_engine.AnalysisEngineDescription... aEngines)
    Iterate through the JCases processed by the pipeline, allowing each one to be accessed after it has been processed.
    static void
    runPipeline(org.apache.uima.cas.CAS cas, org.apache.uima.analysis_engine.AnalysisEngine... engines)
    Run a sequence of analysis engines over a CAS.
    static void
    runPipeline(org.apache.uima.cas.CAS aCas, org.apache.uima.analysis_engine.AnalysisEngineDescription... aDescs)
    Run a sequence of analysis engines over a CAS.
    static void
    runPipeline(org.apache.uima.collection.CollectionReaderDescription readerDesc, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs)
    Run the CollectionReader and AnalysisEngines as a pipeline.
    static void
    runPipeline(org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngine... engines)
    Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines.
    static void
    runPipeline(org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs)
    Run the CollectionReader and AnalysisEngines as a pipeline.
    static void
    runPipeline(org.apache.uima.jcas.JCas jCas, org.apache.uima.analysis_engine.AnalysisEngine... engines)
    Run a sequence of analysis engines over a JCas.
    static void
    runPipeline(org.apache.uima.jcas.JCas jCas, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs)
    Run a sequence of analysis engines over a JCas.
    static void
    runPipeline(org.apache.uima.resource.ResourceManager aResMgr, org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngine... engines)
    Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Method Details

    • runPipeline

      public static void runPipeline(org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs) throws IOException, org.apache.uima.resource.ResourceInitializationException, org.apache.uima.analysis_engine.AnalysisEngineProcessException, org.apache.uima.collection.CollectionException

      Run the CollectionReader and AnalysisEngines as a pipeline. After processing all CASes provided by the reader, the method calls the life-cycle methods collectionProcessComplete() and destroy() on all engines. Note that the life-cycle methods are NOT called on the reader. As the reader was instantiated by the caller, it must also be managed (i.e. destroyed) by the caller.

      Note that with this method, external resources cannot be shared between the reader and the analysis engines. They can be shared amongst the analysis engines.

      The CAS is created using the resource manager used by the collection reader.

      Parameters:
      reader - The CollectionReader that loads the documents into the CAS.
      descs - Primitive AnalysisEngineDescriptions that process the CAS, in order. If you have a mix of primitive and aggregate engines, then please create the AnalysisEngines yourself and call the runPipeline(CollectionReader, AnalysisEngine...) method instead.
      Throws:
      IOException - if there is an I/O problem in the reader
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing or running the pipeline.
      org.apache.uima.collection.CollectionException - if there is a problem initializing or running the pipeline.
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
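      The ownership rule described above (the engines are cleaned up by the method, the reader is not) can be sketched as follows. This is a minimal illustration that assumes uimaFIT and the UIMA core are on the classpath. ReaderInstanceExample, TwoDocReader and TextRecorder are hypothetical names invented for this sketch, and the static list is only a convenience for observing what the engine saw.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReader;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.collection.CollectionReader;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.component.JCasCollectionReader_ImplBase;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.util.Progress;
import org.apache.uima.util.ProgressImpl;

public class ReaderInstanceExample {

    /** Hypothetical reader that serves two hard-coded documents. */
    public static class TwoDocReader extends JCasCollectionReader_ImplBase {
        private static final String[] DOCS = { "first document", "second document" };
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < DOCS.length;
        }

        @Override
        public void getNext(JCas jCas) {
            jCas.setDocumentText(DOCS[next++]);
        }

        @Override
        public Progress[] getProgress() {
            return new Progress[] { new ProgressImpl(next, DOCS.length, Progress.ENTITIES) };
        }
    }

    /** Hypothetical annotator that records the text of every document it processes. */
    public static class TextRecorder extends JCasAnnotator_ImplBase {
        static final List<String> SEEN = new ArrayList<>();

        @Override
        public void process(JCas jCas) {
            SEEN.add(jCas.getDocumentText());
        }
    }

    public static List<String> run() throws Exception {
        TextRecorder.SEEN.clear();
        // The caller instantiates the reader, so the caller also cleans it up.
        CollectionReader reader = createReader(TwoDocReader.class);
        try {
            // The engine is passed as a description; runPipeline instantiates it.
            SimplePipeline.runPipeline(reader, createEngineDescription(TextRecorder.class));
        } finally {
            reader.close();
            reader.destroy();
        }
        return new ArrayList<>(TextRecorder.SEEN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

      The try/finally block is a design choice of the sketch: it ensures the reader is released even if an engine fails while processing.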
    • runPipeline

      public static void runPipeline(org.apache.uima.collection.CollectionReaderDescription readerDesc, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs) throws IOException, org.apache.uima.resource.ResourceInitializationException, org.apache.uima.analysis_engine.AnalysisEngineProcessException, org.apache.uima.collection.CollectionException

      Run the CollectionReader and AnalysisEngines as a pipeline. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines, close() on the reader and destroy() on the reader and all engines.

      External resources can be shared between the reader and the analysis engines.

      This method is suitable for the batch-processing of sets of documents where the overhead of instantiating the pipeline components does not significantly impact the overall runtime of the pipeline. If you need to avoid this overhead, e.g. because you wish to run a pipeline on individual documents, then you should not use this method. Instead, create a CAS using JCasFactory, create a reader instance using CollectionReaderFactory.createReader(java.lang.String, java.lang.Object...), create an engine instance using AnalysisEngineFactory.createEngine(java.lang.String, java.lang.Object...) and then use a loop to process the data, resetting the CAS after each step.

       
         while (reader.hasNext()) {
           reader.getNext(cas);
           engine.process(cas);
           cas.reset();
         }
       
       
      Parameters:
      readerDesc - The description of the CollectionReader that loads the documents into the CAS.
      descs - Primitive AnalysisEngineDescriptions that process the CAS, in order. If you have a mix of primitive and aggregate engines, then please create the AnalysisEngines yourself and call one of the runPipeline methods that accept AnalysisEngine instances.
      Throws:
      IOException - if there is an I/O problem in the reader
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing or running the pipeline.
      org.apache.uima.collection.CollectionException - if there is a problem initializing or running the pipeline.
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
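      To make the trade-off above concrete, the following sketch contrasts the description-based batch call with the manual loop in which the caller instantiates the components once and manages them. It assumes uimaFIT and the UIMA core are on the classpath. BatchVersusLoopExample, TwoDocReader and TextRecorder are hypothetical names, and the class-based factory overloads are used here instead of the String-based ones named above.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine;
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReader;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReaderDescription;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.collection.CollectionReader;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.component.JCasCollectionReader_ImplBase;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.util.Progress;
import org.apache.uima.util.ProgressImpl;

public class BatchVersusLoopExample {

    /** Hypothetical reader that serves two hard-coded documents. */
    public static class TwoDocReader extends JCasCollectionReader_ImplBase {
        private static final String[] DOCS = { "first document", "second document" };
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < DOCS.length;
        }

        @Override
        public void getNext(JCas jCas) {
            jCas.setDocumentText(DOCS[next++]);
        }

        @Override
        public Progress[] getProgress() {
            return new Progress[] { new ProgressImpl(next, DOCS.length, Progress.ENTITIES) };
        }
    }

    /** Hypothetical annotator that records the text of every document it processes. */
    public static class TextRecorder extends JCasAnnotator_ImplBase {
        static final List<String> SEEN = new ArrayList<>();

        @Override
        public void process(JCas jCas) {
            SEEN.add(jCas.getDocumentText());
        }
    }

    /** Batch style: only descriptions are passed, the pipeline manages the components. */
    public static List<String> runBatch() throws Exception {
        TextRecorder.SEEN.clear();
        SimplePipeline.runPipeline(
                createReaderDescription(TwoDocReader.class),
                createEngineDescription(TextRecorder.class));
        return new ArrayList<>(TextRecorder.SEEN);
    }

    /** Manual style: components are instantiated once and managed by the caller. */
    public static List<String> runLoop() throws Exception {
        TextRecorder.SEEN.clear();
        JCas jCas = JCasFactory.createJCas();
        CollectionReader reader = createReader(TwoDocReader.class);
        AnalysisEngine engine = createEngine(TextRecorder.class);
        try {
            while (reader.hasNext()) {
                reader.getNext(jCas.getCas());
                engine.process(jCas);
                jCas.reset(); // make the CAS reusable for the next document
            }
            engine.collectionProcessComplete();
        } finally {
            reader.close();
            reader.destroy();
            engine.destroy();
        }
        return new ArrayList<>(TextRecorder.SEEN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runBatch());
        System.out.println(runLoop());
    }
}
```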
    • runPipeline

      public static void runPipeline(org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngine... engines) throws IOException, org.apache.uima.analysis_engine.AnalysisEngineProcessException, org.apache.uima.resource.ResourceInitializationException, org.apache.uima.collection.CollectionException

      Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines. Note that destroy() is NOT called on either the reader or the engines. As the components were instantiated by the caller, they must also be managed (i.e. destroyed) by the caller.

      External resources can only be shared between the reader and/or the analysis engines if the reader/engines have been previously instantiated using a shared resource manager.

      The CAS is created using the resource manager used by the collection reader.

      Parameters:
      reader - a collection reader
      engines - a sequence of analysis engines
      Throws:
      IOException - if there is an I/O problem in the reader
      org.apache.uima.collection.CollectionException - if there is a problem initializing or running the pipeline.
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing or running the pipeline.
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
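      The life-cycle behaviour described above can be observed with an annotator that logs its own events. The sketch below assumes uimaFIT and the UIMA core are on the classpath. InstanceLifecycleExample, TwoDocReader and LifecycleRecorder are hypothetical names, and the static event list exists only to make the call order visible.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReader;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.analysis_engine.AnalysisEngineProcessException;
import org.apache.uima.collection.CollectionReader;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.component.JCasCollectionReader_ImplBase;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.util.Progress;
import org.apache.uima.util.ProgressImpl;

public class InstanceLifecycleExample {

    /** Hypothetical reader that serves two hard-coded documents. */
    public static class TwoDocReader extends JCasCollectionReader_ImplBase {
        private static final String[] DOCS = { "first document", "second document" };
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < DOCS.length;
        }

        @Override
        public void getNext(JCas jCas) {
            jCas.setDocumentText(DOCS[next++]);
        }

        @Override
        public Progress[] getProgress() {
            return new Progress[] { new ProgressImpl(next, DOCS.length, Progress.ENTITIES) };
        }
    }

    /** Hypothetical annotator that logs the life-cycle calls it receives. */
    public static class LifecycleRecorder extends JCasAnnotator_ImplBase {
        static final List<String> EVENTS = new ArrayList<>();

        @Override
        public void process(JCas jCas) {
            EVENTS.add("process");
        }

        @Override
        public void collectionProcessComplete() throws AnalysisEngineProcessException {
            super.collectionProcessComplete();
            EVENTS.add("collectionProcessComplete");
        }

        @Override
        public void destroy() {
            super.destroy();
            EVENTS.add("destroy");
        }
    }

    public static List<String> run() throws Exception {
        LifecycleRecorder.EVENTS.clear();
        CollectionReader reader = createReader(TwoDocReader.class);
        AnalysisEngine engine = createEngine(LifecycleRecorder.class);
        try {
            // Processes both documents and signals collectionProcessComplete().
            SimplePipeline.runPipeline(reader, engine);
        } finally {
            // Destroying the components is left to the caller.
            reader.close();
            reader.destroy();
            engine.destroy();
        }
        return new ArrayList<>(LifecycleRecorder.EVENTS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```

      Under these assumptions, the recorded sequence should be two process events, one collectionProcessComplete event, and a destroy event that appears only because the caller invokes engine.destroy() itself.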
    • runPipeline

      public static void runPipeline(org.apache.uima.resource.ResourceManager aResMgr, org.apache.uima.collection.CollectionReader reader, org.apache.uima.analysis_engine.AnalysisEngine... engines) throws IOException, org.apache.uima.resource.ResourceInitializationException, org.apache.uima.analysis_engine.AnalysisEngineProcessException, org.apache.uima.collection.CollectionException

      Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines. Note that destroy() is NOT called on either the reader or the engines. As the components were instantiated by the caller, they must also be managed (i.e. destroyed) by the caller.

      External resources can only be shared between the reader and/or the analysis engines if the reader/engines have been previously instantiated using a shared resource manager.

      Parameters:
      aResMgr - a resource manager. Normally the same one used by the collection reader and analysis engines.
      reader - a collection reader
      engines - a sequence of analysis engines
      Throws:
      IOException - if there is an I/O problem in the reader
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing or running the pipeline.
      org.apache.uima.collection.CollectionException - if there is a problem initializing or running the pipeline.
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
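      One possible way to obtain a reader and an engine that share a resource manager is to instantiate both through the UIMA core framework with the same ResourceManager and then pass that manager to this method. The sketch below assumes uimaFIT and the UIMA core are on the classpath and uses UIMAFramework.produceCollectionReader and UIMAFramework.produceAnalysisEngine for instantiation. SharedResourceManagerExample, TwoDocReader and TextRecorder are hypothetical names. No external resources are actually declared here; the sketch only shows the wiring.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReaderDescription;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.UIMAFramework;
import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.collection.CollectionReader;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.component.JCasCollectionReader_ImplBase;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.resource.ResourceManager;
import org.apache.uima.util.Progress;
import org.apache.uima.util.ProgressImpl;

public class SharedResourceManagerExample {

    /** Hypothetical reader that serves two hard-coded documents. */
    public static class TwoDocReader extends JCasCollectionReader_ImplBase {
        private static final String[] DOCS = { "first document", "second document" };
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < DOCS.length;
        }

        @Override
        public void getNext(JCas jCas) {
            jCas.setDocumentText(DOCS[next++]);
        }

        @Override
        public Progress[] getProgress() {
            return new Progress[] { new ProgressImpl(next, DOCS.length, Progress.ENTITIES) };
        }
    }

    /** Hypothetical annotator that records the text of every document it processes. */
    public static class TextRecorder extends JCasAnnotator_ImplBase {
        static final List<String> SEEN = new ArrayList<>();

        @Override
        public void process(JCas jCas) {
            SEEN.add(jCas.getDocumentText());
        }
    }

    public static List<String> run() throws Exception {
        TextRecorder.SEEN.clear();
        // One resource manager for the reader, the engine, and the CAS.
        ResourceManager resMgr = UIMAFramework.newDefaultResourceManager();
        CollectionReader reader = UIMAFramework.produceCollectionReader(
                createReaderDescription(TwoDocReader.class), resMgr, null);
        AnalysisEngine engine = UIMAFramework.produceAnalysisEngine(
                createEngineDescription(TextRecorder.class), resMgr, null);
        try {
            SimplePipeline.runPipeline(resMgr, reader, engine);
        } finally {
            // The caller created the components, so the caller destroys them.
            reader.close();
            reader.destroy();
            engine.destroy();
        }
        return new ArrayList<>(TextRecorder.SEEN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```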
    • runPipeline

      public static void runPipeline(org.apache.uima.cas.CAS aCas, org.apache.uima.analysis_engine.AnalysisEngineDescription... aDescs) throws org.apache.uima.resource.ResourceInitializationException, org.apache.uima.analysis_engine.AnalysisEngineProcessException

      Run a sequence of analysis engines over a CAS. The result of the analysis can be read from the CAS.

      External resources can be shared between the analysis engines.

      Parameters:
      aCas - the CAS to process
      aDescs - a sequence of analysis engine descriptions to run on the CAS
      Throws:
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing the components
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem during the execution of the components
    • runPipeline

      public static void runPipeline(org.apache.uima.jcas.JCas jCas, org.apache.uima.analysis_engine.AnalysisEngineDescription... descs) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException, org.apache.uima.resource.ResourceInitializationException

      Run a sequence of analysis engines over a JCas. The result of the analysis can be read from the JCas.

      External resources can be shared between the analysis engines.

      Parameters:
      jCas - the JCas to process
      descs - a sequence of analysis engine descriptions to run on the JCas
      Throws:
      org.apache.uima.resource.ResourceInitializationException - if there is a problem initializing the components
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem during the execution of the components
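      A minimal sketch of this call, showing that the analysis result is read back from the same JCas that was passed in. It assumes uimaFIT and the UIMA core are on the classpath. JCasWithDescriptionsExample and LanguageTagger are hypothetical names, and setting the document language stands in for whatever a real engine would add.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;

import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;

public class JCasWithDescriptionsExample {

    /** Hypothetical annotator that marks every document as English. */
    public static class LanguageTagger extends JCasAnnotator_ImplBase {
        @Override
        public void process(JCas jCas) {
            jCas.setDocumentLanguage("en");
        }
    }

    public static String run(String text) throws Exception {
        // The caller prepares the JCas and keeps it after the call.
        JCas jCas = JCasFactory.createJCas();
        jCas.setDocumentText(text);

        // The engine is given as a description; it is instantiated by runPipeline.
        SimplePipeline.runPipeline(jCas, createEngineDescription(LanguageTagger.class));

        // The result of the analysis is read from the same JCas.
        return jCas.getDocumentLanguage();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("uimaFIT keeps pipelines short."));
    }
}
```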
    • runPipeline

      public static void runPipeline(org.apache.uima.jcas.JCas jCas, org.apache.uima.analysis_engine.AnalysisEngine... engines) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException

      Run a sequence of analysis engines over a JCas. This method does not destroy the engines or send them other events like AnalysisEngine.collectionProcessComplete(). This is left to the caller.

      External resources can only be shared between the analysis engines if the engines have been previously instantiated using a shared resource manager.

      Parameters:
      jCas - the JCas to process
      engines - a sequence of analysis engines to run on the JCas
      Throws:
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem during the execution of the components
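      Because this variant leaves the engine life cycle to the caller, one engine instance can be reused for several documents. The sketch below assumes uimaFIT and the UIMA core are on the classpath. ReusedEngineExample and TextRecorder are hypothetical names, and the static list is only there to observe what was processed.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngine;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.analysis_engine.AnalysisEngine;
import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.factory.JCasFactory;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;

public class ReusedEngineExample {

    /** Hypothetical annotator that records the text of every document it processes. */
    public static class TextRecorder extends JCasAnnotator_ImplBase {
        static final List<String> SEEN = new ArrayList<>();

        @Override
        public void process(JCas jCas) {
            SEEN.add(jCas.getDocumentText());
        }
    }

    public static List<String> run(String... texts) throws Exception {
        TextRecorder.SEEN.clear();
        // Instantiate the engine once and reuse it for every document.
        AnalysisEngine engine = createEngine(TextRecorder.class);
        try {
            JCas jCas = JCasFactory.createJCas();
            for (String text : texts) {
                jCas.reset();
                jCas.setDocumentText(text);
                SimplePipeline.runPipeline(jCas, engine);
            }
            // Life-cycle events are the caller's responsibility in this variant.
            engine.collectionProcessComplete();
        } finally {
            engine.destroy();
        }
        return new ArrayList<>(TextRecorder.SEEN);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run("alpha", "beta", "gamma"));
    }
}
```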
    • runPipeline

      public static void runPipeline(org.apache.uima.cas.CAS cas, org.apache.uima.analysis_engine.AnalysisEngine... engines) throws org.apache.uima.analysis_engine.AnalysisEngineProcessException

      Run a sequence of analysis engines over a CAS. This method does not destroy the engines or send them other events like AnalysisEngine.collectionProcessComplete(). This is left to the caller.

      External resources can only be shared between the analysis engines if the engines have been previously instantiated using a shared resource manager.

      Parameters:
      cas - the CAS to process
      engines - a sequence of analysis engines to run on the CAS
      Throws:
      org.apache.uima.analysis_engine.AnalysisEngineProcessException - if there is a problem during the execution of the components
    • iteratePipeline

      public static JCasIterable iteratePipeline(org.apache.uima.collection.CollectionReaderDescription aReader, org.apache.uima.analysis_engine.AnalysisEngineDescription... aEngines)

      Iterate through the JCases processed by the pipeline, allowing each one to be accessed after it has been processed.

      External resources can be shared between the reader and the analysis engines.

      Parameters:
      aReader - the collection reader description.
      aEngines - the analysis engine descriptions.
      Returns:
      an Iterable<JCas> which can be used in an enhanced for-loop.
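      A minimal sketch of iterating over processed documents. It assumes uimaFIT and the UIMA core are on the classpath. IteratePipelineExample, TwoDocReader and LanguageTagger are hypothetical names. The loop copies what it needs out of each JCas, on the assumption that the JCas may be reused for the next document.

```java
import static org.apache.uima.fit.factory.AnalysisEngineFactory.createEngineDescription;
import static org.apache.uima.fit.factory.CollectionReaderFactory.createReaderDescription;

import java.util.ArrayList;
import java.util.List;

import org.apache.uima.fit.component.JCasAnnotator_ImplBase;
import org.apache.uima.fit.component.JCasCollectionReader_ImplBase;
import org.apache.uima.fit.pipeline.SimplePipeline;
import org.apache.uima.jcas.JCas;
import org.apache.uima.util.Progress;
import org.apache.uima.util.ProgressImpl;

public class IteratePipelineExample {

    /** Hypothetical reader that serves two hard-coded documents. */
    public static class TwoDocReader extends JCasCollectionReader_ImplBase {
        private static final String[] DOCS = { "first document", "second document" };
        private int next = 0;

        @Override
        public boolean hasNext() {
            return next < DOCS.length;
        }

        @Override
        public void getNext(JCas jCas) {
            jCas.setDocumentText(DOCS[next++]);
        }

        @Override
        public Progress[] getProgress() {
            return new Progress[] { new ProgressImpl(next, DOCS.length, Progress.ENTITIES) };
        }
    }

    /** Hypothetical annotator that marks every document as English. */
    public static class LanguageTagger extends JCasAnnotator_ImplBase {
        @Override
        public void process(JCas jCas) {
            jCas.setDocumentLanguage("en");
        }
    }

    public static List<String> run() throws Exception {
        List<String> results = new ArrayList<>();
        // Each JCas handed to the loop body has already passed through the engines.
        for (JCas jCas : SimplePipeline.iteratePipeline(
                createReaderDescription(TwoDocReader.class),
                createEngineDescription(LanguageTagger.class))) {
            // Copy the data out instead of keeping a reference to the JCas.
            results.add(jCas.getDocumentLanguage() + ": " + jCas.getDocumentText());
        }
        return results;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run());
    }
}
```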