Class SimplePipeline
-
Method Summary
Modifier and Type, Method, Description
static JCasIterable iteratePipeline(CollectionReaderDescription aReader, AnalysisEngineDescription... aEngines)
    Iterate through the JCases processed by the pipeline, allowing each one to be accessed after it has been processed.
static void runPipeline(CAS cas, AnalysisEngine... engines)
    Run a sequence of analysis engines over a CAS.
static void runPipeline(CAS aCas, AnalysisEngineDescription... aDescs)
    Run a sequence of analysis engines over a CAS.
static void runPipeline(CollectionReaderDescription readerDesc, AnalysisEngineDescription... descs)
    Run the CollectionReader and AnalysisEngines as a pipeline.
static void runPipeline(CollectionReader reader, AnalysisEngine... engines)
    Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines.
static void runPipeline(CollectionReader reader, AnalysisEngineDescription... descs)
    Run the CollectionReader and AnalysisEngines as a pipeline.
static void runPipeline(JCas jCas, AnalysisEngine... engines)
    Run a sequence of analysis engines over a JCas.
static void runPipeline(JCas jCas, AnalysisEngineDescription... descs)
    Run a sequence of analysis engines over a JCas.
static void runPipeline(ResourceManager aResMgr, CollectionReader reader, AnalysisEngine... engines)
    Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines.
-
Method Details
-
runPipeline
public static void runPipeline(CollectionReader reader, AnalysisEngineDescription... descs) throws IOException, ResourceInitializationException, AnalysisEngineProcessException, CollectionException
Run the CollectionReader and AnalysisEngines as a pipeline. After processing all CASes provided by the reader, the method calls the life-cycle methods (collectionProcessComplete() and destroy()) on all engines. Note that the life-cycle methods are NOT called on the reader. As the reader was instantiated by the caller, it must also be managed (i.e. destroyed) by the caller.
Note that with this method, external resources cannot be shared between the reader and the analysis engines. They can be shared amongst the analysis engines.
The CAS is created using the resource manager used by the collection reader.
- Parameters:
reader - The CollectionReader that loads the documents into the CAS.
descs - Primitive AnalysisEngineDescriptions that process the CAS, in order. If you have a mix of primitive and aggregate engines, then please create the AnalysisEngines yourself and call the other runPipeline method.
- Throws:
IOException - if there is an I/O problem in the reader
ResourceInitializationException - if there is a problem initializing or running the pipeline.
CollectionException - if there is a problem initializing or running the pipeline.
AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
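The ownership rule described for this overload (engines are created by the method and therefore completed and destroyed by it, while the reader belongs to the caller) can be illustrated with a toy model. This is only a sketch: ToyReader, ToyEngine and this runPipeline are hypothetical stand-ins written for illustration, not the UIMA or uimaFIT types, and the call order shown is an assumption based on the description above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class OwnershipSketch {
    /** Hypothetical stand-in for a collection reader; not the UIMA interface. */
    static final class ToyReader {
        private final Iterator<String> docs;
        private final List<String> log;

        ToyReader(List<String> docs, List<String> log) {
            this.docs = docs.iterator();
            this.log = log;
        }

        boolean hasNext() { return docs.hasNext(); }
        String next() { return docs.next(); }
        void destroy() { log.add("reader.destroy"); }
    }

    /** Hypothetical stand-in for an engine instantiated from a description. */
    static final class ToyEngine {
        private final String name;
        private final List<String> log;

        ToyEngine(String name, List<String> log) {
            this.name = name;
            this.log = log;
        }

        void process(String doc) { log.add(name + ".process(" + doc + ")"); }
        void collectionProcessComplete() { log.add(name + ".collectionProcessComplete"); }
        void destroy() { log.add(name + ".destroy"); }
    }

    /** Engines are created here, so they are completed and destroyed here. The reader is left untouched. */
    static void runPipeline(ToyReader reader, List<String> log, String... engineNames) {
        List<ToyEngine> engines = new ArrayList<>();
        for (String name : engineNames) {
            engines.add(new ToyEngine(name, log));
        }
        while (reader.hasNext()) {
            String doc = reader.next();
            for (ToyEngine engine : engines) {
                engine.process(doc);
            }
        }
        for (ToyEngine engine : engines) {
            engine.collectionProcessComplete();
        }
        for (ToyEngine engine : engines) {
            engine.destroy();
        }
    }

    /** The caller created the reader, so the caller destroys it, even if the run fails. */
    static List<String> demo() {
        List<String> log = new ArrayList<>();
        ToyReader reader = new ToyReader(List.of("doc1"), log);
        try {
            runPipeline(reader, log, "tokenizer");
        } finally {
            reader.destroy();
        }
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

With the real API the same shape would apply: wrap the call in try/finally and destroy the reader in the finally block.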
-
runPipeline
public static void runPipeline(CollectionReaderDescription readerDesc, AnalysisEngineDescription... descs) throws IOException, ResourceInitializationException, AnalysisEngineProcessException, CollectionException
Run the CollectionReader and AnalysisEngines as a pipeline. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines, close() on the reader and destroy() on the reader and all engines.
External resources can be shared between the reader and the analysis engines.
This method is suitable for the batch-processing of sets of documents where the overhead of instantiating the pipeline components does not significantly impact the overall runtime of the pipeline. If you need to avoid this overhead, e.g. because you wish to run a pipeline on individual documents, then you should not use this method. Instead, create a CAS using JCasFactory, create a reader instance using CollectionReaderFactory.createReader(java.lang.String, java.lang.Object...), create an engine instance using AnalysisEngineFactory.createEngine(java.lang.String, java.lang.Object...) and then use a loop to process the data, resetting the CAS after each step.
    while (reader.hasNext()) {
        reader.getNext(cas);
        engine.process(cas);
        cas.reset();
    }
- Parameters:
readerDesc - The CollectionReader that loads the documents into the CAS.
descs - Primitive AnalysisEngineDescriptions that process the CAS, in order. If you have a mix of primitive and aggregate engines, then please create the AnalysisEngines yourself and call the other runPipeline method.
- Throws:
IOException - if there is an I/O problem in the reader
ResourceInitializationException - if there is a problem initializing or running the pipeline.
CollectionException - if there is a problem initializing or running the pipeline.
AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
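The manual loop recommended above creates the CAS once and resets it after every document. The following toy model sketches why the reset matters. ToyCas is a hypothetical stand-in for a CAS, the iterator plays the role of the reader, and the whitespace split plays the role of an engine; none of these are UIMA types.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class ReuseLoopSketch {
    /** Hypothetical stand-in for a CAS: one reusable container for the current document. */
    static final class ToyCas {
        String text;
        final List<String> annotations = new ArrayList<>();

        /** Clears the document and its annotations so the container can be reused. */
        void reset() {
            text = null;
            annotations.clear();
        }
    }

    static List<String> demo() {
        Iterator<String> reader = List.of("a b", "c d e").iterator();
        ToyCas cas = new ToyCas(); // created once, outside the loop
        List<String> results = new ArrayList<>();
        while (reader.hasNext()) {
            cas.text = reader.next();                 // plays the role of reader.getNext(cas)
            for (String token : cas.text.split(" ")) {
                cas.annotations.add(token);           // plays the role of engine.process(cas)
            }
            results.add(cas.annotations.size() + " tokens");
            cas.reset();                              // without this, tokens would accumulate across documents
        }
        return results;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

If the reset call were dropped in this model, the second document would report five tokens instead of three, because the annotations of the first document would still be in the container.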
-
runPipeline
public static void runPipeline(CollectionReader reader, AnalysisEngine... engines) throws IOException, AnalysisEngineProcessException, ResourceInitializationException, CollectionException
Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines. Note that Resource.destroy() is NOT called on the reader or on the engines. As the components were instantiated by the caller, they must also be managed (i.e. destroyed) by the caller.
External resources can only be shared between the reader and/or the analysis engines if the reader/engines have been previously instantiated using a shared resource manager.
The CAS is created using the resource manager used by the collection reader.
- Parameters:
reader - a collection reader
engines - a sequence of analysis engines
- Throws:
IOException - if there is an I/O problem in the reader
CollectionException - if there is a problem initializing or running the pipeline.
ResourceInitializationException - if there is a problem initializing or running the pipeline.
AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
-
runPipeline
public static void runPipeline(ResourceManager aResMgr, CollectionReader reader, AnalysisEngine... engines) throws IOException, ResourceInitializationException, AnalysisEngineProcessException, CollectionException
Provides a simple way to run a pipeline for a given collection reader and sequence of analysis engines. After processing all CASes provided by the reader, the method calls collectionProcessComplete() on the engines. Note that Resource.destroy() is NOT called on the reader or on the engines. As the components were instantiated by the caller, they must also be managed (i.e. destroyed) by the caller.
External resources can only be shared between the reader and/or the analysis engines if the reader/engines have been previously instantiated using a shared resource manager.
- Parameters:
aResMgr - a resource manager. Normally the same one used by the collection reader and analysis engines.
reader - a collection reader
engines - a sequence of analysis engines
- Throws:
IOException - if there is an I/O problem in the reader
ResourceInitializationException - if there is a problem initializing or running the pipeline.
CollectionException - if there is a problem initializing or running the pipeline.
AnalysisEngineProcessException - if there is a problem initializing or running the pipeline.
-
runPipeline
public static void runPipeline(CAS aCas, AnalysisEngineDescription... aDescs) throws ResourceInitializationException, AnalysisEngineProcessException
Run a sequence of analysis engines over a CAS. The result of the analysis can be read from the CAS.
External resources can be shared between the analysis engines.
- Parameters:
aCas - the CAS to process
aDescs - a sequence of analysis engines to run on the CAS
- Throws:
ResourceInitializationException - if there is a problem initializing the components
AnalysisEngineProcessException - if there is a problem during the execution of the components
-
runPipeline
public static void runPipeline(JCas jCas, AnalysisEngineDescription... descs) throws AnalysisEngineProcessException, ResourceInitializationException
Run a sequence of analysis engines over a JCas. The result of the analysis can be read from the JCas.
External resources can be shared between the analysis engines.
- Parameters:
jCas - the jCas to process
descs - a sequence of analysis engines to run on the jCas
- Throws:
ResourceInitializationException - if there is a problem initializing the components
AnalysisEngineProcessException - if there is a problem during the execution of the components
-
runPipeline
public static void runPipeline(JCas jCas, AnalysisEngine... engines) throws AnalysisEngineProcessException
Run a sequence of analysis engines over a JCas. This method does not destroy the engines or send them other events like AnalysisEngine.collectionProcessComplete(). This is left to the caller.
External resources can only be shared between the analysis engines if the engines have been previously instantiated using a shared resource manager.
- Parameters:
jCas - the jCas to process
engines - a sequence of analysis engines to run on the jCas
- Throws:
AnalysisEngineProcessException - if there is a problem during the execution of the components
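Because this overload only processes the given JCas, an engine that aggregates over a whole collection never learns that the collection has ended unless the caller tells it. The toy model below sketches that responsibility. CountingEngine and this runPipeline are hypothetical stand-ins for illustration, not UIMA or uimaFIT types.

```java
import java.util.ArrayList;
import java.util.List;

class CallerLifecycleSketch {
    /** Hypothetical stand-in for an engine that reports only once the collection is complete. */
    static final class CountingEngine {
        private int documents;
        final List<String> log = new ArrayList<>();

        void process(String doc) { documents++; }
        void collectionProcessComplete() { log.add("processed " + documents + " documents"); }
        void destroy() { log.add("destroyed"); }
    }

    /** Per-document run: calls process on each engine and sends no other events. */
    static void runPipeline(String doc, CountingEngine... engines) {
        for (CountingEngine engine : engines) {
            engine.process(doc);
        }
    }

    static List<String> demo() {
        CountingEngine engine = new CountingEngine(); // instantiated once by the caller
        try {
            for (String doc : List.of("doc1", "doc2", "doc3")) {
                runPipeline(doc, engine);
            }
            engine.collectionProcessComplete(); // the caller signals the end of the collection
        } finally {
            engine.destroy(); // the caller releases the engine
        }
        return engine.log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

Keeping the engine instance outside the loop is what makes the per-document overload cheap. The price is that the two end-of-life calls move into the calling code.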
-
runPipeline
public static void runPipeline(CAS cas, AnalysisEngine... engines) throws AnalysisEngineProcessException
Run a sequence of analysis engines over a CAS. This method does not destroy the engines or send them other events like AnalysisEngine.collectionProcessComplete(). This is left to the caller.
External resources can only be shared between the analysis engines if the engines have been previously instantiated using a shared resource manager.
- Parameters:
cas - the CAS to process
engines - a sequence of analysis engines to run on the CAS
- Throws:
AnalysisEngineProcessException - if there is a problem during the execution of the components
-
iteratePipeline
public static JCasIterable iteratePipeline(CollectionReaderDescription aReader, AnalysisEngineDescription... aEngines)
Iterate through the JCases processed by the pipeline, allowing each one to be accessed after it has been processed.
External resources can be shared between the reader and the analysis engines.
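The iteration pattern described here, where each item becomes available to the caller only after the pipeline has processed it, can be sketched with a lazy Iterable. This is a toy model under stated assumptions: this iteratePipeline is a hypothetical stand-in operating on strings, with upper-casing in place of the engines, and it says nothing about how JCasIterable is implemented internally.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

class IterateSketch {
    /** Hypothetical stand-in: reads one document per step, processes it, then hands it to the caller. */
    static Iterable<String> iteratePipeline(List<String> docs, List<String> log) {
        return () -> new Iterator<String>() {
            private final Iterator<String> reader = docs.iterator();

            @Override
            public boolean hasNext() {
                return reader.hasNext();
            }

            @Override
            public String next() {
                String doc = reader.next();
                log.add("process " + doc);             // the "engines" run here
                return doc.toUpperCase(Locale.ROOT);   // the processed result is returned
            }
        };
    }

    static List<String> demo() {
        List<String> log = new ArrayList<>();
        for (String processed : iteratePipeline(List.of("a", "b"), log)) {
            log.add("inspect " + processed); // caller code runs between processing steps
        }
        return log;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

The log interleaves processing and inspection, which is the property that separates iterating from a runPipeline call that returns only after the whole collection is done.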
-