Factories to create different kinds of UIMA resource specifiers.

Why are descriptors better than component instances?

Avoid instantiating components with uimaFIT outside of a running pipeline unless it is necessary and you are aware of the consequences. When components run within a pipeline, such as SimplePipeline or a Collection Processing Engine, the pipeline logic takes care of invoking the life-cycle methods on each component, such as:
  • initialize
  • collectionProcessComplete
  • destroy
  • ...
When components are created manually, it is the responsibility of the caller to explicitly invoke the life-cycle methods. The only method that uimaFIT may call is initialize, to provide a UimaContext with the desired parametrization of the component. Not letting UIMA/uimaFIT manage the life-cycle of a component can therefore have unexpected effects.

For example, a CollectionReader cannot be reused after it has been passed to SimplePipeline.runPipeline(org.apache.uima.collection.CollectionReader, org.apache.uima.analysis_engine.AnalysisEngine...). The pipeline reads all files from the reader instance, and when it completes, the reader has no more data to produce. Passing the same reader to subsequent runPipeline calls will not produce any results. When a CollectionReaderDescription is passed instead, the reader is created, initialized, and destroyed inside the runPipeline method. The description can be passed to multiple runPipeline calls, and each time it behaves the same way, producing all its data.
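
The difference between reusing an instance and reusing a description can be illustrated with a plain-Java analogy. This is only a sketch: ToyReader, ToyReaderDescription, and the two runPipeline overloads below are hypothetical stand-ins invented for illustration, not uimaFIT or UIMA classes. ToyReader plays the role of a stateful CollectionReader instance, and ToyReaderDescription plays the role of a CollectionReaderDescription from which a fresh reader is built on every run.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Plain-Java analogy only; none of these types belong to uimaFIT or UIMA.
public class DescriptorVsInstanceSketch {

    /** Hypothetical stand-in for a stateful reader instance. */
    static final class ToyReader {
        private final Iterator<String> documents;

        ToyReader(List<String> documents) {
            this.documents = documents.iterator();
        }

        boolean hasNext() {
            return documents.hasNext();
        }

        String next() {
            return documents.next();
        }
    }

    /** Hypothetical stand-in for a description: a recipe for building a fresh reader. */
    static final class ToyReaderDescription {
        private final List<String> documents;

        ToyReaderDescription(List<String> documents) {
            this.documents = new ArrayList<>(documents);
        }

        ToyReader createReader() {
            return new ToyReader(documents);
        }
    }

    /** Drains an existing instance; the caller's reader is left exhausted. */
    static List<String> runPipeline(ToyReader reader) {
        List<String> processed = new ArrayList<>();
        while (reader.hasNext()) {
            processed.add(reader.next());
        }
        return processed;
    }

    /** Creates a fresh reader from the description on every call, then drains it. */
    static List<String> runPipeline(ToyReaderDescription description) {
        return runPipeline(description.createReader());
    }

    public static void main(String[] args) {
        List<String> docs = Arrays.asList("a.txt", "b.txt");

        // Reusing an instance: the second run finds nothing left to read.
        ToyReader reader = new ToyReader(docs);
        System.out.println(runPipeline(reader).size()); // prints 2
        System.out.println(runPipeline(reader).size()); // prints 0

        // Reusing a description: every run starts from a fresh reader.
        ToyReaderDescription description = new ToyReaderDescription(docs);
        System.out.println(runPipeline(description).size()); // prints 2
        System.out.println(runPipeline(description).size()); // prints 2
    }
}
```

In this analogy, the description-based overload owns the whole life of the reader it creates, which mirrors how the description-based runPipeline call creates, initializes, and destroys the reader internally.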