|
Apache uimaFIT
|
Overview
|
Configuring UIMA components is generally achieved by creating XML descriptor
files which tell the framework at runtime how components should be
instantiated and deployed. These XML descriptor files are very tightly
coupled with the Java implementation of the components they describe.
We have found it very difficult to keep the two consistent
with each other, especially when code is refactored frequently.
uimaFIT provides Java annotations for describing UIMA components which
can be used to directly describe the UIMA components in the code. This
greatly simplifies refactoring a component definition (e.g. changing a
configuration parameter name). It also makes it possible to generate
XML descriptor files as part of the build cycle rather than writing
them by hand in parallel with the code. uimaFIT also makes
it easy to instantiate UIMA components without using XML descriptor
files at all by providing a number of convenience factory methods
which allow programmatic/dynamic instantiation of UIMA components.
This makes uimaFIT an ideal library for testing UIMA components
because the component can be easily instantiated and invoked without
requiring a descriptor file to be created first. uimaFIT is also
helpful in research environments in which programmatic/dynamic
instantiation of a pipeline can simplify experimentation. For example,
when performing 10-fold cross-validation across a number of
experimental conditions it can be quite laborious to create a
different set of descriptor files for each run or even a script that
generates such descriptor files. uimaFIT is type system agnostic and
does not depend on (or provide) a specific type system.
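To illustrate why programmatic configuration helps in the cross-validation case, the sketch below builds one set of name/value parameters per experimental condition and fold in a plain loop. It uses only the Java standard library; the parameter names (`condition`, `fold`, `folds`) and the condition labels are hypothetical, and in a real experiment each array would presumably be handed to a factory call instead of being written out as a descriptor file.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class ExperimentGrid {

    /** Name/value pairs for a single run (parameter names are hypothetical). */
    static Object[] runParameters(String condition, int fold, int folds) {
        return new Object[] {
            "condition", condition,
            "fold", fold,
            "folds", folds
        };
    }

    /** Builds one parameter set per (condition, fold) combination. */
    static List<Object[]> allRuns(List<String> conditions, int folds) {
        List<Object[]> runs = new ArrayList<>();
        for (String condition : conditions) {
            for (int fold = 0; fold < folds; fold++) {
                runs.add(runParameters(condition, fold, folds));
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        // Three hypothetical conditions, evaluated with 10-fold cross-validation.
        List<String> conditions = Arrays.asList("baseline", "with-lexicon", "with-embeddings");
        List<Object[]> runs = allRuns(conditions, 10);
        System.out.println(runs.size() + " configurations generated in memory");
    }
}
```

With three conditions and ten folds this yields thirty configurations, none of which needs its own descriptor file.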
Apache uimaFIT is a library that provides factories, injection, and testing utilities for
Apache UIMA™. The following list highlights some of the features uimaFIT provides:
-
Factories: simplify instantiating UIMA components programmatically without descriptor files.
For example, an AnalysisEngine can be instantiated with a call like this:
AnalysisEngineFactory.createEngine(MyAEImpl.class, myTypeSystem,
paramName1, paramValue1,
paramName2, paramValue2,
...)
-
Injection: handles the binding of configuration parameter values to the corresponding member
variables in the analysis engines and handles the binding of external resources. For example,
to bind a configuration parameter, just annotate a member variable with @ConfigurationParameter.
Then add one line of code to your initialize method:
ConfigurationParameterInitializer.initialize(this, uimaContext)
This is handled automatically if you extend the uimaFIT component base classes, such as
JCasAnnotator_ImplBase. External resources can likewise be injected via the
@ExternalResource annotation.
-
Testing: uimaFIT simplifies testing in a number of ways described in the documentation. Because it
makes it easy to instantiate your components without descriptor files, a large amount of
difficult-to-maintain and unnecessary XML can be eliminated from your test code. This makes tests
easier to write and maintain. Also, running components as a pipeline can be accomplished with a
method call like this:
SimplePipeline.runPipeline(reader, ae1, ..., aeN, consumer1, ..., consumerN)
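The factory and injection ideas in the list above can be illustrated without uimaFIT itself. The following is a minimal, self-contained Java sketch of the general mechanism, assuming a hypothetical `Param` annotation, a hypothetical `Tokenizer` component, and a hypothetical `create` helper. It is not uimaFIT's implementation, and none of these names belong to the uimaFIT API. It shows how name/value pairs passed to a factory method can be bound to annotated member variables via reflection.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Constructor;
import java.lang.reflect.Field;
import java.util.HashMap;
import java.util.Map;

public class InjectionSketch {

    /** Hypothetical stand-in for an annotation such as @ConfigurationParameter. */
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface Param {
        String name();
    }

    /** Hypothetical component whose member variables are bound by name. */
    static class Tokenizer {
        @Param(name = "language")
        private String language;

        @Param(name = "minLength")
        private int minLength;

        String describe() {
            return language + ":" + minLength;
        }
    }

    /** Creates an instance and copies (name, value, ...) pairs into annotated fields. */
    static <T> T create(Class<T> type, Object... nameValuePairs) {
        if (nameValuePairs.length % 2 != 0) {
            throw new IllegalArgumentException("Expected name/value pairs");
        }
        Map<String, Object> settings = new HashMap<>();
        for (int i = 0; i < nameValuePairs.length; i += 2) {
            settings.put((String) nameValuePairs[i], nameValuePairs[i + 1]);
        }
        try {
            Constructor<T> constructor = type.getDeclaredConstructor();
            constructor.setAccessible(true);
            T instance = constructor.newInstance();
            for (Field field : type.getDeclaredFields()) {
                Param param = field.getAnnotation(Param.class);
                // Fields without the annotation or without a supplied value keep their defaults.
                if (param != null && settings.containsKey(param.name())) {
                    field.setAccessible(true);
                    field.set(instance, settings.get(param.name()));
                }
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Could not create " + type.getName(), e);
        }
    }

    public static void main(String[] args) {
        Tokenizer tokenizer = create(Tokenizer.class, "language", "en", "minLength", 3);
        System.out.println(tokenizer.describe()); // prints "en:3"
    }
}
```

In uimaFIT, the corresponding roles are played by the factory call and the @ConfigurationParameter annotation described above, so component code never has to read parameter values by hand.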
uimaFIT is a part of the Apache UIMA project. uimaFIT can only be used together with
a compatible version of the Apache UIMA Java SDK. For your convenience, the binary
distribution package of uimaFIT includes all libraries necessary to use uimaFIT. Novice users
in particular are strongly advised to obtain a copy of the full UIMA SDK separately.
|
Modules
|
-
uimafit-core
is the main uimaFIT module.
-
uimafit-cpe
provides support for the Collection Processing Engine (multi-threaded pipelines).
-
uimafit-maven-plugin
is a Maven plugin to automatically enhance UIMA components with uimaFIT
metadata and to generate XML descriptors for uimaFIT-enabled components.
-
uimafit-spring
is an experimental module serving as a proof-of-concept for the integration of
UIMA with the Spring Framework. It uses invasive reflection to patch the UIMA
framework so that all components created by UIMA are passed through Spring,
which wires in Spring context dependencies. This module is made available for
the adventurous, but it is currently not considered stable, finished, or even a
proper part of the package. For example, it is not included in the binary
distribution package.
-
uimafit-legacy-support (only uimaFIT < 3.0.0)
allows uimaFIT 2.x to use uimaFIT 1.4.x metadata such as Java annotations
and META-INF/org.uimafit/types.txt files. Pipelines mixing uimaFIT 1.4.x
and 2.x components MUST be created using the 2.x factories, because the
1.4.x factories will not understand how to handle uimaFIT 2.x components
or auto-configuration.
|
Documentation
|
Here, you can find the documentation for the most recent uimaFIT release compatible with the UIMA Java
SDK v3 and v2 respectively.
Latest uimaFIT v3.x documentation
Latest uimaFIT v2.x documentation
Should you require documentation for a specific version of uimaFIT, please check our archive.
|
Developer Information
|
The latest version of uimaFIT is available via Maven Central.
If you use Maven as your build tool, you can add uimaFIT as a dependency
in your pom.xml file (in addition to other UIMA dependencies). Make sure to change the version
in the dependency declaration to the version you actually want to use:
<dependency>
  <groupId>org.apache.uima</groupId>
  <artifactId>uimafit-core</artifactId>
  <version>3.3.0</version>
</dependency>
To build uimaFIT from the repository, clone the git repository at
https://github.com/apache/uima-uimafit.git and execute
mvn clean install at the root of the cloned repository.
The sources of the current release are available at the download page.
|
Reference
|
If you use uimaFIT to support academic research, then please consider citing the following
paper as appropriate:
@InProceedings{ogren-bethard:2009:SETQA-NLP,
author = {Ogren, Philip and Bethard, Steven},
title = {Building Test Suites for {UIMA} Components},
booktitle = {Proceedings of the Workshop on Software Engineering, Testing,
and Quality Assurance for Natural Language Processing (SETQA-NLP 2009)},
month = {June},
year = {2009},
address = {Boulder, Colorado},
publisher = {Association for Computational Linguistics},
pages = {1--4},
url = {https://www.aclweb.org/anthology/W/W09/W09-1501}
}
|
History
|
Since the end of 2012, uimaFIT has been part of the Apache UIMA project.
Apache uimaFIT was formerly known as uimaFIT, which in turn was formerly known as UUTUC.
Before uimaFIT became a sub-project within the Apache UIMA project, it was a collaborative
effort between the Center for Computational Pharmacology at the University of Colorado Denver, the
Center for Computational Language and Education Research at the University of Colorado at Boulder,
and the Ubiquitous Knowledge Processing (UKP) Lab at the Technische Universitaet Darmstadt.
The initial uimaFIT development team was:
- Philip Ogren, University of Colorado, USA
- Richard Eckart de Castilho, Technische Universitaet Darmstadt, Germany
- Steven Bethard, Stanford University, USA
with contributions from Niklas Jakob, Fabio Mancinelli, Chris Roeder, Philipp Wetzler, Shuo Yang,
and Torsten Zesch.