The tutorial is split into two theoretical parts, followed by a hands-on session.
The first part gives an introduction to UIMA-based natural language processing
using the DKPro Core component collection. We show how a processing pipeline
can be composed from existing components, such as a tokenizer, a part-of-speech tagger, and a parser.
To illustrate how such a pipeline can be applied to one's own data and how analysis results can
be exported to other tools, we also show how to implement simple reader and writer components.
This includes a basic introduction to the underlying tools and concepts, such as
Maven and uimaFIT.
The second part gives an introduction to Apache UIMA Ruta. We will cover the syntax
and semantics of the rule language as well as tooling support for developing rule-based
information extraction applications.
In the hands-on session, we will implement and run a simple pipeline with DKPro Core.
We will then extend the pipeline with rule-based post-processing to address
tasks such as information extraction. This includes ways to combine the rules
with the information annotated by the DKPro Core components, as well as techniques for
efficient and effective engineering of the rules themselves.
Resources:
Organizers: Peter Kluegl, Dr. Katrin Tomanek, Richard Eckart de Castilho