1. What is UIMA-DUCC?
2. Major Changes in this Release
3. Limitations in this Release
DUCC stands for Distributed UIMA Cluster Computing. DUCC is a cluster management system providing tooling, management, and scheduling facilities to automate the scale-out of applications written to the UIMA framework. Core UIMA provides a generalized framework for applications that process unstructured information such as human language, but does not provide a scale-out mechanism. UIMA-AS provides a scale-out mechanism to distribute UIMA pipelines over a cluster of computing resources, but does not provide job or cluster management of the resources. DUCC defines a formal job model that closely maps to a standard UIMA pipeline. Around this job model DUCC provides cluster management services to automate the scale-out of UIMA pipelines over computing clusters.
Apache UIMA DUCC 1.1.0 is a maintenance release containing bug fixes and a few
new features. What's new:
All registered groups are set for processes. Users may set DUCC_UMASK to establish the umask for a process.
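To illustrate what a umask value controls, the sketch below applies standard POSIX umask arithmetic to a requested permission mode. It does not show how DUCC itself reads or applies DUCC_UMASK. The helper names `parse_umask` and `effective_mode` are hypothetical, and the assumption that the value is written as an octal string such as "027" is ours, not something stated in these notes.

```python
def parse_umask(value: str) -> int:
    """Parse an octal umask string such as "027" (assumed format)."""
    mask = int(value, 8)
    if not 0 <= mask <= 0o777:
        raise ValueError(f"umask out of range: {value!r}")
    return mask


def effective_mode(requested_mode: int, umask: int) -> int:
    """Return the permission bits that remain after the umask clears its bits."""
    return requested_mode & ~umask & 0o777


if __name__ == "__main__":
    mask = parse_umask("027")  # example value only
    # New files are typically requested with 0o666, directories with 0o777.
    print(oct(effective_mode(0o666, mask)))
    print(oct(effective_mode(0o777, mask)))
```

Under these assumptions, a umask of 027 leaves new files readable by the group and inaccessible to others, which is the kind of policy a per-process umask setting makes possible.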
Administrative CLI interface
- Vary-off a node to temporarily exclude it from scheduling
- Vary-on a node to return it to the scheduling pool
- Query occupancy - for each node, shows what is scheduled there
- Query load - summary of scheduling tables to allow external entities such as LSF to collaborate with DUCC scheduler

Misc enhancements
- Better handling of failed nodes, purges all work other than reservations
- Improved de-fragmentation logic
- Improved handling of small clusters
- Improved eviction, takes into account the amount of work that would be lost before scheduling a process for eviction
Added Node visualization

For a complete list of issues fixed and up-to-date information on UIMA-DUCC issues, see our issue tracker: https://issues.apache.org/jira/issues/?jql=project%20%3D%20UIMA%20AND%20fixVersion%20%3D%20%221.1.0-Ducc%22%20ORDER%20BY%20key%20ASC