NOTE: This information is from an email to the uima-dev list by one of the DUCC committers.
In this email, they reference "glusterfs", a shared file system cross-mounted
on various machines. This "shared file system" approach is the conventional way DUCC is
configured, but DUCC can also be configured without a shared file system; see the DUCC docs.
We've been running DUCC in a cloud environment for almost a year now. The
DUCC master and a
servers run on bare metal and all of the
workstations and worker machines run on VMs. Cluster users add VMs to the
cluster as needed. A job can be started on one or more workers, and as
additional VMs are added dynamically, the job automatically scales out
to use them. A common system image is maintained on all VMs via an
LDAP server and shared filesystem data. Users belong to groups and share
machines allocated by members of the group.
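
The scale-out behavior described above can be pictured with a small
simulation. This is only an illustrative sketch, not DUCC code: it assumes
a pull-style model in which every active worker takes the next pending work
item, and the names run_job, vm-a, vm-b and vm-c are hypothetical.

```python
from collections import deque


def run_job(work_items, initial_workers, joiners):
    """Simulate a job whose workers pull items from a shared queue.

    joiners maps a round number to the names of workers that join at the
    start of that round (hypothetical stand-in for VMs added to the cluster).
    Returns a dict mapping each worker name to the items it processed.
    """
    queue = deque(work_items)
    workers = list(initial_workers)
    done = {name: [] for name in workers}
    round_no = 0
    while queue:
        # Newly added workers become eligible before this round starts.
        for name in joiners.get(round_no, []):
            workers.append(name)
            done[name] = []
        # Each active worker pulls at most one item per round.
        for name in workers:
            if not queue:
                break
            done[name].append(queue.popleft())
        round_no += 1
    return done


# One worker starts the job; two more VMs join at round 2.
result = run_job(range(10), ["vm-a"], {2: ["vm-b", "vm-c"]})
```

Because workers pull work rather than having it assigned up front, the late
joiners start taking items as soon as they appear, with no change to the job.
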
A DUCC VM-image is used to automatically connect new VMs to the DUCC master
and glusterfs. The DUCC master configuration may be updated at any time, for
example to add new groups or even to update the master software. VMs automatically
sync DUCC software and configuration each time they start their DUCC agent.
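
As a rough sketch of what a sync-at-agent-start step could look like, the
helper below builds an rsync command line. The email does not say which
tool or paths are actually used; rsync, the function name build_sync_command
and all paths shown are assumptions for illustration only.

```python
def build_sync_command(source, local_dir):
    """Build an rsync command that mirrors `source` into `local_dir`.

    Assumption: rsync is the sync tool. The source does not confirm this.
    -a preserves permissions and timestamps; --delete removes local files
    that no longer exist at the source, so the copy tracks the master.
    """
    # A trailing slash on the source copies its contents, not the directory.
    src = source.rstrip("/") + "/"
    dst = local_dir.rstrip("/") + "/"
    return ["rsync", "-a", "--delete", src, dst]


# Hypothetical paths: a DUCC tree on the shared file system and a local copy.
sync_cmd = build_sync_command("/shared/ducc_runtime", "/opt/ducc_runtime/")
```

A startup script could pass such a list to subprocess.run(sync_cmd,
check=True) before launching the agent, so a failed sync stops the start.
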
The VM image supports three different machine types: a graphical
workstation, a CPU worker, and a GPU worker (typically used for deep learning
training). DUCC can spawn work on specified worker machine types and even on
specific machines. Workstations are optional, as DUCC requests can be submitted
from worker machines. Docker images are supported using Podman. Podman runs
rootless and allows access to mounted file systems only with the user's
credentials.
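
To make the rootless Podman point concrete, here is a minimal sketch of how
a container invocation might be assembled. The image name, the paths and the
helper build_podman_command are hypothetical; the email does not show the
actual command line used on the cluster.

```python
def build_podman_command(image, mounts, command):
    """Build a rootless `podman run` command with bind-mounted directories.

    mounts is a list of (host_path, container_path) pairs. When run by an
    ordinary user, Podman accesses the bind mounts with that user's own
    privileges; --userns=keep-id keeps the same UID inside the container.
    """
    args = ["podman", "run", "--rm", "--userns=keep-id"]
    for host_path, container_path in mounts:
        args += ["--volume", f"{host_path}:{container_path}"]
    args.append(image)
    args += list(command)
    return args


# Hypothetical image and group directory, for illustration only.
podman_cmd = build_podman_command(
    "docker.io/library/python:3.11",
    [("/groups/nlp", "/data")],
    ["python", "-c", "print('hello')"],
)
```

Since no root daemon is involved, files under the mounted directory remain
subject to the usual permission checks for the user who started the container.
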
In order to keep some level of data security, a group directory is only
mounted on the VMs created by members of the group. Individual users
maintain file permissions as desired, but because anyone who creates a VM has
root access, they could become any other user and access data from other
group members. There is a self-service glusterfs webapp that is used to
export group data to new VMs and manage quotas.
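
The mounting policy above can be sketched as a small function that decides
which group directories a new VM receives. This is an assumption-laden
illustration, not the webapp's code: it supposes one glusterfs volume per
group, and the server name, mount root and group names are hypothetical.

```python
def group_mounts_for_vm(creator, memberships, gluster_server, mount_root="/groups"):
    """Return mount commands for the groups the VM's creator belongs to.

    memberships maps a user name to the set of groups they belong to.
    Assumption: each group has its own glusterfs volume named after it.
    Directories of other groups are never mounted on this VM.
    """
    commands = []
    for group in sorted(memberships.get(creator, ())):
        commands.append([
            "mount", "-t", "glusterfs",
            f"{gluster_server}:/{group}",
            f"{mount_root}/{group}",
        ])
    return commands


# Hypothetical users and groups.
memberships = {"alice": {"vision", "nlp"}, "bob": {"speech"}}
alice_mounts = group_mounts_for_vm("alice", memberships, "gluster1")
```

Here a VM created by alice would get the nlp and vision directories but not
speech, so root access on that VM does not expose the speech group's data.
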
The VM-image builder and glusterfs webapp are not yet part of Apache DUCC.