Class BinaryCasSerDes4

java.lang.Object
org.apache.uima.cas.impl.BinaryCasSerDes4
All Implemented Interfaces:
SlotKindsConstants

public class BinaryCasSerDes4 extends Object implements SlotKindsConstants
User-callable serialization and deserialization of the CAS in a compressed binary format.

This serializes/deserializes the state of the CAS, assuming that the type information remains constant.

The header specifies the format and the compression level to the reader.

How to Serialize:

1) Create an instance of this class, specifying some options that don't change very much.
2) Call serialize(CAS) to serialize the CAS.
   * You can reuse the instance for a different CAS (as long as the type system is the same); this saves setup time.

This class lazily constructs customized TypeInfo instances for each type encountered in serializing. These are preserved across multiple serialization calls, so their setup / initialization is only needed the first time.

The form of the binary CAS is inserted at the beginning so that receivers can do the proper deserialization.

The binary format requires that the exact same type system be used when deserializing.

How to Deserialize:

1) Get an appropriate CAS to deserialize into. For delta CAS, it does not have to be empty.
2) Call CASImpl: cas.reinit(inputStream). This is the existing method for binary deserialization, and it now handles this compressed version, too. Delta CAS is also supported.

Compression/Decompression

Works in two stages:
   application of Zip/Unzip to particular sub-collections of CAS data, grouped according to similar data distribution
   collection of like kinds of data (to make the zipping more effective)
There can be up to ~20 of these collections, such as control info, float-exponents, and string chars.

Deserialization:
   Read all bytes.
   Create separate ByteArrayInputStreams for each segment, sharing the byte buffer.
   Create appropriate unzip data input streams for these.

Properties of Form 4:

1) (Change from V2) Indexes are used to determine what gets serialized, because there's no "heap" to walk, unless the v2-id-mode is in effect.
2) The number used for references to FSs is a sequentially incrementing one, starting at 1. This allows better compression.
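
The segment-wise compression and deserialization steps described above can be illustrated with a small JDK-only sketch. It is not the UIMA implementation and does not reproduce the Form 4 wire format: the class SegmentedZipSketch, its methods, and the header layout (segment count, then compressed lengths, then compressed bytes) are assumptions made up for this example. It shows only the general technique: each collection of like data is deflated on its own, and on reading, one ByteArrayInputStream per segment is opened over the same byte array and wrapped in an unzip data input stream.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.zip.Deflater;
import java.util.zip.DeflaterOutputStream;
import java.util.zip.InflaterInputStream;

/** Hypothetical illustration only; not the UIMA code or its binary layout. */
final class SegmentedZipSketch {

  /** Deflates each segment separately, then writes: count, compressed lengths, compressed bytes. */
  static byte[] pack(byte[][] segments) {
    try {
      byte[][] zipped = new byte[segments.length][];
      for (int i = 0; i < segments.length; i++) {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        Deflater deflater = new Deflater(Deflater.BEST_SPEED);
        try (DeflaterOutputStream zip = new DeflaterOutputStream(buf, deflater)) {
          zip.write(segments[i]);
        } finally {
          deflater.end(); // release native resources of the caller-supplied Deflater
        }
        zipped[i] = buf.toByteArray();
      }
      ByteArrayOutputStream out = new ByteArrayOutputStream();
      DataOutputStream data = new DataOutputStream(out);
      data.writeInt(zipped.length);
      for (byte[] z : zipped) data.writeInt(z.length);
      for (byte[] z : zipped) data.write(z);
      data.flush();
      return out.toByteArray();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Opens one unzip data input stream per segment; all of them share the array {@code all}. */
  static DataInputStream[] open(byte[] all) {
    try {
      DataInputStream header = new DataInputStream(new ByteArrayInputStream(all));
      int count = header.readInt();
      int[] lengths = new int[count];
      for (int i = 0; i < count; i++) lengths[i] = header.readInt();

      int offset = 4 + 4 * count; // first byte after the header
      DataInputStream[] streams = new DataInputStream[count];
      for (int i = 0; i < count; i++) {
        // ByteArrayInputStream(buf, off, len) reads in place; no per-segment copy is made.
        ByteArrayInputStream slice = new ByteArrayInputStream(all, offset, lengths[i]);
        streams[i] = new DataInputStream(new InflaterInputStream(slice));
        offset += lengths[i];
      }
      return streams;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  /** Reads exactly {@code length} uncompressed bytes from one segment stream. */
  static byte[] readSegment(DataInputStream in, int length) {
    try {
      byte[] result = new byte[length];
      in.readFully(result);
      return result;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // Two made-up segments standing in for, e.g., control info and string chars.
    byte[] control = {1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2};
    byte[] chars = "aaaa bbbb aaaa bbbb".getBytes(StandardCharsets.UTF_8);

    DataInputStream[] in = open(pack(new byte[][] {control, chars}));
    boolean roundTrip = Arrays.equals(readSegment(in[0], control.length), control)
        && Arrays.equals(readSegment(in[1], chars.length), chars);
    System.out.println("round trip ok: " + roundTrip);
  }
}
```

Keeping each kind of data in its own stream means the deflater sees values with a similar distribution, which is the stated reason for grouping. The sketch stores the uncompressed lengths outside the stream for brevity; a real format would have to carry that information itself.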