Package org.apache.uima.cas.impl
Class CasSerializerSupport
java.lang.Object
org.apache.uima.cas.impl.CasSerializerSupport
CAS serializer support for XMI and JSON formats.
There are multiple use cases.
1) normal - the consumer is independent of UIMA
- (maybe) support for delta serialization
2) service calls:
- support deserialization with out-of-type-system set-aside, and subsequent serialization with re-merging
- guarantee of using same xmi:id's as were deserialized when serializing
- support for delta serialization
There is an outer class (one instance per "configuration", reusable after configuration) and
an inner class (one instance per serialize call).
These classes are the common parts of serialization between XMI and JSON, mainly having to do with
1) enqueueing the FS to be serialized
2) serializing according to their types and features
Methods marked public are not part of the public API; they are public only so that
classes in other packages that use this class can "see" them.
XmiCasSerializer                                        JsonCasSerializer
    Instance                                                Instance
    css ref  ------->   CasSerializerSupport   <-------  css ref

XmiDocSerializer                                        JsonDocSerializer
    Instance                                                Instance
(1 per serialize action)                            (1 per serialize action)
    cds ref  ------->     CasDocSerializer     <-------  cds ref
                          csss points back
Construction:
  new Xmi/JsonCasSerializer
    initializes css with new CasSerializerSupport
  serialize method creates a new Xmi/JsonDocSerializer inner class
    constructor creates a new CasDocSerializer
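The construction pattern above can be sketched as follows. This is a minimal illustration using hypothetical stand-in classes (SerializerSupportSketch and DocSerializerSketch), not the real UIMA classes: the outer object holds only reusable configuration, and each serialize call creates a fresh inner object for its working state.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for CasSerializerSupport: holds reusable configuration only.
final class SerializerSupportSketch {
    private boolean prettyPrint = false;

    SerializerSupportSketch setPrettyPrint(boolean pp) {
        this.prettyPrint = pp;
        return this;
    }

    boolean isPrettyPrint() {
        return prettyPrint;
    }

    // Hypothetical stand-in for CasDocSerializer: holds the state of ONE serialize call.
    final class DocSerializerSketch {
        final List<String> queue = new ArrayList<>();

        void enqueue(String fs) {
            queue.add(fs);
        }

        String write() {
            // The inner instance reads configuration from its enclosing outer instance.
            String separator = isPrettyPrint() ? ",\n" : ",";
            return String.join(separator, queue);
        }
    }

    // Each call gets its own inner instance, so calls never share a queue.
    String serialize(List<String> items) {
        DocSerializerSketch doc = new DocSerializerSketch();
        for (String item : items) {
            doc.enqueue(item);
        }
        return doc.write();
    }
}

final class ConstructionDemo {
    public static void main(String[] args) {
        SerializerSupportSketch css = new SerializerSupportSketch().setPrettyPrint(false);
        System.out.println(css.serialize(Arrays.asList("a", "b")));
        System.out.println(css.serialize(Arrays.asList("c")));
    }
}
```

Keeping per-call state out of the outer object is what makes the configured instance safe to reuse across serialize calls.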
Use Cases and Algorithms
  Support set-aside for out-of-type-system (oots) FSs on deserialization (recorded in shareData).
    This implies the sharing status of things referenced by features can't be determined; the serializer must depend on the
      multiple-refs-allowed flag.
      If multiple refs are found during serialization for a feature marked non-shared, unshare them (for example, make
        2 serializations, one or more in place).
        This is perhaps not considered an error.
    This implies a need (in the non-delta case) to send all FSs that were deserialized, since some may be referenced by oots elements.
      This **could** be skipped if there are no oots elements, but that could break some assumptions,
        and it would apply only to the non-delta case, so it is not worth doing.
Enqueuing:
There are two styles
- enqueueCommon: does **NOT** recursively enqueue features
- enqueue: calls enqueueCommon and then recursively enqueues features
enqueueCommon is called (bypassing enqueue) to defer scanning references
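The difference between the two styles can be sketched as follows. This is an illustration only: FsNode, EnqueueSketch, and the enqueueFeatures helper are hypothetical simplifications, not the actual CasDocSerializer code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical minimal feature structure: a name plus references to other nodes.
final class FsNode {
    final String name;
    final List<FsNode> refs = new ArrayList<>();

    FsNode(String name) {
        this.name = name;
    }
}

final class EnqueueSketch {
    final List<FsNode> queue = new ArrayList<>();
    // Identity-based set, so each object is enqueued at most once.
    private final Set<FsNode> visited = Collections.newSetFromMap(new IdentityHashMap<>());

    // Adds fs to the queue once; does NOT look at its references.
    boolean enqueueCommon(FsNode fs) {
        if (fs == null || !visited.add(fs)) {
            return false;
        }
        queue.add(fs);
        return true;
    }

    // Adds fs, then everything reachable through its references.
    void enqueue(FsNode fs) {
        if (enqueueCommon(fs)) {
            enqueueFeatures(fs);
        }
    }

    // The deferred step: scan the references of an already-enqueued fs.
    void enqueueFeatures(FsNode fs) {
        for (FsNode ref : fs.refs) {
            enqueue(ref);
        }
    }
}

final class EnqueueDemo {
    public static void main(String[] args) {
        FsNode a = new FsNode("a");
        FsNode b = new FsNode("b");
        a.refs.add(b);

        EnqueueSketch deferred = new EnqueueSketch();
        deferred.enqueueCommon(a);                  // only a is queued
        System.out.println(deferred.queue.size());
        deferred.enqueueFeatures(a);                // the reference scan happens later
        System.out.println(deferred.queue.size());
    }
}
```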
Order and target of enqueuing:
- things in the index
-- put on "queue"
-- first, the Sofas (which are the only things indexed in the base view)
-- next, for each view, for each item, the FSs, but **NOT** following any feature/array refs
- things not in the index, but deserialized (incoming)
-- put on previouslySerializedFSs, no recursive descent for features
- (delta) enqueueNonsharedMultivaluedFS (lists and arrays)
-- put on modifiedEmbeddedValueFSs, no recursive descent for features
- recursive descent for
-- things in previouslySerializedFSs,
-- things in modifiedEmbeddedValueFSs
-- things in the index
The descent is implemented recursively, so an arbitrarily long chain of references can cause a stack overflow error.
TODO Probably should fix this someday. See https://issues.apache.org/jira/browse/UIMA-106
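One common way to avoid this kind of stack overflow is to replace the recursion with an explicit work stack kept on the heap. The sketch below shows that general technique on a hypothetical ChainNode type; it is not the fix proposed in the issue above, and it does not reproduce the serializer's actual traversal order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

// Hypothetical node with a single reference, enough to build a very long chain.
final class ChainNode {
    final int id;
    ChainNode next;

    ChainNode(int id) {
        this.id = id;
    }
}

final class IterativeDescent {
    // Collects every node reachable from root without using the call stack.
    static List<ChainNode> reachable(ChainNode root) {
        List<ChainNode> out = new ArrayList<>();
        Set<ChainNode> seen = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<ChainNode> pending = new ArrayDeque<>();
        if (root != null) {
            pending.push(root);
        }
        while (!pending.isEmpty()) {
            ChainNode node = pending.pop();
            if (!seen.add(node)) {
                continue; // already visited; this also guards against cycles
            }
            out.add(node);
            if (node.next != null) {
                pending.push(node.next);
            }
        }
        return out;
    }

    // Builds a chain of the given length iteratively.
    static ChainNode chain(int length) {
        ChainNode head = new ChainNode(0);
        ChainNode current = head;
        for (int i = 1; i < length; i++) {
            current.next = new ChainNode(i);
            current = current.next;
        }
        return head;
    }

    public static void main(String[] args) {
        // A chain this long would risk a StackOverflowError with naive recursion.
        System.out.println(reachable(chain(200_000)).size());
    }
}
```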
-
Nested Class Summary
Modifier and Type | Class | Description
class | CasDocSerializer | Use an inner class to hold the data for serializing a CAS.
static class | |
-
Field Summary
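Modifier and Type | Field | Description
static final Comparator<TypeImpl> | COMPARATOR_SHORT_TYPENAME | Comparator that just uses the short name. Public for access by JsonCasSerializer, where it is needed for a binary search. See https://issues.apache.org/jira/browse/UIMA-5171
static AtomicInteger | errorCount |
boolean | isFormattedOutput |
static int | PP_ELEMENTS |
static int | PP_LINE_LENGTH |
static final int | TYPE_CLASS_FLOATLIST |
static final int | TYPE_CLASS_FSLIST |
static final int | TYPE_CLASS_INTLIST |
static final int | TYPE_CLASS_STRINGLIST |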
-
Constructor Summary
-
Method Summary
Modifier and Type | Method | Description
static final int | classifyType(TypeImpl ti) | Classifies a type.
 | getFilterTypes |
 | setErrorHandler | Set an error handler to receive information about errors.
 | setFilterTypes | Pass in a type system to use for filtering what gets serialized; only those types and features which are defined in this type system are included.
 | setPrettyPrint(boolean pp) | Set or reset the pretty print flag (default is false).
-
Field Details
-
TYPE_CLASS_INTLIST
public static final int TYPE_CLASS_INTLIST
-
TYPE_CLASS_FLOATLIST
public static final int TYPE_CLASS_FLOATLIST
-
TYPE_CLASS_STRINGLIST
public static final int TYPE_CLASS_STRINGLIST
-
TYPE_CLASS_FSLIST
public static final int TYPE_CLASS_FSLIST
-
PP_LINE_LENGTH
public static int PP_LINE_LENGTH
PP_ELEMENTS
public static int PP_ELEMENTS
errorCount
-
COMPARATOR_SHORT_TYPENAME
Comparator that just uses the short name. Public for access by JsonCasSerializer, where it is needed for a binary search. See https://issues.apache.org/jira/browse/UIMA-5171
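As an illustration of why a short-name comparator is useful for a binary search, here is a minimal sketch. It works on plain type-name strings rather than TypeImpl objects, and ShortNameSearch, shortName, BY_SHORT_NAME, and find are hypothetical names introduced for this example. The array must be sorted with the same comparator that the search uses.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ShortNameSearch {
    // Hypothetical helper: the part of a dotted type name after the last '.'.
    static String shortName(String fullName) {
        return fullName.substring(fullName.lastIndexOf('.') + 1);
    }

    // Orders type names by short name only, ignoring the namespace.
    static final Comparator<String> BY_SHORT_NAME =
            Comparator.comparing(ShortNameSearch::shortName);

    // Returns the index of a type with this short name, or a negative value if absent.
    static int find(String[] typesSortedByShortName, String wantedShortName) {
        return Arrays.binarySearch(typesSortedByShortName, wantedShortName, BY_SHORT_NAME);
    }

    public static void main(String[] args) {
        String[] types = {"uima.tcas.Annotation", "uima.cas.TOP", "uima.cas.Sofa"};
        Arrays.sort(types, BY_SHORT_NAME); // precondition for binarySearch
        int index = find(types, "Sofa");
        System.out.println(index >= 0 ? types[index] : "not found");
    }
}
```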
isFormattedOutput
public boolean isFormattedOutput
-
-
Constructor Details
-
CasSerializerSupport
public CasSerializerSupport()
-
-
Method Details
-
setPrettyPrint
Set or reset the pretty print flag (default is false).
Parameters:
pp - true to do pretty printing of output
Returns:
the original instance, possibly updated
-
setFilterTypes
Pass in a type system to use for filtering what gets serialized; only those types and features which are defined in this type system are included.
Parameters:
ts - the filter
Returns:
the original instance, possibly updated
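The filtering rule can be illustrated with a small sketch. It assumes a hypothetical stand-in for the filter type system (a map from type name to the feature names that type defines) and is not the UIMA implementation: a feature structure whose type is missing from the filter is dropped entirely, and otherwise only the features defined in the filter are kept.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

final class FilterTypesSketch {
    // Hypothetical filter: type name -> names of the features that type defines.
    private final Map<String, Set<String>> filter;

    FilterTypesSketch(Map<String, Set<String>> filter) {
        this.filter = filter;
    }

    // Returns the features to serialize, or empty if the whole type is filtered out.
    Optional<Map<String, String>> apply(String typeName, Map<String, String> features) {
        Set<String> allowed = filter.get(typeName);
        if (allowed == null) {
            return Optional.empty();
        }
        Map<String, String> kept = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : features.entrySet()) {
            if (allowed.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return Optional.of(kept);
    }
}

final class FilterTypesDemo {
    public static void main(String[] args) {
        Map<String, Set<String>> filter = new HashMap<>();
        filter.put("Token", new HashSet<>(Arrays.asList("begin", "end")));
        FilterTypesSketch sketch = new FilterTypesSketch(filter);

        Map<String, String> token = new LinkedHashMap<>();
        token.put("begin", "0");
        token.put("end", "5");
        token.put("pos", "NN");

        System.out.println(sketch.apply("Token", token));    // "pos" is not in the filter
        System.out.println(sketch.apply("Sentence", token)); // type is not in the filter
    }
}
```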
-
getFilterTypes
-
setErrorHandler
Set an error handler to receive information about errors.
Parameters:
eh - the error handler
Returns:
the original instance, possibly updated
-
classifyType
Classifies a type. This returns an integer code identifying the type as one of the primitive types, one of the array types, one of the list types, or a generic FS type (anything else).
The LowLevelCAS.ll_getTypeClass(int) method classifies primitives and array types, but does not have a special classification for list types, which we need for XMI serialization. Therefore, in addition to the type codes defined on LowLevelCAS, this method can return one of the type codes TYPE_CLASS_INTLIST, TYPE_CLASS_FLOATLIST, TYPE_CLASS_STRINGLIST, or TYPE_CLASS_FSLIST.
Parameters:
ti - the type to classify
Returns:
one of the TYPE_CLASS codes defined on LowLevelCAS or on this class.
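The list-type part of this classification can be sketched as follows. This is a simplified illustration, not the actual method: it works on type-name strings with a hypothetical supertype map, the numeric codes are arbitrary placeholders rather than the real constant values, and primitives and arrays (which the real method delegates to the LowLevelCAS codes) are omitted.

```java
import java.util.HashMap;
import java.util.Map;

final class ClassifySketch {
    // Placeholder codes for illustration; not the values of the real constants.
    static final int CODE_FS = 0;
    static final int CODE_INTLIST = 1;
    static final int CODE_FLOATLIST = 2;
    static final int CODE_STRINGLIST = 3;
    static final int CODE_FSLIST = 4;

    // Walks up the supertype chain until a built-in list base type is found.
    static int classify(String typeName, Map<String, String> parentOf) {
        for (String t = typeName; t != null; t = parentOf.get(t)) {
            switch (t) {
                case "uima.cas.IntegerList": return CODE_INTLIST;
                case "uima.cas.FloatList":   return CODE_FLOATLIST;
                case "uima.cas.StringList":  return CODE_STRINGLIST;
                case "uima.cas.FSList":      return CODE_FSLIST;
                default: break; // keep climbing
            }
        }
        return CODE_FS; // anything else is treated as a generic FS type
    }

    public static void main(String[] args) {
        // Hypothetical supertype map covering only the types used here.
        Map<String, String> parentOf = new HashMap<>();
        parentOf.put("uima.cas.NonEmptyIntegerList", "uima.cas.IntegerList");
        parentOf.put("uima.cas.IntegerList", "uima.cas.TOP");
        parentOf.put("my.Token", "uima.cas.TOP");

        System.out.println(classify("uima.cas.NonEmptyIntegerList", parentOf) == CODE_INTLIST);
        System.out.println(classify("my.Token", parentOf) == CODE_FS);
    }
}
```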
-