Package org.apache.uima.cas.impl
Class CasCompare
java.lang.Object
org.apache.uima.cas.impl.CasCompare
Used by tests for Binary Compressed de/serialization code.
Used by test app: XmiCompare.
Compare 2 CASes, with perhaps different type systems.
If the type systems are different, a type mapper is constructed and used
to selectively ignore types or features that are not in the other type system.
The mapper maps from CAS1 -> CAS2.
When computing the things to compare from CAS1, feature structures that are
not reachable via indexes or refs are filtered out.
The index definitions are not compared.
The indexes are used to locate the FSs to be compared.
Reports are produced to System.out and System.err as a side effect
System.out: status messages, type system comparison
System.err: mismatch comparison information
Usage:
Use the static compareCASes method for default comparisons
Use the multi-step approach for more complex comparisons:
- Make an instance of this class, passing in the two CASes.
- Set any additional configuration
-- cc.compareAll(true) - continue comparing if mismatch found
-- cc.compareIds(true) - compare ids (require ids to be ==)
- Do any transformations needed on the CASes to account for known but allowed differences:
-- These are transformations done on the CAS Feature Structures outside of this routine
-- example: for certain type:feature string values, normalize to the same canonical value
-- example: for certain type:feature string arrays, where the order is not important, sort them
-- example: for certain type:feature FSArrays, where the order is not important, sort them
--- using the sortFSArray method
- Do any configuration to specify congruence sets for String values
-- example: addStringCongruenceSet( type, feature, set-of-strings, -1 or int index if array)
-- these are specific to type / feature specs
-- range can be string or string array - if string array, the spec includes the index or -1
to indicate all indexes
How it works
Prepare arrays of all the FSs in the CAS
- for each of 2 CASes to be compared
- 2 arrays:
-- all FSs in any index in any view
-- the above, plus all FSs reachable via references
-- but omit some types: only of interest when reached via ref,
e.g. String/Int/Float/Boolean arrays
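The second array described above (indexed FSs plus everything reachable via references) amounts to a graph traversal from the indexed roots. The following is a minimal stand-alone sketch of that idea using only the JDK. The `Node` class and the method name `indexedPlusReachable` are hypothetical stand-ins for illustration, not part of the UIMA API, and the omission of certain types is not modeled.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

/** Stand-alone model of "all indexed FSs, plus all FSs reachable via references". */
class ReachabilitySketch {

  /** Hypothetical stand-in for a feature structure: a type name plus outgoing references. */
  static final class Node {
    final String type;
    final List<Node> refs = new ArrayList<>();
    Node(String type) { this.type = type; }
  }

  /** Returns the roots plus every node reachable from them, each node once, in discovery order. */
  static List<Node> indexedPlusReachable(List<Node> indexedRoots) {
    // Identity-based "seen" set, so that reference cycles terminate.
    Set<Node> seen = Collections.newSetFromMap(new IdentityHashMap<>());
    List<Node> result = new ArrayList<>();
    Deque<Node> work = new ArrayDeque<>();
    for (Node root : indexedRoots) {
      if (seen.add(root)) { work.add(root); }
    }
    while (!work.isEmpty()) {
      Node n = work.remove();
      result.add(n);
      for (Node ref : n.refs) {
        if (ref != null && seen.add(ref)) { work.add(ref); }
      }
    }
    return result;
  }

  public static void main(String[] args) {
    Node a = new Node("Token");
    Node b = new Node("Lemma");
    Node orphan = new Node("Orphan"); // neither indexed nor referenced
    a.refs.add(b);
    b.refs.add(a); // reference cycle
    List<Node> found = indexedPlusReachable(List.of(a));
    System.out.println(found.size());           // prints 2
    System.out.println(found.contains(orphan)); // prints false
  }
}
```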
The comparison of FSs is done, one FS at a time.
- in order to determine the right FSs to compare with each other, the FSs for each CAS
are sorted.
The sort and the CAS compare both use a Compare method.
- sorting skips items not in the other type system, including features
- (only possible if comparing two CASes with different type systems, of course)
Compare
- used for two purposes:
a) sorting FSs belonging to one CAS
- can be used by caller to pre-sort any array values where the
compare should be for set equality (in other words, ignore the order)
b) comparing a FS in one CAS with a FS in the other CAS
sort keys, in order:
1) type
2) if primitive array: sort based on
- size
- iterating thru all array items
3) All the features, considered in an order where non-refs are sorted before refs.
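The sort-key order listed above can be pictured as a single comparator that serves both for sorting and for equality checks (a result of 0). This is a simplified stand-alone sketch under stated assumptions: `Item` is a hypothetical stand-in for a feature structure, only `int` values are modeled, and reference-valued features are left out entirely.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Stand-alone model of the sort keys: type, then primitive array contents, then features. */
class SortKeySketch {

  /** Hypothetical stand-in for a FS. */
  static final class Item {
    final String type;
    final int[] array;    // null unless the item is a primitive array
    final int[] features; // non-reference feature values, in a fixed feature order
    Item(String type, int[] array, int... features) {
      this.type = type;
      this.array = array;
      this.features = features;
    }
  }

  /** Compares by size first, then item by item. */
  static int compareInts(int[] a, int[] b) {
    int c = Integer.compare(a.length, b.length);
    for (int i = 0; c == 0 && i < a.length; i++) {
      c = Integer.compare(a[i], b[i]);
    }
    return c;
  }

  static final Comparator<Item> ORDER = (x, y) -> {
    int c = x.type.compareTo(y.type);            // 1) type
    if (c != 0) return c;
    if (x.array != null && y.array != null) {    // 2) primitive array: size, then items
      c = compareInts(x.array, y.array);
      if (c != 0) return c;
    }
    return compareInts(x.features, y.features);  // 3) feature values (refs not modeled)
  };

  public static void main(String[] args) {
    List<Item> items = new ArrayList<>();
    items.add(new Item("B", null, 1));
    items.add(new Item("A", null, 2));
    items.add(new Item("A", null, 1));
    items.sort(ORDER);
    StringBuilder sb = new StringBuilder();
    for (Item it : items) {
      sb.append(it.type).append(it.features[0]).append(' ');
    }
    System.out.println(sb.toString().trim()); // prints "A1 A2 B1"
  }
}
```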
comparing values:
primitives - value comparison
refs - compare the ref'd FS, while recording reference paths
- stop when a compare point is reached where the pair being compared has already been seen
- stop at any point if the two FSs compare unequal
- at the stop point, if compare is equal, check the reference paths, and
report unequal reference paths (different cycle lengths, or different total lengths,
see the Prev data structure)
Context information, reused across compares:
prevCompare - if a particular pair of FSs compared equal
-- used to speed up comparison
-- used to stop recursive loops of references
prev1, prev2 - reset for each top level FS compare
(not reset for inner FS compares of fs-reference values);
hold information about the reference path for chains of FS references
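To illustrate why reference paths are recorded in addition to value equality, here is a deliberately simplified stand-alone sketch. It assumes each node has one value and at most one reference, and it records the depth at which each node was first seen on each side. It is not the class's actual Prev data structure; `Node` and `sameChain` are hypothetical names.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Stand-alone model of comparing reference chains while detecting differing cycle shapes. */
class RefPathCompareSketch {

  /** Hypothetical stand-in for a FS with one primitive value and one reference. */
  static final class Node {
    final int value;
    Node next;
    Node(int value) { this.value = value; }
  }

  static boolean sameChain(Node a, Node b) {
    return sameChain(a, b, 0, new IdentityHashMap<>(), new IdentityHashMap<>());
  }

  private static boolean sameChain(Node a, Node b, int depth,
      Map<Node, Integer> path1, Map<Node, Integer> path2) {
    if (a == null || b == null) return a == b;
    Integer seenA = path1.get(a);
    Integer seenB = path2.get(b);
    if (seenA != null || seenB != null) {
      // Stop point: both sides must loop back to the same depth,
      // otherwise the cycle lengths differ.
      return seenA != null && seenA.equals(seenB);
    }
    path1.put(a, depth);
    path2.put(b, depth);
    if (a.value != b.value) return false; // stop as soon as the values differ
    return sameChain(a.next, b.next, depth + 1, path1, path2);
  }

  public static void main(String[] args) {
    Node self = new Node(1);
    self.next = self;               // cycle of length 1
    Node b1 = new Node(1);
    Node b2 = new Node(1);
    b1.next = b2;
    b2.next = b1;                   // cycle of length 2, same values
    System.out.println(sameChain(self, b1)); // prints false

    Node c1 = new Node(1);
    Node c2 = new Node(1);
    c1.next = c2;
    c2.next = c1;                   // another cycle of length 2
    System.out.println(sameChain(b1, c1));   // prints true
  }
}
```

Without the depth check, the self-loop and the two-node cycle would compare equal, because every visited pair has equal values.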
-
Constructor Summary
Constructor and Description
CasCompare(CASImpl c1, CASImpl c2)
    Make an instance of this class to set up a compare operation, and optionally use to configure the compare.
-
Method Summary
Modifier and Type, Method, and Description
void addStringCongruenceSet(String typeName, String featureBaseName, String[] set_of_strings_that_are_equivalent, int index)
    Add a set of strings that should be considered equal when doing string comparisons.
void applyToBoth
    Many times some customization needs to be applied to both CASes being compared.
void applyToTypeFeature
    Before comparing, you can adjust specific features of specific types, arbitrarily.
void canonicalizeString(String typeName, String featureBaseName, String[] items_to_change, String canonical_value)
    Before comparing, you can, for a selected type and feature which has a string value belonging to one of a set of strings, change the value to another (fixed) string which will of course compare equal.
void compareAll(boolean v)
    Continues the comparison after a miscompare (or not).
boolean compareCASes()
    This does the actual comparison operation of the previously specified CASes.
static boolean compareCASes(CASImpl c1, CASImpl c2)
    Compare 2 CASes, with perhaps different type systems, using the default configuration.
void compareIds(boolean v)
    Normally, compares ignore the Feature Structure ID when comparing.
static StringBuilder compareNumberOfFSsByType(CAS cas1, CAS cas2)
    Counts and compares the number of Feature Structures, by type, and generates a report.
void excludeCollectionsTypesFromIndexes()
    The compare can find FeatureStructures to compare either from being in some index in some view, or from being referenced through some chain which starts with the above.
void excludeListTypesFromIndexes()
    The compare can find FeatureStructures to compare either from being in some index in some view, or from being referenced through some chain which starts with the above.
void excludeRootTypesFromIndexes(Set<String> excluded_typeNames)
    The compare can find FeatureStructures to compare either from being in some index in some view, or from being referenced through some chain which starts with the above.
void includeOnlyTheseTypesFromIndexes(List<String> includedTypeNames)
    The compare can find FeatureStructures to compare either from being in some index in some view, or from being referenced through some chain which starts with the above.
static void showProgress()
    Call this to show progress of the compare; useful for long compares.
sort_dedup_FSArray(String typeName, String featureBaseName)
sort_dedup_FSArray(TOP fs, Feature feat)
    This is an optional pre-compare operation.
sortFSArray(String typeName, String featureBaseName)
sortFSArray(FSArray<?> fsArray)
    This is an optional pre-compare operation.
sortStringArray(String typeName, String featureBaseName)
sortStringArray(StringArray stringArray)
    This is an optional pre-compare operation.
List<Runnable> type_feature_to_runnable(String typeName, String featureBaseName, BiFunction<TOP, Feature, Runnable> c)
    Before comparing, you can create pending values for specific types / features, and return a list of runnables, which when run, plug in those pending values.
-
Constructor Details
-
CasCompare
Make an instance of this class to set up a compare operation, and optionally use to configure the compare.
Parameters:
c1 - one CAS to compare
c2 - the other CAS to compare
-
-
Method Details
-
compareCASes
Compare 2 CASes, with perhaps different type systems, using default configuration.
Parameters:
c1 - CAS to compare
c2 - CAS to compare
Returns:
true if equal (for types / features in both)
-
compareAll
public void compareAll(boolean v)
Continues the comparison after a miscompare (or not). This is useful when you want to see all of the miscompares.
Parameters:
v - defaults to false, set to true to continue the comparison after a miscompare
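The effect of this flag can be pictured with a small stand-alone sketch that compares two lists position by position and either stops at the first mismatch or keeps going. The class and method names here are hypothetical and the message format is invented for illustration; this is not how CasCompare reports mismatches.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-alone model of "stop at the first miscompare" versus "report all miscompares". */
class CompareAllSketch {

  static List<String> mismatches(List<String> left, List<String> right, boolean compareAll) {
    List<String> found = new ArrayList<>();
    int n = Math.min(left.size(), right.size());
    for (int i = 0; i < n; i++) {
      if (!left.get(i).equals(right.get(i))) {
        found.add("index " + i + ": " + left.get(i) + " vs " + right.get(i));
        if (!compareAll) {
          break; // default behavior in this model: stop after the first miscompare
        }
      }
    }
    return found;
  }

  public static void main(String[] args) {
    List<String> left = List.of("a", "b", "c");
    List<String> right = List.of("a", "x", "y");
    System.out.println(mismatches(left, right, false).size()); // prints 1
    System.out.println(mismatches(left, right, true).size());  // prints 2
  }
}
```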
-
compareIds
public void compareIds(boolean v)
Normally, compares ignore the Feature Structure ID when comparing.
Parameters:
v - defaults to false, set to true to include the Feature Structure ID in the compare.
-
applyToBoth
Many times some customization needs to be applied to both CASes being compared. This routine does that.
Parameters:
c - the customization to be applied to both CASes
-
applyToTypeFeature
Before comparing, you can adjust specific features of specific types, arbitrarily. This routine applies the adjustments to both CASes.
Parameters:
typeName - the fully qualified name of the type
featureBaseName - the short feature name to adjust
c - a function to do the adjustment
-
type_feature_to_runnable
public List<Runnable> type_feature_to_runnable(String typeName, String featureBaseName, BiFunction<TOP, Feature, Runnable> c)
Before comparing, you can create pending values for specific types / features, and return a list of runnables, which when run, plug in those pending values.
Parameters:
typeName - the type
featureBaseName - the feature of the type
c - the code to run for this type and feature
Returns:
a list of runnables, for both CASes
-
canonicalizeString
public void canonicalizeString(String typeName, String featureBaseName, String[] items_to_change, String canonical_value)
Before comparing, you can, for a selected type and feature which has a string value belonging to one of a set of strings, change the value to another (fixed) string which will of course compare equal. Use this to ignore selected string-valued features having particular values.
Parameters:
typeName - the fully qualified type name
featureBaseName - the feature
items_to_change - an array of strings; a feature value that matches one of these is changed
canonical_value - the new value
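The canonicalization idea can be shown without any CAS at all: every value that belongs to the given set is overwritten with one fixed string, so that later comparisons see equal values. The sketch below is a stand-alone model using a plain `String[]` in place of the feature values of a type; the class name, method name, and sample strings are hypothetical.

```java
import java.util.Arrays;
import java.util.Set;

/** Stand-alone model of replacing selected string values with one canonical value. */
class CanonicalizeSketch {

  /** Overwrites, in place, every value found in itemsToChange with canonicalValue. */
  static void canonicalize(String[] values, Set<String> itemsToChange, String canonicalValue) {
    for (int i = 0; i < values.length; i++) {
      if (values[i] != null && itemsToChange.contains(values[i])) {
        values[i] = canonicalValue;
      }
    }
  }

  public static void main(String[] args) {
    String[] values = {"2019-01-01", "2020-05-05", "keep"};
    canonicalize(values, Set.of("2019-01-01", "2020-05-05"), "<date>");
    System.out.println(Arrays.toString(values)); // prints [<date>, <date>, keep]
  }
}
```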
-
sortFSArray
-
sort_dedup_FSArray
-
sortStringArray
-
excludeRootTypesFromIndexes
The compare can find FeatureStructures to compare either from
- being in some index in some view, or
- being referenced through some chain which starts with the above.
It sometimes helps to exclude miscompares of FeatureStructures like StringArrays which (for some reason) are indexed, in favor of finding these only via refs. You can exclude these from being found via indexes by setting types here. They could still be found via refs from other Feature Structures. Calling this disables any includeOnlyTheseTypesFromIndexes call.
Parameters:
excluded_typeNames - type names to exclude
-
excludeCollectionsTypesFromIndexes
public void excludeCollectionsTypesFromIndexes()
The compare can find FeatureStructures to compare either from
- being in some index in some view, or
- being referenced through some chain which starts with the above.
It sometimes helps to exclude miscompares of FeatureStructures like StringArrays which (for some reason) are indexed, in favor of finding these only via refs. Call this to exclude the array types: boolean, byte, short, integer, long, float, double, string and fs arrays from being found via indexes. They could still be found via refs from other Feature Structures. Calling this disables any includeOnlyTheseTypesFromIndexes call.
-
excludeListTypesFromIndexes
public void excludeListTypesFromIndexes()
The compare can find FeatureStructures to compare either from
- being in some index in some view, or
- being referenced through some chain which starts with the above.
It sometimes helps to exclude miscompares of List FeatureStructures like StringLists which (for some reason) are indexed, in favor of finding these only via refs. Call this to exclude the list types (non-empty Float/Integer/String list elements) from being found in the index. They could still be found via refs from other Feature Structures. Calling this disables any includeOnlyTheseTypesFromIndexes call.
-
includeOnlyTheseTypesFromIndexes
The compare can find FeatureStructures to compare either from
- being in some index in some view, or
- being referenced through some chain which starts with the above.
It sometimes helps to exclude all types except for a few selected ones which are indexed, in favor of finding these only via refs. Calling this disables any excludeXXXTypesFromIndexes calls.
Parameters:
includedTypeNames - fully qualified type names to include when finding Feature Structures to compare via the indexes.
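The interplay between the exclude and include-only settings (each disables the other) can be modeled as a small filter object. This stand-alone sketch assumes that successive exclude calls accumulate and that matching is by exact type name only; neither detail is stated above, and subtype handling is not modeled. All names are hypothetical.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Stand-alone model of filtering which types are picked up from the indexes. */
class IndexTypeFilterSketch {
  private final Set<String> excluded = new HashSet<>();
  private Set<String> includedOnly = null; // null means "no include-only restriction"

  /** Excluding types disables any earlier include-only setting. */
  void exclude(Collection<String> typeNames) {
    includedOnly = null;
    excluded.addAll(typeNames); // assumption: exclusions accumulate across calls
  }

  /** Include-only disables any earlier exclusions. */
  void includeOnly(Collection<String> typeNames) {
    excluded.clear();
    includedOnly = new HashSet<>(typeNames);
  }

  /** True if a FS of this type should be collected when found via an index. */
  boolean acceptFromIndex(String typeName) {
    return includedOnly != null ? includedOnly.contains(typeName) : !excluded.contains(typeName);
  }

  public static void main(String[] args) {
    IndexTypeFilterSketch filter = new IndexTypeFilterSketch();
    filter.exclude(List.of("StringArray"));
    System.out.println(filter.acceptFromIndex("StringArray")); // prints false
    filter.includeOnly(List.of("Token"));
    System.out.println(filter.acceptFromIndex("Sentence"));    // prints false
    System.out.println(filter.acceptFromIndex("Token"));       // prints true
  }
}
```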
-
addStringCongruenceSet
public void addStringCongruenceSet(String typeName, String featureBaseName, String[] set_of_strings_that_are_equivalent, int index)
Add a set of strings that should be considered equal when doing string comparisons. This is conditioned on the type name and feature name.
Parameters:
typeName - the fully qualified type name
featureBaseName - the feature short name
set_of_strings_that_are_equivalent - a set of strings that should compare equal, if testing the type / feature
index - if the item being compared is a reference to a string array, which index should be compared. Use -1 if not applicable.
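A congruence set can be thought of as an equivalence class of strings that is consulted only for one type / feature pair. The following stand-alone sketch models that lookup with plain JDK collections. The class, its methods, and the sample type and feature names are hypothetical, and the array `index` argument is not modeled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

/** Stand-alone model of string congruence sets keyed by type and feature. */
class CongruenceSketch {
  private final Map<String, List<Set<String>>> setsByTypeFeature = new HashMap<>();

  private static String key(String typeName, String featureBaseName) {
    return typeName + ":" + featureBaseName;
  }

  void add(String typeName, String featureBaseName, String... equivalentStrings) {
    setsByTypeFeature
        .computeIfAbsent(key(typeName, featureBaseName), k -> new ArrayList<>())
        .add(new HashSet<>(Arrays.asList(equivalentStrings)));
  }

  /** Equal if identical, or if both strings belong to one congruence set for this type / feature. */
  boolean stringsEqual(String typeName, String featureBaseName, String s1, String s2) {
    if (Objects.equals(s1, s2)) return true;
    List<Set<String>> sets =
        setsByTypeFeature.getOrDefault(key(typeName, featureBaseName), Collections.emptyList());
    for (Set<String> set : sets) {
      if (set.contains(s1) && set.contains(s2)) return true;
    }
    return false;
  }

  public static void main(String[] args) {
    CongruenceSketch cs = new CongruenceSketch();
    cs.add("my.Token", "pos", "NN", "NOUN");
    System.out.println(cs.stringsEqual("my.Token", "pos", "NN", "NOUN"));   // prints true
    System.out.println(cs.stringsEqual("my.Token", "lemma", "NN", "NOUN")); // prints false
  }
}
```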
-
showProgress
public static void showProgress()
Call this to show progress of the compare; useful for long compares.
-
compareCASes
public boolean compareCASes()
This does the actual comparison operation of the previously specified CASes.
Returns:
true if compare is OK
-
sortFSArray
This is an optional pre-compare operation. Sometimes, when comparing FSArrays, the order of the elements is not significant, and the compare should be done ignoring order differences. This is accomplished by sorting the elements, before the compare is done, using this method. The sort order is not significant; it just needs to be the same order for otherwise equal FSArrays. Use this routine to accomplish the sort, on particular FSArrays you designate. Call it for each one you want to sort. During the sort, links are followed. The sorting is done in a clone of the array, and the original array is not updated. Instead, a Runnable is returned, which may be invoked later to update the original array with the sorted copy. This allows sorting to be done on the original item values (in case the links refer back to the originals).
Parameters:
fsArray - the array to be sorted
Returns:
a runnable, which (when invoked) updates the original array with the sorted result.
-
sort_dedup_FSArray
This is an optional pre-compare operation. It is identical to sortFSArray, except that after sorting, it removes duplicates.
Parameters:
fs - the feature structure having the fsarray feature
feat - the feature having the fsarray
Returns:
a runnable, which (when invoked) updates the original array with the sorted result.
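Sorting followed by duplicate removal can be sketched on an ordinary list: once the elements are sorted, duplicates are adjacent, so a single pass that skips elements comparing equal to the previously kept one is enough. This is a stand-alone illustration with hypothetical names; it returns a new list instead of a Runnable and does not touch any FSArray.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Stand-alone model of "sort, then drop elements that compare equal to their predecessor". */
class SortDedupSketch {

  static <T> List<T> sortDedup(List<T> items, Comparator<? super T> order) {
    List<T> sorted = new ArrayList<>(items); // work on a copy; the input is left untouched
    sorted.sort(order);
    List<T> result = new ArrayList<>();
    for (T item : sorted) {
      if (result.isEmpty() || order.compare(result.get(result.size() - 1), item) != 0) {
        result.add(item);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    List<Integer> input = List.of(3, 1, 3, 2, 1);
    List<Integer> output = sortDedup(input, Comparator.<Integer>naturalOrder());
    System.out.println(output); // prints [1, 2, 3]
  }
}
```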
-
sortStringArray
This is an optional pre-compare operation. Sometimes, when comparing StringArrays, the order of the elements is not significant, and the compare should be done ignoring order differences. This is accomplished by sorting the elements, before the compare is done, using this method. Use this routine to accomplish the sort, on particular StringArrays you designate. Call it for each one you want to sort. The sorting is done in a clone of the array, and the original array is not updated. Instead, a Runnable is returned, which may be invoked later to update the original array with the sorted copy. This allows sorting to be done while keeping the original values until a later time.
Parameters:
stringArray - the array to be sorted
Returns:
null or a runnable, which (when invoked) updates the original array with the sorted result. Callers should ensure the runnable is garbage collected after use.
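The "sort a clone now, update the original later" pattern described above can be shown with a plain `String[]`. This is a stand-alone sketch with hypothetical names; it always returns a Runnable, whereas the method above may return null.

```java
import java.util.Arrays;

/** Stand-alone model of sorting a copy and deferring the update of the original. */
class DeferredSortSketch {

  /** Sorts a clone immediately; the returned Runnable copies the sorted values back when run. */
  static Runnable sortLater(String[] original) {
    String[] sorted = original.clone();
    Arrays.sort(sorted);
    return () -> System.arraycopy(sorted, 0, original, 0, sorted.length);
  }

  public static void main(String[] args) {
    String[] values = {"c", "a", "b"};
    Runnable update = sortLater(values);
    System.out.println(Arrays.toString(values)); // prints [c, a, b]
    update.run();
    System.out.println(Arrays.toString(values)); // prints [a, b, c]
  }
}
```

Deferring the write keeps the original order visible to any other sort that is still in progress, which is the reason given above for returning a Runnable.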
-
compareNumberOfFSsByType
Counts and compares the number of Feature Structures, by type, and generates a report.
Parameters:
cas1 - first CAS to compare
cas2 - second CAS to compare
Returns:
a StringBuilder with a report
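A per-type count comparison can be sketched with two lists of type names standing in for the feature structures of the two CASes. The report layout below is invented for illustration and is not the format this method produces; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.TreeSet;

/** Stand-alone model of counting items by type in two collections and reporting differences. */
class CountByTypeSketch {

  static Map<String, Integer> count(List<String> typeNames) {
    Map<String, Integer> counts = new TreeMap<>();
    for (String t : typeNames) {
      counts.merge(t, 1, Integer::sum);
    }
    return counts;
  }

  static StringBuilder report(List<String> types1, List<String> types2) {
    Map<String, Integer> c1 = count(types1);
    Map<String, Integer> c2 = count(types2);
    TreeSet<String> allTypes = new TreeSet<>(c1.keySet());
    allTypes.addAll(c2.keySet());
    StringBuilder sb = new StringBuilder();
    for (String t : allTypes) {
      int n1 = c1.getOrDefault(t, 0);
      int n2 = c2.getOrDefault(t, 0);
      sb.append(t).append(": ").append(n1).append(" vs ").append(n2)
          .append(n1 == n2 ? "" : "  <-- differs").append('\n');
    }
    return sb;
  }

  public static void main(String[] args) {
    System.out.print(report(
        List.of("Token", "Token", "Sentence"),
        List.of("Token", "Sentence", "Sentence")));
  }
}
```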
-