At the center of XMLUnit's support for comparisons is the
DifferenceEngine class. In practice you
rarely deal with it directly but rather use it via instances of
Diff or DetailedDiff
classes (see Section 3.5, “Diff and
DetailedDiff”).
The DifferenceEngine walks two trees of
DOM Nodes, the control and the test tree, and
compares the nodes. Whenever it detects a difference, it sends
a message to a configured DifferenceListener
(see Section 3.3, “DifferenceListener”) and asks a
ComparisonController (see Section 3.2, “ComparisonController”) whether the current comparison
should be halted.
In some cases the order of elements in two pieces of XML
may not be significant. If this is true, the
DifferenceEngine needs help to determine
which Elements to compare. This is the job
of an ElementQualifier (see Section 3.4, “ElementQualifier”).
The types of differences
DifferenceEngine can detect are enumerated in
the DifferenceConstants interface and
represented by instances of the Difference
class.
A Difference can be recoverable;
recoverable Differences make the
Diff class consider two pieces of XML similar
while non-recoverable Differences render the
two pieces different.
The types of Differences that are
currently detected are listed in Table 1, “Document level Differences detected by
DifferenceEngine”
to Table 4, “Other Differences detected by
DifferenceEngine” (the first two columns refer to
the DifferenceConstants interface).
Table 1. Document level Differences detected by
DifferenceEngine
ID | Constant | recoverable | Description |
|---|---|---|---|
HAS_DOCTYPE_DECLARATION_ID | HAS_DOCTYPE_DECLARATION | true | One piece of XML has a DOCTYPE declaration while the other one has not. |
DOCTYPE_NAME_ID | DOCTYPE_NAME | false | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different names for the root element. |
DOCTYPE_PUBLIC_ID_ID | DOCTYPE_PUBLIC_ID | false | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different PUBLIC identifiers. |
DOCTYPE_SYSTEM_ID_ID | DOCTYPE_SYSTEM_ID | true | Both pieces of XML contain a DOCTYPE declaration but the declarations specify different SYSTEM identifiers. |
NODE_TYPE_ID | NODE_TYPE | false | The test piece of XML contains a different type of node than was expected. This type of difference will also occur if either the root control or test Node is null while the other is not. |
NAMESPACE_PREFIX_ID | NAMESPACE_PREFIX | true | Two nodes use different prefixes for the same XML Namespace URI in the two pieces of XML. |
NAMESPACE_URI_ID | NAMESPACE_URI | false | Two nodes in the two pieces of XML share the same local name but use different XML Namespace URIs. |
SCHEMA_LOCATION_ID | SCHEMA_LOCATION | true | Two nodes have different values for the schemaLocation attribute of the XMLSchema-Instance namespace. The attribute could be present on only one of the two nodes. |
NO_NAMESPACE_SCHEMA_LOCATION_ID | NO_NAMESPACE_SCHEMA_LOCATION | true | Two nodes have different values for the noNamespaceSchemaLocation attribute of the XMLSchema-Instance namespace. The attribute could be present on only one of the two nodes. |
Table 2. Element level Differences detected by
DifferenceEngine
ID | Constant | recoverable | Description |
|---|---|---|---|
ELEMENT_TAG_NAME_ID | ELEMENT_TAG_NAME | false | The two pieces of XML contain elements with different tag names. |
ELEMENT_NUM_ATTRIBUTES_ID | ELEMENT_NUM_ATTRIBUTES | false | The two pieces of XML contain a common element, but the number of attributes on the element is different. |
HAS_CHILD_NODES_ID | HAS_CHILD_NODES | false | An element in one piece of XML has child nodes while the corresponding one in the other has not. |
CHILD_NODELIST_LENGTH_ID | CHILD_NODELIST_LENGTH | false | Two elements in the two pieces of XML differ by their number of child nodes. |
CHILD_NODELIST_SEQUENCE_ID | CHILD_NODELIST_SEQUENCE | true | Two elements in the two pieces of XML contain the same child nodes but in a different order. |
CHILD_NODE_NOT_FOUND_ID | CHILD_NODE_NOT_FOUND | false | A child node in one piece of XML couldn't be matched against any other node of the other piece. |
ATTR_SEQUENCE_ID | ATTR_SEQUENCE | true | The attributes on an element appear in different order[a] in the two pieces of XML. |
[a] Note that the order of attributes is not significant in XML; different parsers may return attributes in a different order even when parsing the same XML document. There is an option to turn this check off (see Section 3.8, “Configuration Options”), but it is on by default for backwards compatibility reasons.
Table 3. Attribute level Differences detected by
DifferenceEngine
ID | Constant | recoverable | Description |
|---|---|---|---|
ATTR_VALUE_EXPLICITLY_SPECIFIED_ID | ATTR_VALUE_EXPLICITLY_SPECIFIED | true | An attribute that has a default value according to the content model of the element in question has been specified explicitly in one piece of XML but not in the other.[a] |
ATTR_NAME_NOT_FOUND_ID | ATTR_NAME_NOT_FOUND | false | One piece of XML contains an attribute on an element that is missing in the other. |
ATTR_VALUE_ID | ATTR_VALUE | false | The value of an element's attribute is different in the two pieces of XML. |
[a] In order for this difference to be detected the parser must have been in validating mode when the piece of XML was parsed and the DTD or XML Schema must have been available.
Table 4. Other Differences detected by
DifferenceEngine
ID | Constant | recoverable | Description |
|---|---|---|---|
COMMENT_VALUE_ID | COMMENT_VALUE | false | The content of two comments is different in the two pieces of XML. |
PROCESSING_INSTRUCTION_TARGET_ID | PROCESSING_INSTRUCTION_TARGET | false | The target of two processing instructions is different in the two pieces of XML. |
PROCESSING_INSTRUCTION_DATA_ID | PROCESSING_INSTRUCTION_DATA | false | The data of two processing instructions is different in the two pieces of XML. |
CDATA_VALUE_ID | CDATA_VALUE | false | The content of two CDATA sections is different in the two pieces of XML. |
TEXT_VALUE_ID | TEXT_VALUE | false | The value of two texts is different in the two pieces of XML. |
Note that some of the differences listed may be ignored by
the DifferenceEngine if certain configuration
options have been specified. See Section 3.8, “Configuration Options” for details.
DifferenceEngine passes the differences
it finds around as instances of the Difference
class. In addition to the type of difference this class also
holds information on the nodes that have been found to be
different. The nodes are described by
NodeDetail instances that encapsulate the DOM
Node instance as well as the XPath expression
that locates the Node inside the given piece
of XML. NodeDetail also contains a "value"
that provides more information on the actual values that have
been found to be different. Its concrete interpretation depends
on the type of difference, as can be seen in Table 5, “Contents of NodeDetail.getValue()
for Differences”.
Table 5. Contents of NodeDetail.getValue()
for Differences
Difference.getId() | NodeDetail.getValue() |
|---|---|
HAS_DOCTYPE_DECLARATION_ID | "not null" if the document has a DOCTYPE declaration, "null" otherwise. |
DOCTYPE_NAME_ID | The name of the root element. |
DOCTYPE_PUBLIC_ID_ID | The PUBLIC identifier. |
DOCTYPE_SYSTEM_ID_ID | The SYSTEM identifier. |
NODE_TYPE_ID | If one node was absent: "not null" if the node exists, "null" otherwise. If the node types differ the value will be a string-ified version of org.w3c.dom.Node.getNodeType(). |
NAMESPACE_PREFIX_ID | The Namespace prefix. |
NAMESPACE_URI_ID | The Namespace URI. |
SCHEMA_LOCATION_ID | The attribute's value or "[attribute absent]" if it has not been specified. |
NO_NAMESPACE_SCHEMA_LOCATION_ID | The attribute's value or "[attribute absent]" if it has not been specified. |
ELEMENT_TAG_NAME_ID | The tag name with any Namespace information stripped. |
ELEMENT_NUM_ATTRIBUTES_ID | The number of attributes present turned into a String. |
HAS_CHILD_NODES_ID | "true" if the element has child nodes, "false" otherwise. |
CHILD_NODELIST_LENGTH_ID | The number of child nodes present turned into a String. |
CHILD_NODELIST_SEQUENCE_ID | The sequence number of this child node turned into a String. |
CHILD_NODE_NOT_FOUND_ID | The name of the unmatched node or "null". If the node is an element inside an XML namespace the name will be Java5-QName-like {NS-URI}LOCAL-NAME - in all other cases it is the node's local name. |
ATTR_SEQUENCE_ID | The attribute's name. |
ATTR_VALUE_EXPLICITLY_SPECIFIED_ID | "true" if the attribute has been specified, "false" otherwise. |
ATTR_NAME_NOT_FOUND_ID | The attribute's name or "null". If the attribute belongs to an XML namespace the name will be Java5-QName-like {NS-URI}LOCAL-NAME - in all other cases it is the attribute's local name. |
ATTR_VALUE_ID | The attribute's value. |
COMMENT_VALUE_ID | The actual comment. |
PROCESSING_INSTRUCTION_TARGET_ID | The processing instruction's target. |
PROCESSING_INSTRUCTION_DATA_ID | The processing instruction's data. |
CDATA_VALUE_ID | The content of the CDATA section. |
TEXT_VALUE_ID | The actual text. |
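The "Java5-QName-like" notation mentioned for CHILD_NODE_NOT_FOUND_ID and ATTR_NAME_NOT_FOUND_ID is the same notation that javax.xml.namespace.QName.toString() produces. The following is a minimal sketch using only the JDK; the class QNameLikeNames and its describe helper are illustrative names, and nothing here claims that XMLUnit builds its values this way internally.

```java
import javax.xml.namespace.QName;

class QNameLikeNames {
    /**
     * Formats a name as {NS-URI}LOCAL-NAME, or as the plain local
     * name when there is no namespace (illustrative helper).
     */
    static String describe(String namespaceUri, String localName) {
        // QName treats a null namespace URI as "no namespace" and
        // then prints only the local part.
        return new QName(namespaceUri, localName).toString();
    }

    public static void main(String[] args) {
        System.out.println(describe("urn:example:ns", "child")); // {urn:example:ns}child
        System.out.println(describe(null, "child"));             // child
    }
}
```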
As mentioned at the beginning of this section, you won't deal with
DifferenceEngine directly in most cases. In
cases where Diff or
DetailedDiff don't provide what you need
you'd create an instance of DifferenceEngine,
passing a ComparisonController to the
constructor, and invoke compare with the DOM
trees to compare as well as a
DifferenceListener and an
ElementQualifier. The listener will be
called on any differences while the compare
method is executing.
Example 16. Using DifferenceEngine
Directly
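import org.custommonkey.xmlunit.Difference;
import org.custommonkey.xmlunit.DifferenceEngine;
import org.custommonkey.xmlunit.DifferenceListener;
import org.w3c.dom.Node;

class MyDifferenceListener implements DifferenceListener {
    private boolean calledFlag = false;
    public boolean called() { return calledFlag; }
    // invoked by the engine for every difference it detects
    public int differenceFound(Difference difference) {
        calledFlag = true;
        return RETURN_ACCEPT_DIFFERENCE;
    }
    public void skippedComparison(Node control, Node test) {
    }
}

// myComparisonController, myElementQualifier, controlNode and testNode
// are assumed to have been set up elsewhere
DifferenceEngine engine = new DifferenceEngine(myComparisonController);
MyDifferenceListener listener = new MyDifferenceListener();
engine.compare(controlNode, testNode, listener,
               myElementQualifier);
System.err.println("There have been "
                   + (listener.called() ? "" : "no ")
                   + "differences.");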
The ComparisonController's job is to
decide whether a comparison should be halted after a difference
has been found. Its interface is:
/**
* Determine whether a Difference that the listener has been notified of
* should halt further XML comparison. Default behaviour for a Diff
* instance is to halt if the Difference is not recoverable.
* @see Difference#isRecoverable
* @param afterDifference the last Difference passed to <code>differenceFound</code>
* @return true to halt further comparison, false otherwise
*/
boolean haltComparison(Difference afterDifference);
Whenever a difference has been detected by the
DifferenceEngine the
haltComparison method will be called
immediately after the DifferenceListener has
been informed of the difference. This is true no matter what
type of Difference has been found or which
value the DifferenceListener has
returned.
The only implementations of
ComparisonController that ship with XMLUnit
are Diff and DetailedDiff,
see Section 3.5, “Diff and
DetailedDiff” for details about them.
A ComparisonController that halted the
comparison on any non-recoverable difference could be
implemented as:
Example 17. A Simple
ComparisonController
public class HaltOnNonRecoverable implements ComparisonController {
    public boolean haltComparison(Difference afterDifference) {
        return !afterDifference.isRecoverable();
    }
}
DifferenceListener contains two
callback methods that are invoked by the
DifferenceEngine when differences are
detected:
/**
* Receive notification that 2 nodes are different.
* @param difference a Difference instance as defined in {@link
* DifferenceConstants DifferenceConstants} describing the cause
* of the difference and containing the detail of the nodes that
* differ
* @return int one of the RETURN_... constants describing how this
* difference was interpreted
*/
int differenceFound(Difference difference);
/**
* Receive notification that a comparison between 2 nodes has been skipped
* because the node types are not comparable by the DifferenceEngine
* @param control the control node being compared
* @param test the test node being compared
* @see DifferenceEngine
*/
void skippedComparison(Node control, Node test);
differenceFound is invoked by
DifferenceEngine as soon as a difference has
been detected. The return value of that method is completely
ignored by DifferenceEngine; it only becomes
important when the listener is used together with Diff
(see Section 3.5, “Diff and
DetailedDiff”). The return value should be
one of the four constants defined in the
DifferenceListener interface:
/**
 * Standard return value for the <code>differenceFound</code> method.
 * Indicates that the <code>Difference</code> is interpreted as defined
 * in {@link DifferenceConstants DifferenceConstants}.
 */
int RETURN_ACCEPT_DIFFERENCE = 0;
/**
 * Override return value for the <code>differenceFound</code> method.
 * Indicates that the nodes identified as being different should be
 * interpreted as being identical.
 */
int RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL = 1;
/**
 * Override return value for the <code>differenceFound</code> method.
 * Indicates that the nodes identified as being different should be
 * interpreted as being similar.
 */
int RETURN_IGNORE_DIFFERENCE_NODES_SIMILAR = 2;
/**
 * Override return value for the <code>differenceFound</code> method.
 * Indicates that the nodes identified as being similar should be
 * interpreted as being different.
 */
int RETURN_UPGRADE_DIFFERENCE_NODES_DIFFERENT = 3;
The skippedComparison method is
invoked if the DifferenceEngine encounters
two Nodes it cannot compare. Before invoking
skippedComparison
DifferenceEngine will have invoked
differenceFound with a
Difference of type
NODE_TYPE.
A custom DifferenceListener that
ignored any DOCTYPE related differences could be written
as:
Example 18. A DifferenceListener that Ignores
DOCTYPE Differences
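import java.util.Arrays;
import org.custommonkey.xmlunit.Difference;
import org.custommonkey.xmlunit.DifferenceConstants;
import org.custommonkey.xmlunit.DifferenceListener;
import org.w3c.dom.Node;

public class IgnoreDoctype implements DifferenceListener {
    // IDs of all DOCTYPE related differences
    private static final int[] IGNORE = new int[] {
        DifferenceConstants.HAS_DOCTYPE_DECLARATION_ID,
        DifferenceConstants.DOCTYPE_NAME_ID,
        DifferenceConstants.DOCTYPE_PUBLIC_ID_ID,
        DifferenceConstants.DOCTYPE_SYSTEM_ID_ID
    };
    static {
        // binarySearch below requires a sorted array
        Arrays.sort(IGNORE);
    }
    public int differenceFound(Difference difference) {
        return Arrays.binarySearch(IGNORE, difference.getId()) >= 0
            ? RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL
            : RETURN_ACCEPT_DIFFERENCE;
    }
    public void skippedComparison(Node control, Node test) {
    }
}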
Apart from Diff and
DetailedDiff XMLUnit ships with an additional
implementation of DifferenceListener.
IgnoreTextAndAttributeValuesDifferenceListener
doesn't do anything in skippedComparison.
It "downgrades" Differences of type
ATTR_VALUE,
ATTR_VALUE_EXPLICITLY_SPECIFIED and
TEXT_VALUE to recoverable
differences.
This means if instances of
IgnoreTextAndAttributeValuesDifferenceListener
are used together with Diff then two pieces
of XML will be considered similar if they have the same basic
structure. They are not considered identical, though.
Note that the list of ignored differences doesn't cover
all textual differences. You should configure XMLUnit to
ignore comments and whitespace and to consider CDATA sections
and text nodes to be the same (see Section 3.8, “Configuration Options”) in order to cover
COMMENT_VALUE and
CDATA_VALUE as well.
When DifferenceEngine encounters a list
of DOM Elements as children of another
Element it will ask the configured
ElementQualifier which
Element of the control piece of XML should be
compared to which of the test piece. Its contract is:
/**
* Determine whether two elements are comparable
* @param control an Element from the control XML NodeList
* @param test an Element from the test XML NodeList
* @return true if the elements are comparable, false otherwise
*/
boolean qualifyForComparison(Element control, Element test);
For any given Element in the control
piece of XML DifferenceEngine will cycle
through the corresponding list of Elements in
the test piece of XML until
qualifyForComparison has returned
true or the list of test
Elements is exhausted.
When using DifferenceEngine or
Diff it is completely legal to set the
ElementQualifier to null.
In this case any kind of Node is compared to
the test Node that appears at the same
position in the sequence.
Example 19. Example Nodes for ElementQualifier
(the comments are not part of the example)
<!-- control piece of XML -->
<parent>
  <child1/>                        <!-- control node 1 -->
  <child2/>                        <!-- control node 2 -->
  <child2 foo="bar">xyzzy</child2> <!-- control node 3 -->
  <child2 foo="baz"/>              <!-- control node 4 -->
</parent>
<!-- test piece of XML -->
<parent>
  <child2 foo="baz"/>              <!-- test node 1 -->
  <child1/>                        <!-- test node 2 -->
  <child2>xyzzy</child2>           <!-- test node 3 -->
  <child2 foo="bar"/>              <!-- test node 4 -->
</parent>
Taking Example 19, “Example Nodes for ElementQualifier
(the comments are not part of the example)” without any
ElementQualifier
DifferenceEngine will compare control node
n to test node n for
n between 1 and 4. In many cases this is
exactly what is desired, but sometimes
<a><b/><c/></a> should be similar
to <a><c/><b/></a> because the
order of elements doesn't matter - this is when you'd use a
different ElementQualifier. XMLUnit ships
with several implementations.
With ElementNameQualifier only Elements with the same name -
and Namespace URI if present - qualify.
In Example 19, “Example Nodes for ElementQualifier
(the comments are not part of the example)” this means
control node 1 will be compared to test node 2. Then control
node 2 will be compared to test node 3 because
DifferenceEngine will start to search for
the matching test Element at the second
test node, the same sequence number the control node is at.
Control node 3 is compared to test node 3 as well and control
node 4 to test node 4.
With ElementNameAndAttributeQualifier only Elements with the same name -
and Namespace URI if present - as well as the same values for
all attributes given in
ElementNameAndAttributeQualifier's
constructor qualify.
Let's say "foo" has been passed to
ElementNameAndAttributeQualifier's
constructor when looking at Example 19, “Example Nodes for ElementQualifier
(the comments are not part of the example)”. This again means control
node 1 will be compared to test node 2 since they do have the
same name and no value at all for attribute
"foo". Then control node 2 will be
compared to test node 3 - again, no value for
"foo". Control node 3 is compared to test
node 4 as they have the same value "bar".
Finally control node 4 is compared to test node 1; here
DifferenceEngine searches from the
beginning of the test node list after test node 4 didn't
match.
There are three constructors in
ElementNameAndAttributeQualifier. The
no-arg constructor creates an instance that compares all
attributes while the others will compare a single attribute or
a given subset of all attributes.
With ElementNameAndTextQualifier only Elements with the same name -
and Namespace URI if present - as well as the same text
content nested into them qualify.
In Example 19, “Example Nodes for ElementQualifier
(the comments are not part of the example)” this means
control node 1 will be compared to test node 2 since they both
don't have any nested text at all. Then control node 2 will
be compared to test node 4. Control node 3 is compared to
test node 3 since they have the same nested text and control
node 4 to test node 4.
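The pairings described for Example 19 can be reproduced with a small model of the search. The sketch below uses only the JDK's DOM parser and does not call XMLUnit at all: the Qualifier interface is a hypothetical stand-in for ElementQualifier, the three lambdas are simplified approximations of the qualifiers discussed above (for instance, an absent attribute is treated like an empty one), and the loop assumes the search starts at the control node's own position and wraps around, as the walk-throughs describe. Like those walk-throughs, this model lets a test node be paired more than once.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class QualifierWalkthrough {
    /** Hypothetical stand-in for XMLUnit's ElementQualifier contract. */
    interface Qualifier {
        boolean qualifyForComparison(Element control, Element test);
    }

    // The control and test pieces of Example 19, without comments and whitespace.
    static final String CONTROL = "<parent><child1/><child2/>"
        + "<child2 foo=\"bar\">xyzzy</child2><child2 foo=\"baz\"/></parent>";
    static final String TEST = "<parent><child2 foo=\"baz\"/><child1/>"
        + "<child2>xyzzy</child2><child2 foo=\"bar\"/></parent>";

    // Simplified approximations of the three qualifiers discussed above.
    static final Qualifier BY_NAME =
        (c, t) -> c.getTagName().equals(t.getTagName());
    static final Qualifier BY_NAME_AND_FOO =
        (c, t) -> BY_NAME.qualifyForComparison(c, t)
            && c.getAttribute("foo").equals(t.getAttribute("foo"));
    static final Qualifier BY_NAME_AND_TEXT =
        (c, t) -> BY_NAME.qualifyForComparison(c, t)
            && c.getTextContent().equals(t.getTextContent());

    /** Parses the XML and returns the child Elements of its root. */
    static List<Element> children(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml))).getDocumentElement();
            List<Element> result = new ArrayList<>();
            for (Node n = root.getFirstChild(); n != null; n = n.getNextSibling()) {
                if (n instanceof Element) {
                    result.add((Element) n);
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** For each control node, the 1-based number of the test node it is paired with (0 = none). */
    static int[] pair(Qualifier q) {
        List<Element> control = children(CONTROL);
        List<Element> test = children(TEST);
        int[] result = new int[control.size()];
        for (int i = 0; i < control.size(); i++) {
            // start at the control node's own position, wrap around at the end
            for (int k = 0; k < test.size(); k++) {
                int j = (i + k) % test.size();
                if (q.qualifyForComparison(control.get(i), test.get(j))) {
                    result[i] = j + 1;
                    break;
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(pair(BY_NAME)));          // [2, 3, 3, 4]
        System.out.println(Arrays.toString(pair(BY_NAME_AND_FOO)));  // [2, 3, 4, 1]
        System.out.println(Arrays.toString(pair(BY_NAME_AND_TEXT))); // [2, 4, 3, 4]
    }
}
```

Each printed array lists the test node numbers chosen for control nodes 1 to 4, so the three lines correspond to the three walk-throughs above.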
All ElementQualifiers seen so far
only looked at the Elements themselves and
not at the structure nested into them at a deeper level. A
frequent user question has been which
ElementQualifier should be used if the
pieces of XML in Example 20, “Example for
RecursiveElementNameAndTextQualifier
(the comments are not part of the example)” should be
considered similar.
Example 20. Example for
RecursiveElementNameAndTextQualifier
(the comments are not part of the example)
<!-- control -->
<table>
<tr> <!-- control row 1 -->
<td>foo</td>
</tr>
<tr> <!-- control row 2 -->
<td>bar</td>
</tr>
</table>
<!-- test -->
<table>
<tr> <!-- test row 1 -->
<td>bar</td>
</tr>
<tr> <!-- test row 2 -->
<td>foo</td>
</tr>
</table>
At first glance
ElementNameAndTextQualifier should work, but
it doesn't. When DifferenceEngine
processes the children of table it
compares control row 1 to test row 1, since both
tr elements have the same name and both
have no textual content at all.
What is needed in this case is an
ElementQualifier that looks at the element's
name, as well as the name of the first child element and the
text nested into that first child element. This is what
RecursiveElementNameAndTextQualifier
does.
RecursiveElementNameAndTextQualifier
ignores whitespace between the elements leading up to the
nested text.
MultiLevelElementNameAndTextQualifier is
in a way the predecessor of
RecursiveElementNameAndTextQualifier (see Section 3.4.4, “org.custommonkey.xmlunit.examples.RecursiveElementNameAndTextQualifier”).
It also matches element names and those of nested child
elements until it finds matches, but
unlike RecursiveElementNameAndTextQualifier
you must
tell MultiLevelElementNameAndTextQualifier
at which nesting level it should expect the nested text.
MultiLevelElementNameAndTextQualifier's
constructor expects a single argument which is the nesting
level of the expected text. If you use an argument of 1,
MultiLevelElementNameAndTextQualifier is
identical to ElementNameAndTextQualifier.
In Example 20, “Example for
RecursiveElementNameAndTextQualifier
(the comments are not part of the example)” a value of 2 would be
needed.
By default
MultiLevelElementNameAndTextQualifier
will not ignore whitespace between the elements leading up
to the nested text. If your piece of XML contains this sort
of whitespace (like Example 20, “Example for
RecursiveElementNameAndTextQualifier
(the comments are not part of the example)” which
contains a newline and several space characters between
<tr> and
<td>) you can either instruct
XMLUnit to ignore whitespace completely (see
Section 3.8.1, “Whitespace Handling”) or use the two-arg
constructor of
MultiLevelElementNameAndTextQualifier
introduced with XMLUnit 1.2 and set the
ignoreEmptyTexts argument to
true.
In
general RecursiveElementNameAndTextQualifier
requires less knowledge upfront and its whitespace-handling
is more intuitive.
Diff and
DetailedDiff provide simplified access to
DifferenceEngine by implementing the
ComparisonController and
DifferenceListener interfaces themselves.
They cover the two most common use cases for comparing two
pieces of XML: checking whether the pieces are different (this
is what Diff does) and finding all
differences between them (this is what
DetailedDiff does).
DetailedDiff is a subclass of
Diff and can only be constructed by creating
a Diff instance first.
The major difference between them is their implementation
of the ComparisonController interface:
DetailedDiff will never stop the comparison
since it wants to collect all differences.
Diff in turn will halt the comparison as soon
as the first Difference is found that is not
recoverable. In addition DetailedDiff
collects all Differences in a list and
provides access to it.
By default Diff will consider two
pieces of XML as identical if no differences have been found at
all, similar if all differences that have been found have been
recoverable (see Table 1, “Document level Differences detected by
DifferenceEngine” to Table 4, “Other Differences detected by
DifferenceEngine”) and different as soon as any
non-recoverable difference has been found.
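The default rule can be written down as a small decision function. This is only a model of the rule as stated above, not Diff's actual implementation; the class DiffOutcome, its Result enum and the classify method are illustrative names. Each list entry stands for one detected difference, with true meaning recoverable.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class DiffOutcome {
    enum Result { IDENTICAL, SIMILAR, DIFFERENT }

    /** Classifies a comparison from the recoverable flags of its differences. */
    static Result classify(List<Boolean> recoverableFlags) {
        if (recoverableFlags.isEmpty()) {
            return Result.IDENTICAL;        // no differences at all
        }
        for (boolean recoverable : recoverableFlags) {
            if (!recoverable) {
                return Result.DIFFERENT;    // one non-recoverable difference is enough
            }
        }
        return Result.SIMILAR;              // only recoverable differences
    }

    public static void main(String[] args) {
        System.out.println(classify(Collections.<Boolean>emptyList())); // IDENTICAL
        System.out.println(classify(Arrays.asList(true, true)));        // SIMILAR
        System.out.println(classify(Arrays.asList(true, false)));       // DIFFERENT
    }
}
```

The sketch returns the most specific outcome only; pieces of XML that are identical are also similar.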
It is possible to specify a
DifferenceListener to Diff
using the overrideDifferenceListener method.
In this case each Difference will be
evaluated by the passed in
DifferenceListener. By returning
RETURN_IGNORE_DIFFERENCE_NODES_IDENTICAL the
custom listener can make Diff ignore the
difference completely. Likewise any
Difference for which the custom listener
returns
RETURN_IGNORE_DIFFERENCE_NODES_SIMILAR will
be treated as if the Difference was
recoverable.
There are several overloads of the Diff
constructor that allow you to specify your piece of XML in many
ways. There are overloads that accept additional
DifferenceEngine and
ElementQualifier arguments. Passing in a
DifferenceEngine of your own is the only way
to use a ComparisonController other than
Diff.
Note that Diff and
DetailedDiff use
ElementNameQualifier as their default
ElementQualifier. This is different from
DifferenceEngine which defaults to no
ElementQualifier at all.
To use a custom ElementQualifier you
can also use the overrideElementQualifier
method. Use this with an argument of null to
unset the default ElementQualifier as
well.
To compare two pieces of XML you'd create a
Diff instance from those two pieces and
invoke identical to check that there have
been no differences at all and similar to
check that all differences, if any, have been recoverable. If the
pieces are identical they are also similar. Likewise if they
are not similar they can't be identical either.
Example 21. Comparing Two Pieces of XML Using
Diff
Diff d = new Diff("<a><b/><c/></a>", "<a><c/><b/></a>");
assertFalse(d.identical()); // CHILD_NODELIST_SEQUENCE Difference
assertTrue(d.similar());
The result of the comparison is cached in
Diff; repeated invocations of
identical or similar will
not reevaluate the pieces of XML.
Note: calling toString on an instance
of Diff or DetailedDiff
will perform the comparison and cache its result immediately.
If you change the DifferenceListener or
ElementQualifier after calling
toString it won't have any effect.
DetailedDiff provides only a single
constructor that expects a Diff as argument.
Don't use DetailedDiff if all you need to
know is whether two pieces of XML are identical/similar - use
Diff directly since its short-cut
ComparisonController implementation will save
time in this case.
Example 22. Finding All Differences Using
DetailedDiff
Diff d = new Diff("<a><b/><c/></a>", "<a><c/><b/></a>");
DetailedDiff dd = new DetailedDiff(d);
dd.overrideElementQualifier(null);
assertFalse(dd.similar());
List l = dd.getAllDifferences();
assertEquals(2, l.size()); // expected <b/> but was <c/> and vice versa
Sometimes you might be interested in any sort of comparison result and want to get notified of successful matches as well. Maybe you want to provide feedback on the number of differences and similarities between two documents, for example.
The interface MatchTracker can be
implemented to get notified on each and every successful match.
Note that there may be a lot more comparisons going on than you
might expect, so your callback may get notified very often.
Example 23. The MatchTracker interface
package org.custommonkey.xmlunit;
/**
 * Listener for callbacks from a {@link DifferenceEngine#compare
 * DifferenceEngine comparison} that is notified on each and every
 * comparison that resulted in a match.
 */
public interface MatchTracker {
    /**
     * Receive notification that 2 nodes match.
     * @param difference a Difference instance as defined in {@link
     * DifferenceConstants DifferenceConstants} describing the test
     * that matched and containing the detail of the nodes that have
     * been compared
     */
    void matchFound(Difference difference);
}
Despite its name the Difference
instance passed into the matchFound method
really describes a match and not a difference. You can expect
that the getValue methods of both the
control and the test NodeDetail return
equal values.
DifferenceEngine provides a constructor
overload that allows you to pass in
a MatchTracker instance and also provides
a setMatchTracker
method. Diff
and DetailedDiff
provide overrideMatchTracker methods that
serve the same purpose.
Note that your MatchTracker won't
receive any callbacks once the
configured ComparisonController has decided
that DifferenceEngine should halt the
comparison.
XMLAssert and
XMLTestCase contain quite a few overloads of
methods for comparing two pieces of XML.
The method names use the word Equal
to mean the same as similar in the
Diff class (or throughout this guide). So
assertXMLEqual will assert that only
recoverable differences have been encountered, whereas
assertXMLNotEqual asserts that some
differences have been non-recoverable.
assertXMLIdentical asserts that there haven't
been any differences at all while
assertXMLNotIdentical asserts that there have
been differences (recoverable or not).
Most of the overloads of assertXMLEqual
just provide different means to specify the pieces of XML as
Strings, InputSources,
Readers[7] or Documents. For each
method there is a version that takes an additional
err argument which is used to create the
message if the assertion fails.
If you don't need any control over the
ElementQualifier or
DifferenceListener used by
Diff these methods will save some boilerplate
code. If CONTROL and TEST
are pieces of XML represented as one of the supported inputs
then
Diff d = new Diff(CONTROL, TEST);
assertTrue("expected pieces to be similar, " + d.toString(),
d.similar());
and
assertXMLEqual("expected pieces to be similar", CONTROL, TEST);
are equivalent.
If you need more control over the Diff
instance there is a version of assertXMLEqual
(and assertXMLIdentical) that accepts a
Diff instance as its argument as well as a
boolean indicating whether you expect the
Diff to be similar
(identical) or not.
XMLTestCase contains a couple of
compareXML methods that really are only
shortcuts to Diff's constructors.
There is no way to use DifferenceEngine
or DetailedDiff directly via the convenience
methods.
Unless you are using Document or
DOMSource overloads when specifying your
pieces of XML, XMLUnit will use the configured XML parsers (see
Section 2.4.1, “JAXP”) and EntityResolvers
(see Section 2.4.2, “EntityResolver”). There are configuration
options to use different settings for the control and test
pieces of XML.
In addition some of the other configuration settings may lead to XMLUnit using the configured XSLT transformer (see Section 2.4.1, “JAXP”) under the covers.
Two different configuration options affect how XMLUnit treats whitespace in comparisons:
If XMLUnit has been configured to ignore element
content whitespace via XMLUnit.setIgnoreWhitespace it will trim any text nodes found by
the parser. This means that there won't appear to be any
textual content in element <foo>
for the following example. If you don't enable
XMLUnit.setIgnoreWhitespace there will
be textual content consisting of a new line
character.
<foo>
</foo>
At the same time the following two
<foo> elements will be considered
identical if the option has been enabled, though.
<foo>bar</foo>
<foo> bar </foo>
When this option is set to true,
Diff will use the XSLT transformer
under the covers.
If you set
XMLUnit.setNormalizeWhitespace to true
then XMLUnit will replace any kind of whitespace found in
character content with a SPACE character and collapse
consecutive whitespace characters to a single SPACE. It
will also trim the resulting character content on both
ends.
The following two <foo>
elements will be considered identical if the option has
been set:
<foo>bar baz</foo>
<foo> bar
baz</foo>
Note that this is not related to "normalizing" the
document as a whole (see Section 3.8.2, “"Normalizing" Documents”).
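The effect of the two whitespace options on a single piece of character content can be modelled with plain string operations. The sketch below is only an approximation of the behaviour described above and does not call XMLUnit; the class WhitespaceSketch and its two methods are illustrative names.

```java
class WhitespaceSketch {
    /** Models "ignore whitespace": the text is trimmed on both ends. */
    static String trimmed(String text) {
        return text.trim();
    }

    /** Models "normalize whitespace": whitespace runs collapse to one SPACE, then trim. */
    static String normalized(String text) {
        return text.replaceAll("\\s+", " ").trim();
    }

    public static void main(String[] args) {
        // trimming alone keeps the inner newline
        System.out.println(trimmed(" bar\nbaz").equals("bar baz"));    // false
        // normalizing replaces it with a single SPACE
        System.out.println(normalized(" bar\nbaz").equals("bar baz")); // true
        // a text node consisting of a newline only becomes empty
        System.out.println(trimmed("\n").isEmpty());                   // true
    }
}
```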
"Normalize" in this context corresponds to the
normalize method in DOM's
Document class. It is the process of
merging adjacent Text nodes and is not
related to "normalizing whitespace" as described in the
previous section.
Usually you don't need to care about this option since
the XML parser is required to normalize the
Document when creating it. The only reason
you may want to change the option via
XMLUnit.setNormalize is that your
Document instances have not been created by
an XML parser but rather been put together in memory using the
DOM API directly.
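To see what an un-normalized Document looks like, you can build one with the DOM API directly. The following sketch uses only the JDK and calls the DOM normalize method itself rather than XMLUnit.setNormalize; the class name NormalizeSketch and its helper are illustrative.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class NormalizeSketch {
    /** Builds <foo> with two adjacent Text nodes, which a parser would not produce. */
    static Element buildUnnormalized() {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder().newDocument();
            Element foo = doc.createElement("foo");
            doc.appendChild(foo);
            foo.appendChild(doc.createTextNode("bar"));
            foo.appendChild(doc.createTextNode("baz"));
            return foo;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Element foo = buildUnnormalized();
        System.out.println(foo.getChildNodes().getLength()); // 2
        // merge adjacent Text nodes throughout the document
        foo.getOwnerDocument().normalize();
        System.out.println(foo.getChildNodes().getLength()); // 1
        System.out.println(foo.getFirstChild().getNodeValue()); // barbaz
    }
}
```

A comparison against a parsed `<foo>barbaz</foo>` would see one Text node on one side and two on the other unless the in-memory Document is normalized first.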
Using XMLUnit.setIgnoreComments you
can make XMLUnit's difference engine ignore comments
completely.
When this option is set to true,
Diff will use the XSLT transformer under
the covers.
It is not always necessary to know whether a text has
been put into a CDATA section or not. Using
XMLUnit.setIgnoreDiffBetweenTextAndCDATA
you can make XMLUnit consider the following two pieces of XML
identical:
<foo>&lt;bar&gt;</foo>
<foo><![CDATA[<bar>]]></foo>
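The reason such an option is needed becomes visible when both forms are parsed with a plain JAXP parser: the character data is the same, but it arrives in different DOM node types. The sketch below uses only the JDK and does not call XMLUnit; the class TextVersusCData and its methods are illustrative names.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class TextVersusCData {
    private static Element root(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // keep CDATA sections as nodes of their own (the JAXP default)
            factory.setCoalescing(false);
            return factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** The character data nested into the root element. */
    static String text(String xml) {
        return root(xml).getTextContent();
    }

    /** The DOM node type of the root element's first child. */
    static short firstChildType(String xml) {
        return root(xml).getFirstChild().getNodeType();
    }

    public static void main(String[] args) {
        String plain = "<foo>&lt;bar&gt;</foo>";
        String cdata = "<foo><![CDATA[<bar>]]></foo>";
        System.out.println(text(plain).equals(text(cdata)));                   // true
        System.out.println(firstChildType(plain) == Node.TEXT_NODE);           // true
        System.out.println(firstChildType(cdata) == Node.CDATA_SECTION_NODE);  // true
    }
}
```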
Normally the XML parser will expand character references
to their Unicode equivalents but for more complex entity
definitions the parser may expand them or not.
Using XMLUnit.setExpandEntityReferences you
can control the parser's setting.
When XMLUnit cannot match a control Element to a test
Element (the configured ElementQualifier - see
Section 3.4, “ElementQualifier” - doesn't return true for
any of the test Elements) it will try to compare it against
the first unmatched test Element (if there is one).
Starting with XMLUnit 1.3 one can
use XMLUnit.setCompareUnmatched to
disable this behavior and
generate CHILD_NODE_NOT_FOUND differences
instead.
If the control document is
<root> <a/> </root>
and the test document is
<root> <b/> </root>
the default setting will create a
single ELEMENT_TAG_NAME Difference
("expected a but found b").
Setting XMLUnit.setCompareUnmatched to
false will create two Differences of
type CHILD_NODE_NOT_FOUND (one for "a" and
one for "b") instead.