XML Schema Snippet Tester
This short Ruby script is an attempt to reduce the pain of working in XML Schema. The idea is to carve out individual snippets and hammer on them in isolation1. It also makes it easy to verify XML that should be flagged as invalid doesn't sneak past2.
There's no need to use it with simple schemas. It's when working on complicated bits (e.g. trying to build a crazy restriction scheme for an attribute) that it's most useful.
Here's the script:
#!/usr/bin/env ruby
require "minitest"
require "minitest/rg"
require "nokogiri"
class Validator
attr_reader :errors, :xml, :xsd
def load_schema path
@xsd = Nokogiri::XML::Schema(File.read(path).strip)
end
def load_xml string
@xml = Nokogiri::XML(string)
@errors = xsd.validate(xml).to_a
end
def is_valid
errors.each { |error| puts error } # Debugging output
errors.empty?
end
end
class ValidatorTest < MiniTest::Test
attr_reader :v
def setup
@v = Validator.new
v.load_schema('schema.xsd')
end
def test_valid_sample
v.load_xml('<testNode key1="a"/>')
assert v.is_valid
end
def test_sample_with_invalid_attribute
v.load_xml('<testNode key1="bad_value"/>')
assert_match(
/The value 'bad_value' is not an element of the set/,
v.errors[0].to_s
)
end
end
MiniTest.run
And a few notes about it:
-
The
Validator
class provides the main functionality. It doesn't need to change unless new functionality is desired. -
ValidatorTest
is what gets edited and where tests are defined. -
v.load_schema
expects a path3 to the schema detailing the snippet to test. In this case, it points to aschema.xsd
file in the same directory. A copy of the schema used for this example is further below. -
Individual tests are defined using MiniTest's standard
test_
prefix methods. -
Individual XML snippets to test are sent via
v.load_xml
4. -
The
assert v.is_valid
call is used for examples that should pass. -
The
errors.each { |error| puts error }
inis_valid
provides debugging output when working on changes that cause a failure. Capturing them makes it easy to pull the messages to use when looking for expected failure cases. -
Validation errors are stored in an
errors
array. Usingassert_match
with part of the validation error string ensures errors occur where expected. -
The full error string in the test for invalid data is
Element 'testNode', attribute 'key1': [facet 'enumeration'] The value 'bad_value' is not an element of the set {'a', 'b'}.
To help decouple the test a little, it only matches against theThe value 'bad_value' is not an element of the set
.
Here's the example schema.xsd
:
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified">
<xs:element name="testNode" type="testNode"/>
<xs:complexType name="testNode">
<xs:attribute name="key1" type="_value_list" use="required"/>
</xs:complexType>
<xs:simpleType name="_value_list">
<xs:restriction base="xs:string">
<xs:enumeration value="a"/>
<xs:enumeration value="b"/>
</xs:restriction>
</xs:simpleType>
</xs:schema>
While there's not a lot to the setup, it's greatly reduced the amount of time I spend banging my head against the XML Schema wall.
Footnotes
-
Of course, the script works fine with full schemas and XML documents too.
-
Making sure data that should be invalid doesn't pass was the biggest driver for making this script. I use Oxygen XML Editor. It makes verifying valid files easy but doesn't appear to have a good way to check for false negatives (i.e. data you expect to fail that ends up passing validation).
-
Keeping the actual schema snippet in it's own file is my preferred way to work. Of course, it's possible to modify the script to include the schema directly as a string.
-
As with the schemas, it's possible to modify the script to use actual XML files instead. It's generally adds more overhead than it's worth.