XML/XSD schema validation of an OOXML document
Good afternoon.
I generate OOXML documents programmatically for routine use. As my knowledge of OOXML is based entirely on online resources, I make errors and would like to validate the code in my template files (document.xml, header.xml) and styles (styles.xml) against the referenced schema definitions, which are for now only these three:
My knowledge of xmllint is insufficient, and online validators report each of my documents as valid, even where I close a container tag before one of the elements that it must include. All they actually verify is the “well-formedness” of the XML. Can you point me to a resource that explains how this type of document is best validated against the named schemas? Or where I can download the XSD for each schema, if I want to feed them to xmllint? Amongst others, I have seen:
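To illustrate the difference between well-formedness and schema validity described above, here is a minimal Python sketch using only the standard library. The fragment is hypothetical (it is not taken from the poster's templates), and the namespace URI assumes the Transitional flavour of WordprocessingML. The parser accepts a document whose `<w:body>` is closed before the paragraph it should contain, because a plain parse checks syntax only.

```python
import xml.etree.ElementTree as ET

# Assumption: Transitional WordprocessingML namespace.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"

# Hypothetical fragment: <w:body> is closed before the paragraph that belongs in it.
MISPLACED = (
    f'<w:document xmlns:w="{W_NS}">'
    "<w:body></w:body>"
    "<w:p><w:r><w:t>text</w:t></w:r></w:p>"
    "</w:document>"
)


def is_well_formed(xml_text: str) -> bool:
    """Return True if the text parses; this says nothing about schema validity."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


# The structure is wrong for a Word document, yet the text still parses.
print(is_well_formed(MISPLACED))
# An unclosed tag with an undeclared prefix does not parse at all.
print(is_well_formed("<w:document>"))
```

Only a validator that actually loads the XSD files (for example xmllint with `--schema`) can be expected to flag the first case.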
|
Could you please give an example xml (or a link to it)?
|
Quote:
I do not know if that is the kind of example you wish to see. |
Ok, let's try document.xml
Code:
<?xml version="2.0" encoding="utf-8" standalone="yes"?>
Code:
document.xml:1: parser error : Unsupported version '2.0'
Code:
document.xml:6: namespace error : Namespace prefix r for id on headerReference is not defined
Code:
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
PS: I found the xsd files here: https://jar-download.com/cache_jars/.../jar_files.zip |
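The namespace error reported above can be reproduced without any schema. The following is a small sketch, assuming the Transitional namespace URIs for the `w` and `r` prefixes and a made-up relationship id `rId1`; the helper name `parse_error` is hypothetical. It parses the same element once without and once with the `xmlns:r` declaration.

```python
import xml.etree.ElementTree as ET

# Assumptions: Transitional namespace URIs; "rId1" is a made-up relationship id.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
R_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"


def parse_error(xml_text):
    """Return the parser's error message, or None if the text parses cleanly."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return str(exc)
    return None


# The r: prefix is used on an attribute but never declared.
missing = (
    f'<w:document xmlns:w="{W_NS}">'
    '<w:headerReference r:id="rId1"/>'
    "</w:document>"
)

# Same content, with xmlns:r declared on the root element.
declared = (
    f'<w:document xmlns:w="{W_NS}" xmlns:r="{R_NS}">'
    '<w:headerReference r:id="rId1"/>'
    "</w:document>"
)

print(parse_error(missing))   # an error message about the undeclared prefix
print(parse_error(declared))  # None
```

Declaring every prefix on the root element of the template is the usual way to avoid this class of error.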
Quote:
Are you accustomed to this kind of problem or how did you make a connection to jar-download.com? Even if XML and Java are close friends, I always hope for a generally applicable procedure and would not have thought of searching for a jar-archive, of all choices... called zip, if it must. Anyway. |
Well, I'd suggest this:
As root
1. If you don't have file /usr/local/etc/xml/catalog, create it:
Code:
$ mkdir -p /usr/local/etc/xml
Code:
<uri name="http://www.w3.org/XML/1998/namespace" uri="file:///usr/local/etc/xml/xml_2009_01.xsd"/>
Code:
wget -O /usr/local/etc/xml/xml_2009_01.xsd http://www.w3.org/2009/01/xml.xsd
3. Put the OOXML-xsd files into a sub-directory of your work-dir, e.g. ooxml_xsd.
4. Some modifications are required to let xmllint work:
4.1. wml.xsd -- missing schemaLocation
Code:
- <xsd:import id="xml" namespace="http://www.w3.org/XML/1998/namespace" />
Code:
- <xsd:import schemaLocation="dml-graphicalObject.xsd" namespace="http://schemas.openxmlformats.org/drawingml/2006/main" />
Code:
<?xml version="1.0" encoding="utf-8"?>
Code:
$ export XML_CATALOG_FILES=/usr/local/etc/xml/catalog
|
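The catalog step can also be scripted. The post shows only the `<uri>` entry of the catalog, so the sketch below is an assumption about the rest: it wraps that entry in a standard OASIS `<catalog>` root element and writes the file to a temporary directory instead of /usr/local/etc/xml, so it runs without root. The function name `build_catalog` is hypothetical.

```python
import os
import tempfile
import xml.etree.ElementTree as ET

# Namespace of OASIS XML Catalog files.
CATALOG_NS = "urn:oasis:names:tc:entity:xmlns:xml:catalog"


def build_catalog(xsd_path: str) -> str:
    """Build a catalog that maps the XML namespace URI to a local copy of xml.xsd."""
    ET.register_namespace("", CATALOG_NS)  # serialize without a prefix
    root = ET.Element(f"{{{CATALOG_NS}}}catalog")
    ET.SubElement(
        root,
        f"{{{CATALOG_NS}}}uri",
        {
            "name": "http://www.w3.org/XML/1998/namespace",
            "uri": "file://" + xsd_path,
        },
    )
    return ET.tostring(root, encoding="unicode")


# Assumption: a temporary directory stands in for /usr/local/etc/xml.
workdir = tempfile.mkdtemp()
catalog_path = os.path.join(workdir, "catalog")
catalog_text = build_catalog("/usr/local/etc/xml/xml_2009_01.xsd")

with open(catalog_path, "w", encoding="utf-8") as fh:
    fh.write(catalog_text)

print(catalog_text)
# The environment variable from the last step would then point at this file.
print(f"export XML_CATALOG_FILES={catalog_path}")
```

For a system-wide setup, the same text would be written to /usr/local/etc/xml/catalog as described in step 1.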
Thank you so much!
I validate. And promptly, my “templates” must be revised as I have skipped some namespaces, at least for the attributes of the <w:headerReference/>. Up to now I was lucky that the text-processor, which reads my final documents, corrects errors upon saving and my requirements were simple. It is, however, surprising that the validation needs so much preparation. |
Hi @nevemTeve
I "think" I've got everything right (I've modified your approach slightly by placing the xml_2009_01.xsd file in the same folder as the word docs for testing, hopefully without the requirement for the catalog) and I'm getting the following...
Code:
root@dev:/Development # xmllint --schema /Development/OfficeOpenXML-XMLSchema-Strict/wml.xsd testdoc.xml --noout --debugent
...so the schemas all seem happy, but for some reason testdoc.xml with the following w:document node is failing as above (XMLspy validates it fine)
Code:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
Any ideas would be greatly appreciated! Thanks so much, Ricky
|
You might want to edit your post to add [code] and [/code] tags.
|
Hi,
Thanks very much for the formatting tip. I've tidied up, but I no longer need assistance as I have resolved the issue I explained above. It was actually a PHP DOMDocument issue caused by this PHP bug: "schemaValidate ignores namespaces dynamically added to a DOMDocument", https://bugs.php.net/bug.php?id=78352 |