PHP can validate XML against three kinds of schema definitions: Document Type Definitions (DTDs), XML Schemas (.xsd), and RELAX NG:
- Validating a document against a Document Type Definition (DTD)
  - validate() validates the document based on its DTD.
- Validating a document against a Schema (.xsd)
  - schemaValidate('schema.xsd') validates a document based on the given schema file.
  - schemaValidateSource('schema_str') validates a document based on a schema defined in the given string.
- Validating a document against RELAX NG
  - relaxNGValidate('file.rng') validates the document based on the given RNG schema file.
  - relaxNGValidateSource('relax_ng_str') validates the document based on the given RNG source string. (See https://relaxng.org/)
Validating HTML (and XML) Based on its DTD
If an XML document declares a DTD at the top (in its DOCTYPE), call DOMDocument::validate() to validate it against that DTD. The validate() method automatically looks up the name of the DTD file in the XML document.
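For a self-contained illustration of DTD validation, the following minimal sketch declares the DTD inline in the DOCTYPE and then calls validate(). The sample document and its element names are assumptions made up for this sketch; they mirror the quotes data used later in this article.

```php
<?php
// Hypothetical sample document with an internal DTD subset.
$xml = <<<XML
<?xml version="1.0"?>
<!DOCTYPE quotes [
  <!ELEMENT quotes (quote*)>
  <!ELEMENT quote (coding, author)>
  <!ATTLIST quote year CDATA #REQUIRED>
  <!ELEMENT coding (#PCDATA)>
  <!ELEMENT author (#PCDATA)>
]>
<quotes>
  <quote year="2023">
    <coding>Lorem ipsum dolor...</coding>
    <author>Author XYZ</author>
  </quote>
</quotes>
XML;

$dom = new DOMDocument;
$dom->loadXML($xml);

// validate() takes no arguments: it uses the DTD declared in the document itself.
$isValid = $dom->validate();

echo $isValid ? 'Validation succeeded' : 'Validation failed';
```

Removing the year attribute or the author element from the sample should make validate() return false, because the DTD marks both as required.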
Example: Validating an HTML Page
<?php
$dom = new DOMDocument;
$dom->loadHTMLFile('https://brainbell.com/');

if (!$dom->validate()) {
    die('Invalid XML document');
}
We used the DOMDocument::loadHTMLFile() method, which loads HTML from a file (or URL).
You can use the libxml functions to show errors by line and column number; see Handling Errors While Parsing XML.
Example: Validating HTML & Parsing Errors
<?php
// Collect libxml errors internally instead of printing PHP warnings
libxml_use_internal_errors(true);

$dom = new DOMDocument;
$dom->loadHTMLFile('https://brainbell.com/');

if (!$dom->validate()) {
    // Fetch the errors after validate() so validation errors are included too
    $errorObj = libxml_get_errors();

    foreach ($errorObj as $error) {
        switch ($error->level) {
            case LIBXML_ERR_FATAL:
                echo "Fatal Error: ";
                break;
            case LIBXML_ERR_ERROR:
                echo "Error: ";
                break;
            case LIBXML_ERR_WARNING:
                echo "Warning: ";
                break;
        }
        echo $error->code . '<br>' .
             'Message: ' . $error->message . '<br>' .
             'Line: ' . $error->line . '<br>' .
             'Column: ' . $error->column . '<br>' .
             'File/URL: ' . $error->file . '<hr>';
    }

    // Empty the error buffer once the errors have been reported
    libxml_clear_errors();
}
Validating XML Against Schema
The schemaValidate() method takes the path to the schema file as an argument, while the schemaValidateSource() method takes the schema as a string. Both methods return false if the XML does not match the rules laid down in the schema.
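To make the difference between the two methods concrete, here is a minimal sketch that validates the same document once from a schema file and once from a schema string. The note schema, the sample document, and the temporary file are assumptions made up for this illustration.

```php
<?php
// Hypothetical schema and document, used only for this sketch.
$xsd = '<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="note" type="xsd:string"/>
</xsd:schema>';

$dom = new DOMDocument;
$dom->loadXML('<?xml version="1.0"?><note>Hello</note>');

// Write the schema to a temporary file so schemaValidate() has a path to read.
$path = tempnam(sys_get_temp_dir(), 'xsd');
file_put_contents($path, $xsd);

$fromFile   = $dom->schemaValidate($path);       // schema given as a file path
$fromString = $dom->schemaValidateSource($xsd);  // schema given as a string

// Remove the temporary file again.
unlink($path);

var_dump($fromFile, $fromString);
```

Both calls apply the same rules, so they should agree; which one to use depends only on whether the schema lives on disk or in a variable.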
Example: Validating a nonmatching schema (.xsd):
<?php
$dom = new DOMDocument;
$dom->load('sample.xml');

if ($dom->schemaValidate('sample.xsd'))
    echo 'Validation succeeded';
else
    echo 'Validation failed';
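Because the XML document does not match the schema file, the preceding code prints "Validation failed", preceded by a warning similar to the following (the method name, path, and line number depend on your script and error reporting settings):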
Warning: DOMDocument::schemaValidateSource(): Element 'users': No matching global declaration available for the validation root. in D:\xampp\htdocs\example.php on line 35
Validation failed
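If you would rather handle the schema errors yourself than have PHP print a warning, one option is to combine schemaValidateSource() with the libxml error functions. The following is a minimal sketch under that assumption; the inline XML and schema strings are hypothetical stand-ins for sample.xml and sample.xsd.

```php
<?php
// Collect libxml errors internally instead of emitting PHP warnings.
libxml_use_internal_errors(true);

// Hypothetical inputs: the root element <users> is not declared in the schema.
$xml = '<?xml version="1.0"?><users><user>BrainBell.com</user></users>';
$xsd = '<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="quotes" type="xsd:string"/>
</xsd:schema>';

$dom = new DOMDocument;
$dom->loadXML($xml);

$isValid = $dom->schemaValidateSource($xsd);

// Fetch the collected errors after validating, then empty the buffer.
$errors = libxml_get_errors();
libxml_clear_errors();

if (!$isValid) {
    foreach ($errors as $error) {
        echo 'Line ' . $error->line . ': ' . trim($error->message) . PHP_EOL;
    }
}
```

The same pattern should work with schemaValidate(), relaxNGValidate(), and relaxNGValidateSource(), since they all report problems through libxml.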
Example: Validating a matching schema string with schemaValidateSource:
<?php
$xml = '<?xml version="1.0"?>
<quotes>
  <quote year="2023">
    <coding>Lorem ipsum dolor...</coding>
    <author>Author XYZ</author>
  </quote>
  <quote year="2022">
    <coding>Lorem ipsum dolor...</coding>
    <author>Author ABC</author>
  </quote>
</quotes>';

$xsd = '<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="quotes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="quote" type="quoteType" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="quoteType">
    <xsd:sequence>
      <xsd:element name="coding" type="xsd:string"/>
      <xsd:element name="author" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="year" type="xsd:gYear" use="required"/>
  </xsd:complexType>
</xsd:schema>';

$dom = new DOMDocument;
$dom->loadXML($xml);

if ($dom->schemaValidateSource($xsd))
    echo 'Validation succeeded';
else
    echo 'Validation failed';

// Validation succeeded
Sample XSD File:
<?xml version="1.0"?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="quotes">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="quote" type="quoteType" minOccurs="0" maxOccurs="unbounded"/>
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
  <xsd:complexType name="quoteType">
    <xsd:sequence>
      <xsd:element name="coding" type="xsd:string"/>
      <xsd:element name="author" type="xsd:string"/>
    </xsd:sequence>
    <xsd:attribute name="year" type="xsd:gYear" use="required"/>
  </xsd:complexType>
</xsd:schema>
See A Complete XML Schema Example.
Validating XML Against relaxNG
The following code uses relaxNGValidate() to validate a (well-formed) XML file against a nonmatching RELAX NG file.
<?php
$dom = new DOMDocument;
$dom->load('sample.xml');

if ($dom->relaxNGValidate('sample.rng'))
    echo 'Validation succeeded';
else
    echo 'Validation failed';

/*
Warning: DOMDocument::relaxNGValidate(): Expecting element quotes, got users in D:\xampp\htdocs\example.php on line 4
Validation failed
*/
Writing a RELAX NG file by hand can be quite difficult; the Java tool Trang, available at https://relaxng.org/jclark/trang.html, can read an XML file and generate a RELAX NG, Schema, or DTD file from it.
Example: Validating a nonmatching relaxNG string with relaxNGValidateSource:
<?php
$xml = '<?xml version="1.0"?>
<users>
  <user>
    <name>BrainBell.com</name>
    <email>admin@brainbell.com</email>
  </user>
  <user>
    <name>Fast-Tutorials.com</name>
    <email>admin-fast-tutrials@outlook.com</email>
  </user>
</users>';

$rng = '<?xml version="1.0" encoding="UTF-8"?>
<element name="quotes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="quote">
      <optional>
        <attribute name="year"/>
      </optional>
      <element name="coding">
        <text/>
      </element>
      <element name="author">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>';

$dom = new DOMDocument;
$dom->loadXML($xml);

if ($dom->relaxNGValidateSource($rng))
    echo 'Validation succeeded';
else
    echo 'Validation failed';
The preceding code prints "Validation failed" and may also show a warning (depending on your error reporting settings):
Warning: DOMDocument::relaxNGValidateSource(): Expecting element quotes, got users in D:\xampp\htdocs\example.php on line 33
Validation failed
Example: Validating a matching relaxNG string with relaxNGValidateSource:
<?php
$xml = '<?xml version="1.0"?>
<quotes>
  <quote year="2023">
    <coding>Lorem ipsum dolor...</coding>
    <author>Author XYZ</author>
  </quote>
  <quote year="2022">
    <coding>Lorem ipsum dolor...</coding>
    <author>Author ABC</author>
  </quote>
</quotes>';

$rng = '<?xml version="1.0" encoding="UTF-8"?>
<element name="quotes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="quote">
      <optional>
        <attribute name="year"/>
      </optional>
      <element name="coding">
        <text/>
      </element>
      <element name="author">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>';

$dom = new DOMDocument;
$dom->loadXML($xml);

if ($dom->relaxNGValidateSource($rng))
    echo 'Validation succeeded';
else
    echo 'Validation failed';
The preceding code prints “Validation succeeded”.
A sample.rng file:
<?xml version="1.0" encoding="UTF-8"?>
<element name="quotes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="quote">
      <optional>
        <attribute name="year"/>
      </optional>
      <element name="coding">
        <text/>
      </element>
      <element name="author">
        <text/>
      </element>
    </element>
  </zeroOrMore>
</element>
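To see what the rules in this RNG file allow and reject, the following minimal sketch checks two small documents against the same pattern. The two test documents and their labels are assumptions invented for this illustration: the first omits the optional year attribute, the second omits the author element.

```php
<?php
// Keep libxml warnings out of the output; only the boolean results matter here.
libxml_use_internal_errors(true);

$rng = '<?xml version="1.0" encoding="UTF-8"?>
<element name="quotes" xmlns="http://relaxng.org/ns/structure/1.0">
  <zeroOrMore>
    <element name="quote">
      <optional><attribute name="year"/></optional>
      <element name="coding"><text/></element>
      <element name="author"><text/></element>
    </element>
  </zeroOrMore>
</element>';

// Hypothetical test documents, keyed by a short description.
$documents = [
    'no year attribute' => '<quotes><quote><coding>a</coding><author>b</author></quote></quotes>',
    'missing author'    => '<quotes><quote year="2023"><coding>a</coding></quote></quotes>',
];

$results = [];
foreach ($documents as $label => $xml) {
    $dom = new DOMDocument;
    $dom->loadXML($xml);
    $results[$label] = $dom->relaxNGValidateSource($rng);
    echo $label . ': ' . ($results[$label] ? 'valid' : 'invalid') . PHP_EOL;
}

// Discard the validation errors collected during the loop.
libxml_clear_errors();
```

The first document should pass because the year attribute is wrapped in optional, while the second should fail because every quote must contain both a coding and an author element.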
Visit https://php.net/manual/domdocument.relaxngvalidate.php.