HTML to XML (Text Processing)


This operator converts an HTML document into an XML/XHTML document.


The HTML to XML operator takes a document in HTML format and parses it into strict XHTML, repairing problems such as unclosed stand-alone tags. This is useful when an XHTML document is required, or when the document must be fully valid.
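The kind of clean-up described above can be sketched in Python using only the standard library. This is a minimal illustration and not the operator's actual implementation: the class XhtmlSketch and the function html_to_xhtml are hypothetical names, and the sketch only lowercases tag names, self-closes void elements, closes unclosed li elements, and closes anything still open at the end of the input. Comments and the doctype are dropped.

```python
from html.parser import HTMLParser
from xml.sax.saxutils import escape, quoteattr

# Elements that never have content or a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class XhtmlSketch(HTMLParser):
    """Hypothetical helper: re-emits HTML as well-formed XML-style markup."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []    # output fragments, joined at the end
        self.stack = []  # names of the elements that are currently open

    def _emit_tag(self, tag, attrs, closing):
        # HTMLParser has already lowercased tag and attribute names.
        # Attributes without a value (e.g. "disabled") get their name as value.
        rendered = "".join(
            f" {name}={quoteattr(value if value is not None else name)}"
            for name, value in attrs
        )
        self.out.append(f"<{tag}{rendered}{closing}")

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            self._emit_tag(tag, attrs, " />")
            return
        # A new <li> implicitly closes a still-open sibling <li>.
        if tag == "li" and self.stack and self.stack[-1] == "li":
            self.handle_endtag("li")
        self._emit_tag(tag, attrs, ">")
        self.stack.append(tag)

    def handle_startendtag(self, tag, attrs):
        self._emit_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        if tag not in self.stack:
            return  # stray closing tag: drop it
        # Close everything that was opened after the matching start tag.
        while self.stack:
            open_tag = self.stack.pop()
            self.out.append(f"</{open_tag}>")
            if open_tag == tag:
                break

    def handle_data(self, data):
        self.out.append(escape(data))

    def close(self):
        super().close()
        # Close whatever is still open at the end of the input.
        while self.stack:
            self.out.append(f"</{self.stack.pop()}>")


def html_to_xhtml(html):
    parser = XhtmlSketch()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)


print(html_to_xhtml("<UL><LI>one<LI>two<BR></UL><H1>Title</H1>"))
```

A real converter has to handle many more cases (implied html and body elements, script content, namespaces, character encodings), so the sketch should be read only as an outline of the idea.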


  • document

    The input HTML document to be transformed.


  • document

    The resulting XHTML document.

Tutorial Processes

Replace invalid HTML tags

In this example, we first generate an HTML document that contains many tags that do not conform to XHTML, such as an unclosed li, unclosed stand-alone tags, and <H1> instead of <h1>.

We then pass this document to the HTML to XML operator.

When we open the results, we see that the operator has replaced all invalid tags with their valid representations.
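One way to see the difference between the two documents outside the process is to check whether an XML parser accepts them. The following Python sketch uses two hand-written sample strings, which are assumptions made for illustration and not the documents produced by the tutorial process. It only tests well-formedness, which is a weaker condition than XHTML validity.

```python
import xml.etree.ElementTree as ET

# Hypothetical samples: an HTML fragment with unclosed tags, and a
# hand-written XHTML equivalent of the kind the operator is meant to produce.
html_input = "<html><body><H1>Title</H1><ul><li>one<li>two</ul><br></body></html>"
xhtml_output = ("<html><body><h1>Title</h1><ul><li>one</li><li>two</li></ul>"
                "<br /></body></html>")


def is_well_formed(markup):
    """Return True if an XML parser accepts the markup."""
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


print(is_well_formed(html_input))    # the unclosed li and br break XML parsing
print(is_well_formed(xhtml_output))  # the repaired version parses
```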