XSLT - Copy only HTML from mixed XML and HTML
We have a bunch of files that are HTML pages but contain additional XML elements (all prefixed with our company name, 'tla') that provide data and structure for an older program we are now rewriting.
An example form:
    <html>
      <head>
        <title>highly simplified example form</title>
      </head>
      <body>
        <tla:document xmlns:tla="http://www.tla.com">
          <tla:contexts>
            <tla:context id="id_1" value=""></tla:context>
          </tla:contexts>
          <tla:page>
            <tla:question id="q_id_1">
              <table>
                <tr>
                  <td>
                    <input id="input_id_1" type="text" />
                  </td>
                </tr>
              </table>
            </tla:question>
          </tla:page>
          <!-- repeat many times -->
        </tla:document>
      </body>
    </html>
My task is to write a pre-processor that copies only the HTML elements, complete with their attributes and content, to a new file.
Like this:
    <html>
      <head>
        <title>highly simplified example form</title>
      </head>
      <body>
        <table>
          <tr>
            <td>
              <input id="input_id_1" type="text" />
            </td>
          </tr>
        </table>
        <!-- repeat many times -->
      </body>
    </html>
I've taken the approach of using XSLT, as I also needed to extract the tla elements to a different file. So far, the XSLT I have is:
    <?xml version="1.0" encoding="utf-8"?>
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:msxsl="urn:schemas-microsoft-com:xslt"
        exclude-result-prefixes="msxsl"
        xmlns:mbl="http://www.tla.com">
      <xsl:output method="xml" indent="yes"/>
      <xsl:strip-space elements="*" />
      <xsl:template match="mbl:* | mbl:*/@* | mbl:*/text()"/>
      <xsl:template match="@*|node()">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
    </xsl:stylesheet>
However, this only produces the following:
    <html>
      <head>
        <title>highly simplified example form</title>
      </head>
      <body>
      </body>
    </html>
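As you can see, everything within the tla:document element has been excluded. What needs to be changed in the XSLT to keep all of the HTML but filter out the tla elements?
Alternatively, is there a simpler way to go about this? I know virtually every browser will ignore the tla elements, so is there a way to get what I need using an HTML tool or app?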
Specifically targeting HTML elements would be hard, but if you just want to exclude the content in the tla namespace (while still including any non-tla elements that the tla elements contain), then this should work:
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:mbl="http://www.tla.com"
        exclude-result-prefixes="mbl">
      <xsl:output method="xml" indent="yes"/>
      <xsl:strip-space elements="*" />
      <xsl:template match="@*|node()" priority="-2">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
      <!-- This element-only identity template prevents the tla
           namespace declaration from being copied to the output -->
      <xsl:template match="*">
        <xsl:element name="{name()}">
          <xsl:apply-templates select="@* | node()" />
        </xsl:element>
      </xsl:template>
      <!-- Pass processing on to the child elements of tla elements -->
      <xsl:template match="mbl:*">
        <xsl:apply-templates select="*" />
      </xsl:template>
    </xsl:stylesheet>
You can use this instead if you want to exclude anything that has a non-null namespace:
    <xsl:stylesheet version="1.0"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns:mbl="http://www.tla.com"
        exclude-result-prefixes="mbl">
      <xsl:output method="xml" indent="yes"/>
      <xsl:strip-space elements="*" />
      <xsl:template match="@*|node()" priority="-2">
        <xsl:copy>
          <xsl:apply-templates select="@*|node()"/>
        </xsl:copy>
      </xsl:template>
      <xsl:template match="*">
        <xsl:element name="{name()}">
          <xsl:apply-templates select="@* | node()" />
        </xsl:element>
      </xsl:template>
      <!-- Any element in a namespace is dropped; its child elements are still processed -->
      <xsl:template match="*[namespace-uri()]">
        <xsl:apply-templates select="*" />
      </xsl:template>
    </xsl:stylesheet>
When either is run on your sample input, the result is:
    <html>
      <head>
        <title>highly simplified example form</title>
      </head>
      <body>
        <table>
          <tr>
            <td>
              <input id="input_id_1" type="text" />
            </td>
          </tr>
        </table>
      </body>
    </html>
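As for a simpler way without XSLT: the same "drop the tla wrapper, keep its element children" idea can be sketched with Python's standard xml.etree.ElementTree. This is only a rough illustration, not a drop-in tool. It assumes the files are well-formed XML (as the XSLT approach also requires) and that the root element is not in the tla namespace. The names unwrap, SAMPLE and TLA_NS are made up for this example.

```python
import xml.etree.ElementTree as ET

TLA_NS = "http://www.tla.com"  # namespace URI taken from the sample form

# Hypothetical inline copy of the sample form, for demonstration only
SAMPLE = """<html>
  <head><title>highly simplified example form</title></head>
  <body>
    <tla:document xmlns:tla="http://www.tla.com">
      <tla:contexts><tla:context id="id_1" value=""></tla:context></tla:contexts>
      <tla:page>
        <tla:question id="q_id_1">
          <table><tr><td><input id="input_id_1" type="text" /></td></tr></table>
        </tla:question>
      </tla:page>
    </tla:document>
  </body>
</html>"""


def unwrap(elem, ns_uri):
    """Remove descendants of elem that are in ns_uri, hoisting their element children."""
    prefix = "{" + ns_uri + "}"  # ElementTree stores tags as "{uri}localname"
    kept = []
    for child in elem:
        unwrap(child, ns_uri)  # flatten nested wrappers first (depth-first)
        if child.tag.startswith(prefix):
            kept.extend(child)  # drop the wrapper, keep its (already cleaned) children
        else:
            kept.append(child)
    elem[:] = kept  # replace the child list in one go


root = ET.fromstring(SAMPLE)
unwrap(root, TLA_NS)
print(ET.tostring(root, encoding="unicode"))
```

Like the stylesheets above, this discards any text directly inside tla elements, and ElementTree's default parser also drops comments, so the "repeat many times" comment does not survive here either.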