Any limitation with Saxon-EE XSLT v3 Streaming?




I want to apply different transformations to a large XML document using Saxon's XSLT 3.0 streaming capabilities. The problem I'm facing is that the following transformation does not work:


<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="ano contextutil" xmlns:ano="java:StreamingGenericProcessor"
xmlns:contextutil="java:GenericAnonymizerContextUtil">
<xsl:mode streamable="yes"/>
<xsl:output method="xml"/>
<xsl:param name="context" as="class:java.lang.Object" xmlns:class="http://saxon.sf.net/java-type"/>
<xsl:template match="internal/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="email/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="address/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="birthday/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="country/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="external/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="name/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="phone/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="city/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="id/text()"><xsl:value-of select="ano:uuid($context, current(), 'ID')"/></xsl:template>
<xsl:template match="." >
<xsl:copy validation="preserve">
<xsl:apply-templates select="@*" />
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>



But with this one it does:


<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
exclude-result-prefixes="ano contextutil" xmlns:ano="java:StreamingGenericProcessor"
xmlns:contextutil="java:GenericAnonymizerContextUtil">
<xsl:mode streamable="yes"/>
<xsl:output method="xml"/>
<xsl:param name="context" as="class:java.lang.Object" xmlns:class="http://saxon.sf.net/java-type"/>
<xsl:template match="email/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="address/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="birthday/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="country/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="external/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="name/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="phone/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="city/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="id/text()"><xsl:value-of select="ano:uuid($context, current(), 'ID')"/></xsl:template>
<xsl:template match="." >
<xsl:copy validation="preserve">
<xsl:apply-templates select="@*" />
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>



I tested plenty of different scenarios and I concluded that if I have more than 9 "xsl:template" it does not work!



EDIT: "It does not work" means the following: I apply a Java function to a specific element named "id". If I have more than 9 "xsl:template" rules, the output is not modified and my Java function is not called at all. There is no error message.



EDIT2: If I replace the call to the Java function with, for instance, "concat(current(), '_ID')", I get the same behaviour, so this is not specific to the Java function at all.



EDIT3:



Here is a sample input data:


<?xml version="1.0" encoding="UTF-8"?>
<table>
<row>
<id>10</id>
<email>fake@fake.com</email>
<address>dsffe</address>
<birthday>10/2018</birthday>
<country>FR</country>
<external>zz</external>
<internal>ww</internal>
<name>Jean</name>
<phone>000000</phone>
<city>Dfegd</city>
</row>
<row>
<id>9</id>
<email>fake@fake2.com</email>
<address>sdfzefzef</address>
<birthday>11/2012</birthday>
<country>GB</country>
<external>xx</external>
<internal>yy</internal>
<name>Jean-Claude</name>
<phone>000000</phone>
<city>dd</city>
</row>
</table>



This XSL always works:


<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:mode streamable="yes"/>
<xsl:output method="xml"/>
<xsl:template match="email/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="address/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="birthday/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="country/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="external/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="name/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="phone/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="city/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="id/text()"><xsl:value-of select="concat(current(), '_ID')"/></xsl:template>
<xsl:template match="." >
<xsl:copy validation="preserve">
<xsl:apply-templates select="@*" />
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>



The problematic one (the same xsl with one more template):


<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" >
<xsl:mode streamable="yes"/>
<xsl:output method="xml"/>
<xsl:template match="email/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="address/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="birthday/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="country/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="external/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="internal/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="name/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="phone/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="city/text()"><xsl:value-of select="current()"/></xsl:template>
<xsl:template match="id/text()"><xsl:value-of select="concat(current(), '_ID')"/></xsl:template>
<xsl:template match="." >
<xsl:copy validation="preserve">
<xsl:apply-templates select="@*" />
<xsl:apply-templates select="node()" />
</xsl:copy>
</xsl:template>
</xsl:stylesheet>



I run with the following command line:


java -cp Saxon-EE-9.8.0-14.jar net.sf.saxon.Transform -s:test.xml -xsl:concat_not_working.xsl



The working XSL properly appends _ID to the value of the output id element, whereas the non-working XSL does not apply any transformation.



One more piece of information: if I run without the license (and therefore without streaming), both stylesheets work!



I'm using Saxon-EE 9.8.0-14 with a trial license. Could this be an undocumented limitation of the trial license?






So in which way does the transformation not work, what is the input, which result do you expect, which one do you get? If you get any error messages please cite the exact error.

– Martin Honnen
Sep 10 '18 at 13:49






Sorry, when I say it does not work, I mean that the output is not modified. As you can see, I'm applying a Java function to a specific element named "id". If I have more than 9 "xsl:template" rules, the output is not modified and my Java function is not called at all. There is no error message. (I just edited my question, thanks.)

– Jerome
Sep 10 '18 at 14:00







Can you add a minimal but complete input document and the output you want and the one you get to demonstrate the issue? Also consider explaining how you run Saxon EE (command line, Java (show the code)) exactly.

– Martin Honnen
Sep 10 '18 at 14:42






It seems like a bug indeed, I am sure @MichaelKay will give you some insight, a simple sample with <root><item><id>1</id></item></root> is transformed by Saxon EE (only have 9.8.0.12 here) to <root><item><id>1</id></item></root> while Exselt gives <root><item><id>1_ID</id></item></root>.

– Martin Honnen
Sep 10 '18 at 15:05








Do you need all those copying <xsl:template match="email/text()"><xsl:value-of select="current()"/></xsl:template> templates? If you just use <xsl:mode streamable="yes" on-no-match="shallow-copy"/> you can focus on the templates that change something and don't have to spell out any copying, perhaps that allows you to work around the bug.

– Martin Honnen
Sep 10 '18 at 15:09
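The suggestion in the comment above could look roughly like the following. This is only a sketch based on the second test stylesheet in the question (the one using concat rather than the Java extension function); it assumes that every element other than id should be copied unchanged, and it has not been verified against the affected Saxon-EE release.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Nodes with no matching rule are shallow-copied by the built-in rules,
       so the per-element copying templates are no longer needed -->
  <xsl:mode streamable="yes" on-no-match="shallow-copy"/>
  <xsl:output method="xml"/>
  <!-- The only explicit rule: append a suffix to the text of id elements -->
  <xsl:template match="id/text()">
    <xsl:value-of select="concat(., '_ID')"/>
  </xsl:template>
</xsl:stylesheet>
```

With a single rule matching text nodes, the stylesheet stays well below the rule count at which the problem was observed.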






1 Answer



Your theory that the failure occurs with 10 or more rules turns out to be spot on. When there are 10 or more rules matching the same node-kind/node-name combination (in this case, all text nodes), Saxon-EE attempts to avoid a linear search of all the rules by looking for criteria that subsets of the rules have in common. In this case it is looking to see whether it can group the rules according to a precondition based on the parent of the text node.



At this stage there is a flaw in the logic; it carefully works out that each rule is in a group of 1 (no two parent conditions are the same), which should mean that it then abandons the optimization attempt. But it doesn't abandon it; it carries on. This shouldn't matter, because the optimization should work correctly even though it was pointless.



The reason the optimization isn't working correctly is that, on the streaming path for xsl:apply-templates, the context data for evaluating the rule preconditions isn't being initialized properly, leading the rule matcher to think that the preconditions aren't satisfied.



So you've hit a bug that, as you surmised, applies when you have a set of 10 or more template rules in a streaming mode when the rules all match nodes that have the same node-kind and node-name.



Running unlicensed bypasses the bug for two reasons: it deactivates the optimization of rule chains, and it deactivates streaming.



As a workaround, simply remove the /text() from each of your template rules.
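One way to read this workaround, as a sketch only: match the id element itself and rebuild it with xsl:copy, because simply dropping /text() from the pattern would otherwise emit the modified value without its enclosing element. The sketch assumes that id has no attributes and contains only text, and it uses concat in place of the Java extension function from the question.

```xml
<xsl:stylesheet version="3.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:mode streamable="yes"/>
  <xsl:output method="xml"/>
  <!-- Match the element by name instead of its text node child;
       xsl:copy recreates the id element (attributes, if any, are not copied) -->
  <xsl:template match="id">
    <xsl:copy>
      <xsl:value-of select="concat(., '_ID')"/>
    </xsl:copy>
  </xsl:template>
  <!-- Generic copying rule, unchanged from the question -->
  <xsl:template match=".">
    <xsl:copy validation="preserve">
      <xsl:apply-templates select="@*"/>
      <xsl:apply-templates select="node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

The rules that only copied text for the other elements can then be dropped, since the generic rule already copies their content.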





Logged as a bug here: https://saxonica.plan.io/issues/3901



Unless you indicate otherwise, I will submit a new test case based on your test data and stylesheet to the W3C test suite for XSLT 3.0.






Thanks @MichaelKay for the answer and the fix. Removing /text() would require a lot of changes in our software. I saw that you have already fixed the issue; would it be possible to get access to a compiled version with the fix? When do you plan to release a new version? One more question: would matching be quicker if we removed the /text() part?

– Jerome
Sep 11 '18 at 7:29









Template rules that match elements by name, especially when each element name is only matched by one rule, will always be fastest for matching purposes, though whether it makes a difference to the bottom line of your application is anyone's guess. Saxon maintenance releases typically come out every 2 months or so.

– Michael Kay
Sep 11 '18 at 7:39






Meanwhile you can disable the optimisation of template rule sets using -opt:-r on the command line, or by setting the configuration property FeatureKeys.OPTIMIZATION_LEVEL to "-r" from the Java API.

– Michael Kay
Sep 11 '18 at 7:45




