Inconsistent ordering bug
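The order of the final result is inconsistent: depending on the expression, the same axis yields either document order or reverse document order. This is a bug, and we need to standardize on one order.
require 'rexml/document'
doc = REXML::Document.new('<a><b><c><d id="1"><d id="2"/></d></c></b></a>')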
REXML::XPath.first(doc, '//d/ancestor::*').name
#=> "a" (document order)
REXML::XPath.first(doc, '//d[@id="2"]/ancestor::*').name
#=> "d" (reverse document order)
REXML::XPath.match(doc, '//d/ancestor::*').map(&:name)
#=> ["a", "b", "c", "d"] (document order)
REXML::XPath.match(doc, '//d[@id="2"]/ancestor::*').map(&:name)
#=> ["d", "c", "b", "a"] (reverse document order)
REXML::XPath.first(doc, '//d/ancestor-or-self::*').name
#=> "a" (document order)
REXML::XPath.first(doc, '//d[@id="2"]/ancestor-or-self::*').name
#=> "d" (reverse document order)
REXML::XPath.match(doc, '//d/ancestor-or-self::*').map(&:name)
#=> ["a", "b", "c", "d", "d"] (document order)
REXML::XPath.match(doc, '//d[@id="2"]/ancestor-or-self::*').map(&:name)
#=> ["d", "d", "c", "b", "a"] (reverse document order)
XPath 1.0 requires document order
https://www.w3.org/TR/1999/REC-xpath-19991116/#node-sets
Node-sets are basically unordered
node-set (an unordered collection of nodes without duplicates)
Axes have an order
An axis is either a forward axis or a reverse axis
The proximity position of a member of a node-set with respect to an axis is defined to be the position of the node in the node-set ordered in document order if the axis is a forward axis and ordered in reverse document order if the axis is a reverse axis.
Positional predicates applied to an axis use axis order
The meaning of a Predicate depends crucially on which axis applies.
For example, preceding::foo[1] returns the first foo element in reverse document order, because the axis that applies to the [1] predicate is the preceding axis;
After predicates are applied to a reverse axis, the result is unordered and may need to be sorted in document order before another predicate is applied
by contrast, (preceding::foo)[1] returns the first foo element in document order, because the axis that applies to the [1] predicate is the child axis.
Predicates are used to filter expressions in the same way that they are used in location paths. It is an error if the expression to be filtered does not evaluate to a node-set. The Predicate filters the node-set with respect to the child axis.
Functions such as local-name(node-set?), namespace-uri(node-set?), and name(node-set?) require document order
The local-name function returns the local part of the expanded-name of the node in the argument node-set that is first in document order
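The rules quoted above can be illustrated without an XPath engine. The following is a minimal sketch in plain Ruby, assuming a node-set represented as an array of labels that is already in document order. The helpers `proximity_positions` and `positional_predicate` are hypothetical names for illustration and are not part of REXML.

```ruby
# Hypothetical helper, not part of REXML: pairs each node with its proximity
# position for the given axis direction. The input array is assumed to be
# sorted in document order already.
def proximity_positions(nodes_in_document_order, reverse_axis:)
  ordered = reverse_axis ? nodes_in_document_order.reverse : nodes_in_document_order
  ordered.each_with_index.map { |node, index| [node, index + 1] }
end

# Hypothetical helper: applies a positional predicate such as [1] and returns
# the nodes whose proximity position matches.
def positional_predicate(nodes_in_document_order, position, reverse_axis:)
  proximity_positions(nodes_in_document_order, reverse_axis: reverse_axis)
    .select { |_node, pos| pos == position }
    .map(&:first)
end

# Three preceding <foo> elements, labelled by their document order.
preceding_foos = ["foo1", "foo2", "foo3"]

# preceding::foo[1]: the predicate counts along the reverse axis.
nearest = positional_predicate(preceding_foos, 1, reverse_axis: true)

# (preceding::foo)[1]: a filter expression counts in document order.
earliest = positional_predicate(preceding_foos, 1, reverse_axis: false)

# local-name(preceding::foo) and similar functions use the node that is
# first in document order.
first_for_functions = preceding_foos.first

p nearest   # prints ["foo3"]
p earliest  # prints ["foo1"]
```

Only the counting direction differs between the two predicate forms; the node-set itself carries no order of its own.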
Order of the final result
I think it's not specified; the result is just an unordered node-set.
However, JavaScript's API offers only unordered (ANY or UNORDERED) and ORDERED (meaning document order) result types:
XPathResult.FIRST_ORDERED_NODE_TYPE
XPathResult.ANY_UNORDERED_NODE_TYPE
XPathResult.ORDERED_NODE_ITERATOR_TYPE
XPathResult.UNORDERED_NODE_ITERATOR_TYPE
XPathResult.ORDERED_NODE_SNAPSHOT_TYPE
XPathResult.UNORDERED_NODE_SNAPSHOT_TYPE
Here's an unverified LLM response to a question about XPath result ordering in other languages and libraries:
Q: Is This Ordering Common Across Languages?
Yes, it is nearly universal. While the XPath 1.0 engine treats the internal mathematical result as an unordered set, almost every modern programming language wraps that engine in an API that defaults to normal document order.
Here is how the most common languages and libraries handle it:
| Language / Library | Default Output Behavior | Explanation |
| --- | --- | --- |
| Python (lxml / ElementTree) | Document Order | Python lists retain the exact tree-traversal order from the underlying C library (libxml2). |
| Java (javax.xml.xpath) | Document Order | Evaluates to a NodeList. The underlying implementation (usually Apache Xalan) populates this list in document order. |
| C# / .NET (System.Xml.XPath) | Document Order | The XPathNodeIterator moves through the selected nodes strictly in document order. |
| JavaScript / Node.js (xpath npm package) | Document Order | Mimics the browser DOM behavior or returns a standard array sorted by document position. |
| Selenium / Playwright | Document Order | Because they query the browser directly using the DOM's XPathResult, they inherit the browser's document order sorting. |
Suggestion
Always sort node-sets in document order. We don't need to sort in reverse document order.
Of course, positional predicates on an axis should still be calculated in axis order.
XPath.match always returns a document-ordered result.
XPath.first always returns the first node of the document-ordered node-set.
This will fix the inconsistency described above, but it also changes the XPath.match order and the XPath.first result for preceding::* and preceding-sibling::*.
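As a rough illustration of the suggestion, here is a minimal sketch of sorting a node-set in document order from outside the XPath engine. The helpers `document_order_index` and `sort_in_document_order` are hypothetical and are not REXML API; an actual fix inside REXML's XPath implementation would likely look different. The sketch uses only tree traversal (`children`, `parent`, `attributes`), so it does not depend on the current XPath ordering behavior.

```ruby
require "rexml/document"

# Hypothetical helper, not part of REXML: maps every element to its index in
# document order by walking the tree depth-first (pre-order).
def document_order_index(doc)
  index = {}.compare_by_identity
  walk = lambda do |element|
    index[element] = index.size
    element.children.each do |child|
      walk.call(child) if child.is_a?(REXML::Element)
    end
  end
  walk.call(doc.root)
  index
end

# Hypothetical helper: removes duplicate nodes and sorts the rest in
# document order using the index built above.
def sort_in_document_order(nodes, index)
  nodes.uniq(&:object_id).sort_by { |node| index.fetch(node) }
end

doc = REXML::Document.new('<a><b><c><d id="1"><d id="2"/></d></c></b></a>')
index = document_order_index(doc)

# Locate <d id="2"> without using XPath.
inner_d = index.keys.find { |element| element.attributes["id"] == "2" }

# Collect its ancestors nearest-first, the way a reverse axis enumerates them.
# The loop stops at the REXML::Document, which is the parent of the root.
ancestors = []
node = inner_d.parent
while node.is_a?(REXML::Element) && !node.is_a?(REXML::Document)
  ancestors << node
  node = node.parent
end

sorted = sort_in_document_order(ancestors, index)

p ancestors.map(&:name)  # prints ["d", "c", "b", "a"]
p sorted.map(&:name)     # prints ["a", "b", "c", "d"]
p sorted.first.name      # prints "a"
```

Under this sketch, a document-ordered XPath.match would correspond to `sorted` and XPath.first to `sorted.first`, while positional predicates would still be evaluated in axis order before the final sort.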