I find it usable (nay - enjoyable) how I can drill down with the dot operator, say how
gives me all the nested tables at that level. Even though they are in separate branches (the split occurring at <tr>), the implicit ifXmlList.getNamedElements() keeps filtering down the subsets because of the way it is defined. Very nice, though barely known.
Is there some trick like this I can use to pinpoint the element with a certain ID attribute, other than manually twiddling through the whole tree?
function select_by_attribute(xml as Object, attrName as String, attrValue as String) as Object:
    res = []
    if xml = invalid then return res

    typ = type(xml)
    if typ = "roXMLElement":
        ' check this element, then recurse into its children
        if xml.getAttributes()[attrName] = attrValue then res.push(xml)
        res.append( select_by_attribute(xml.GetChildElements(), attrName, attrValue) )
    else if typ = "roXMLList" or typ = "roList" or typ = "roArray":
        ' a collection: recurse into every member
        for each x in xml:
            res.append( select_by_attribute(x, attrName, attrValue) )
        end for
    else if typ = "roAssociativeArray":
        if xml[attrName] = attrValue then res.push(xml)
        res.append( select_by_attribute(xml.__, attrName, attrValue) )
    else:  'error condition
        ? typ, xml
        STOP
    end if

    return res
end function
When invoked, it does a DFS (depth-first search) walk over the XML tree and returns an array of all nodes whose attribute has the desired value. For example, select_by_attribute(html, "class", "image") gives me a list of all HTML tags with class="image".
Because of the order in which it walks the tree, the matching nodes come back in the same order in which they appear textually in the markup file. In other words, in the order Ctrl-F would find them in a browser or text editor.
It always returns an roArray, even when it is empty or when we are looking up by id, e.g. select_by_attribute(html, "id", "postingbody"). Semantically, element ids are unique in HTML, but the function neither knows nor cares, for generality.
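Since the result is always an array, a lookup by id just needs a count check before taking the first element. A minimal sketch, assuming the page text is well-formed enough for roXMLElement to parse; demo_lookup and pageText are made-up names for illustration:

```brightscript
' Hypothetical caller: pageText is assumed to hold well-formed markup.
sub demo_lookup(pageText as String)
    html = CreateObject("roXMLElement")
    if html.Parse(pageText) then
        found = select_by_attribute(html, "id", "postingbody")
        if found.count() > 0 then
            ' ids should be unique, so take the first match
            print found[0].GetText()
        else
            print "no element with that id"
        end if
    end if
end sub
```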
It does not work with elements that belong to multiple classes (e.g. <div class="slide first"> belongs to both "slide" and "first"), because the attribute value is compared as a whole string. I didn't need that, but it is simple to implement.
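One way the multi-class case could be handled is to split the attribute value on whitespace and compare each token. This is only a sketch: has_class is a hypothetical helper, not part of the function above.

```brightscript
' Hypothetical helper: true if a whitespace-separated class list contains className.
function has_class(classAttr as Dynamic, className as String) as Boolean
    if classAttr = invalid then return false
    ' Tokenize treats every character of its argument as a delimiter:
    ' space, tab, LF, CR
    delims = " " + chr(9) + chr(10) + chr(13)
    for each token in classAttr.Tokenize(delims)
        if token = className then return true
    end for
    return false
end function
```

In the roXMLElement branch, the comparison could then become has_class(xml.getAttributes()["class"], attrValue) whenever attrName is "class", with the plain equality test kept for every other attribute.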
If multiple selects for different attributes/values are going to be done, walking the tree every time is slow. I don't need it for my purposes, but it is relatively easy to refactor the function so that a single walk builds an index by class name. Later lookups then become dictionary lookups by class name, each returning the list of all matching nodes.
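A rough sketch of such an index builder, under a few assumptions: index_by_class is a made-up name, it only handles the roXMLElement/list cases (not the associative-array form), and it splits multi-class values on spaces only.

```brightscript
' Hypothetical sketch: one DFS walk that fills index[className] = [nodes...].
sub index_by_class(xml as Object, index as Object)
    if xml = invalid then return
    typ = type(xml)
    if typ = "roXMLElement"
        classAttr = xml.getAttributes()["class"]
        if classAttr <> invalid
            ' an element may carry several space-separated classes
            for each cls in classAttr.Tokenize(" ")
                if index[cls] = invalid then index[cls] = []
                index[cls].push(xml)
            end for
        end if
        index_by_class(xml.GetChildElements(), index)
    else if typ = "roXMLList" or typ = "roList" or typ = "roArray"
        for each x in xml
            index_by_class(x, index)
        end for
    end if
end sub

' Hypothetical caller: html is the parsed tree, as in the examples above.
function build_class_index(html as Object) as Object
    index = {}
    ' class names are case-sensitive; roAssociativeArray keys are not by default
    index.SetModeCaseSensitive()
    index_by_class(html, index)
    return index
end function
```

After idx = build_class_index(html), idx["image"] is either the list of matching nodes in document order or invalid if no element has that class, so unlike select_by_attribute the caller has to check for invalid.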