"quartern" wrote:
Sorry for commenting on such an old post
No sweat - the dev.forum moves at near light-speed :?: -
I mean there is time dilation, and to an external stationary observer it seems a couple of years pass between, e.g., a bug being reported and fixed.
Without knowing what the parse is failing on how would I know what to pre-patch?
Well, the "pre-patching" idea implies that you actually know in advance what's wrong with the HTML that makes it un-parsable. In other words, you can't feed it any random document from the World Wild Web. As for how to figure out where it fails, you can either
- use a validator to sniff out what's fishy, e.g. https://validator.w3.org/ - or
- use the fact that roXmlElement.parse() returns a partial result - call genXml() on that and see which sections are missing; mutate the HTML, rinse & repeat...
It can be tricky but it's doable. Note that using a roRegEx also implies knowing the document structure.
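To make the second option a bit more concrete, here is a rough sketch of the loop, not tested against any particular page. The names diagnose and prePatch are made up for illustration, and the unclosed <br> fix is only an assumed example of something that might trip the parser; your document may well fail on something else entirely.

```brightscript
' Sketch only: diagnose() and prePatch() are hypothetical helpers,
' and the <br> fix is just an assumed example of a pre-patch.
sub diagnose(html as String)
    xml = CreateObject("roXmlElement")
    if xml.Parse(html) then
        print "parsed OK"
    else
        ' Parse() reported failure; dump whatever was built so far
        ' and compare it with the source to see where it stopped
        print xml.GenXML(false)
    end if
end sub

function prePatch(html as String) as String
    ' Turn unclosed <br> tags into self-closing <br/>; "i" = ignore case
    re = CreateObject("roRegex", "<br\s*>", "i")
    return re.ReplaceAll(html, "<br/>")
end function
```

The idea would be to call diagnose(html), spot the first missing section, add a matching fix to prePatch(), then run diagnose(prePatch(html)) and repeat until the parse succeeds.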