"greubel" wrote:
This is the fastest I've come up with to unescape the characters.
itm = txt.Tokenize( "&" )
txt = ""
for i=0 to itm.Count()-1
    fld = itm[i].Tokenize( ";" )
    if fld.Count() = 1
        if Lcase(fld[0]) = "gt"
            fld[0] = ">"
        elseif Lcase(fld[0]) = "lt"
            fld[0] = "<"
        elseif Lcase(fld[0]) = "amp"
            fld[0] = "&"
        elseif Lcase(fld[0]) = "quot"
            fld[0] = Q
        else
            fld[0] = "&" + itm[i]
        end if
    else
        if Lcase(fld[0]) = "gt"
            fld[0] = ">" + fld[1]
        elseif Lcase(fld[0]) = "lt"
            fld[0] = "<" + fld[1]
        elseif Lcase(fld[0]) = "amp"
            fld[0] = "&" + fld[1]
        elseif Lcase(fld[0]) = "quot"
            fld[0] = Q + fld[1]
        else
            fld[0] = fld[0] + ";" + fld[1]
        end if
        for j=2 to fld.Count()-1
            fld[0] = fld[0] + ";" + fld[j]
        end for
    end if
    txt = txt + fld[0]
end for
Wouldn't it be faster to replace the sequence of comparisons with a dictionary lookup (AKA the "rich man's switch()")? It comes with case-insensitivity pre-bundled:
BrightScript Debugger> deAmp = {gt:">", lt:"<", amp:"&", quot:chr(34)}
BrightScript Debugger> ? deAmp["GT"]
>
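For illustration, here is a rough sketch of how the lookup could replace the whole if/elseif chain. The function name UnescapeEntities is made up for this example, it scans with Instr/Mid instead of Tokenize, and I have not benchmarked it, so treat it as an untested idea rather than a drop-in replacement:

```brightscript
' Hypothetical helper (name is an assumption, not from the original post).
' Replaces &gt; &lt; &amp; &quot; via a case-insensitive dictionary lookup.
function UnescapeEntities(txt as String) as String
    deAmp = {gt: ">", lt: "<", amp: "&", quot: Chr(34)}
    out = ""
    pos = 1   ' Instr and Mid use 1-based positions
    while true
        amp = Instr(pos, txt, "&")
        if amp = 0 then exit while
        ' copy the plain text that precedes the "&"
        out = out + Mid(txt, pos, amp - pos)
        semi = Instr(amp + 1, txt, ";")
        rep = invalid
        if semi > 0 then rep = deAmp[Mid(txt, amp + 1, semi - amp - 1)]
        if rep <> invalid
            ' known entity: emit the replacement and skip past the ";"
            out = out + rep
            pos = semi + 1
        else
            ' not an entity we know: keep the "&" literally and move on
            out = out + "&"
            pos = amp + 1
        end if
    end while
    ' append whatever is left after the last "&"
    return out + Mid(txt, pos)
end function
```

Something like ? UnescapeEntities("a &lt; b &AMP; c") in the debugger would be a quick sanity check. Unknown entities and stray "&" characters are passed through unchanged.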
Also, have you guys (+TheEndless) posted sample XML somewhere that I could run benchmarks against? I feel bummed about how roXML guts text, and I am tempted to write my own parser. From what I gather, roXML is currently so-close-and-yet-completely-useless for parsing XHTML.