Entities are the funny things that start with & and end with ; and can be custom defined within a DTD.
Normally, your XML parser should just transparently decode them, and I should never have to look at the raw values.
However, if you check this MRSS feed:
http://feeds.theonion.com/OnionNewsNetwork?format=xml
You can see them doing dumb things like putting encoded entities into a CDATA block:
<itunes:summary><![CDATA[Jim and Tracy welcome Chris Morgan, the kindergartener who wrote the latest action-packed &quot;Fast And The Furious&quot; sequel.]]></itunes:summary>
<itunes:summary><![CDATA[The rest of this year's pop culture to be &quot;pretty rough,” The Economist lets readers catch up, and a Wal-Mart greeter knows exactly how many blacks are in the store. It's the week of April 18th, 2011.]]></itunes:summary>
This means I need to decode those bits by hand.
So, so far, I've done this:
Function ReplaceString(str As String, search As String, replace As String) As String
    count = 0 ' Be terrified of infinite loops: give up after 20 replacements
    idx = instr( 1, str, search ) ' instr positions are 1-based; 0 means "not found"
    while ( idx > 0 AND count < 20 )
        print "before: "; str
        ' Keep the text before the match, splice in the replacement, keep the text after it
        str = left( str, idx - 1 ) + replace + right( str, len( str ) - len( search ) - idx + 1 )
        idx = instr( 1, str, search )
        count = count + 1
    end while
    return str
End Function
Function decode(http As Object, s As Dynamic) As String
    ' Let's manually decode some XML entities since they come in from our Onion feed
    if ( type(s) = "String" AND s <> invalid AND s <> "" )
        s = ReplaceString( s, "&quot;", chr(34) )
        s = ReplaceString( s, "”", chr(34) )
    endif
    return http.Unescape( validstr( s ) )
End Function
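For what it's worth, the loop counter is only there because the search restarts from the beginning every time, so a replacement that contains the search text would never finish. A rough sketch of a version that only scans forward is below. ReplaceAllForward is a name I made up for illustration, and I have not profiled it, so treat it as a starting point rather than a fix:

```brightscript
' Sketch only: ReplaceAllForward is a hypothetical helper, not part of any Roku API.
Function ReplaceAllForward(text As String, search As String, replacement As String) As String
    if len(search) = 0 then return text ' nothing to look for
    result = ""
    start = 1 ' instr/mid positions are 1-based
    while start <= len(text)
        idx = instr(start, text, search)
        if idx = 0 then exit while
        ' Copy the untouched text before the match, then the replacement
        result = result + mid(text, start, idx - start) + replacement
        ' Resume after the match, so replaced text is never scanned again
        start = idx + len(search)
    end while
    ' Copy whatever follows the last match
    if start <= len(text) then result = result + mid(text, start)
    return result
End Function
```

Because the scan never revisits text it has already emitted, it terminates without a counter and is not capped at 20 replacements. It is still script code, though.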
But as anyone will likely tell you, implementing things like string replacement in script rather than in native code is going to be slow slow angry slow.
Realistically, not only should you add a "decode" function, but you should also probably add a similar "replace" function that is also running in native code for performance.
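Until something like that exists, one possible stopgap is roRegex, since its matching runs in native code. This is a minimal sketch, assuming roRegex is available on the firmware you target. DecodeQuotes is a made-up name, and it only handles the one entity from the feed above:

```brightscript
' Sketch only: assumes the roRegex component exists on the target firmware.
' DecodeQuotes is a hypothetical helper name, not a Roku API.
Function DecodeQuotes(s As String) As String
    ' The pattern is the literal entity; it contains no regex metacharacters
    quot = CreateObject("roRegex", "&quot;", "")
    ' ReplaceAll swaps every match for a plain double quote in one call
    return quot.ReplaceAll(s, chr(34))
End Function
```

Other entities would each need their own pattern, so this does not replace a real native "decode".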
Thanks for your interest!
mig