Roku Developer Program

Developers and content creators—a complete solution for growing an audience directly.
migmigmig
Level 7

ifString.GetEntityDecode() ?

There are functions for % escape and % unescape, and there's a function for entity encoding.

Is there anything for entity decoding? The string-level search-and-replace functionality is a little caveman.

mig
0 Kudos
17 Replies
RokuKevin
Level 9

Re: ifString.GetEntityDecode() ?

You can try unencodedString = roUrlTransfer.unescape(encodedString)

--Kevin
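
For anyone following along, here is a minimal sketch of that call. Unescape is a method on an roUrlTransfer instance, so one has to be created first; the sample strings are made up for illustration. Note that it only handles %XX sequences, not entities.

```brightscript
Sub Main()
    ' Unescape lives on an roUrlTransfer instance, not on the string itself
    xfer = CreateObject("roUrlTransfer")

    ' Illustrative input; %20 is a percent-encoded space
    encodedString = "Fast%20And%20The%20Furious"
    unencodedString = xfer.Unescape(encodedString)
    print unencodedString   ' Fast And The Furious

    ' Entities pass through unchanged, since they are not percent-encoded
    print xfer.Unescape("&quot;pretty rough&quot;")   ' &quot;pretty rough&quot;
End Sub
```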
0 Kudos
migmigmig
Level 7

Re: ifString.GetEntityDecode() ?

I'm already sending all my strings through the unescape, but thanks.
0 Kudos
TheEndless
Level 7

Re: ifString.GetEntityDecode() ?

Can you provide an example of the kind of string you're wanting to decode?
My Channels: http://roku.permanence.com - Twitter: @TheEndlessDev
Instant Watch Browser (NetflixIWB), Aquarium Screensaver (AQUARIUM), Clever Clocks Screensaver (CLEVERCLOCKS), iTunes Podcasts (ITPC), My Channels (MYCHANNELS)
0 Kudos
migmigmig
Level 7

Re: ifString.GetEntityDecode() ?

Entities are the funny things that start with & and end with ; and can be custom defined within a DTD.

Normally, your XML parser should just transparently decode it and I should never have to look at the values.

However, if you check this MRSS feed:
http://feeds.theonion.com/OnionNewsNetwork?format=xml

You can see them doing dumb things like putting encoded entities into a CDATA block:

<itunes:summary><![CDATA[Jim and Tracy welcome Chris Morgan, the kindergartener who wrote the latest action-packed &quot;Fast And The Furious&quot; sequel.]]></itunes:summary>

<itunes:summary><![CDATA[The rest of this year's pop culture to be &quot;pretty rough,&rdquo; The Economist lets readers catch up, and a Wal-Mart greeter knows exactly how many blacks are in the store. It's the week of April 18th, 2011.]]></itunes:summary>


This means I need to decode those bits by hand.

So, so far, I've done this:

Function ReplaceString(str As String, search As String, replace As String) As String

    ' An empty search string can never be consumed, so bail out early
    if search = "" then return str

    idx = instr( 1, str, search )
    while ( idx > 0 )
        str = left( str, idx - 1 ) + replace + right( str, len( str ) - len( search ) - idx + 1 )
        ' Resume after the inserted text so the loop always terminates,
        ' even when the replacement itself contains the search string
        idx = instr( idx + len( replace ), str, search )
    end while

    return str

End Function

Function decode(http As Object, s As Dynamic) As String

    ' Let's manually decode some XML entities since they come in from our Onion feed
    if ( type(s) = "String" AND s <> invalid AND s <> "" )
        s = ReplaceString( s, "&quot;", chr(34) )
        s = ReplaceString( s, "&rdquo;", chr(34) )
    endif

    return http.Unescape( validstr( s ) )

End Function



But as anyone will likely tell you, implementing things like string replacement in script rather than in native code is going to be slow slow angry slow.

Realistically, not only should you add a "decode" function, but you should also probably add a similar "replace" function that is also running in native code for performance.

Thanks for your interest!

mig
0 Kudos
TheEndless
Level 7

Re: ifString.GetEntityDecode() ?

That's what I thought you meant, actually, but wanted to be sure. HtmlEncode and HtmlDecode functions would definitely be worthwhile additions, but in their absence, I would think a RegEx-based decoder would be much more efficient than that ReplaceString method.

Or maybe something a little more hacky:
Function XmlDecode(encoded As String) As String
    xml = CreateObject("roXmlElement")
    If xml.Parse("<encoded>" + encoded + "</encoded>") Then
        Return xml.GetText()
    End If
    Return encoded
End Function
0 Kudos
migmigmig
Level 7

Re: ifString.GetEntityDecode() ?

Well, if we were going to pull out our wishlists, I'd say we want:

1) The full functionality that js strings have (including array-joins, regex replace, splice replace, etc)
2) JSON parsing into associative arrays (since XML is so unwieldy in HTML-land, people have been migrating to JSON... which is itself unwieldy in systems like this)

:)

I thought about pushing XML into and then pulling it back from the XMLElement object, but since maybe only 1 in 50 strings actually has an entity that I need to replace, I figured the first walk to find any entities at all would be much cheaper than building and destroying an XML object.
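
The two ideas can be combined: scan for an ampersand first, and only build the XML object for the few strings that need it. A rough sketch, assuming the XmlDecode trick from earlier in the thread; the name DecodeIfNeeded is hypothetical, and I haven't measured whether it is actually faster.

```brightscript
' Hypothetical helper: the name and structure are illustrative only
Function DecodeIfNeeded(s As String) As String
    ' Cheap scan first: no "&" means there is no entity to decode
    if Instr(1, s, "&") = 0 then return s

    ' Only now pay for the XML object (same trick as XmlDecode)
    xml = CreateObject("roXmlElement")
    if xml.Parse("<encoded>" + s + "</encoded>") then
        return xml.GetText()
    end if

    ' Parsing may fail, e.g. on entities that XML does not predefine
    ' such as &rdquo;, so fall back to the original string
    return s
End Function
```

Strings without any entity return immediately after a single Instr call, so the XML object is never created for them.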

I will say, I really don't have any good sense for performance whatsoever in this language.
0 Kudos
katycorp
Level 7

Re: ifString.GetEntityDecode() ?

Huge Bump. HTML entity encoding/decoding would be a big help.
0 Kudos
philotas
Level 7

Re: ifString.GetEntityDecode() ?

Is HTML entity decoding available by now?
0 Kudos
EnTerr
Level 9

Re: ifString.GetEntityDecode() ?

"philotas" wrote:
Is HTML entity decoding available by now?

Can you be more specific about what you are trying to do?
Most HTML you can run through the roXmlElement parser and fetch the decoded element text, including through the ingenious hack by @theEndless above.

Strings already have a getEntityEncode() method. The opposite is a one-liner:
PRINT (myStringExpression).replace("&quot;", """").replace("&apos;", "'").replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")
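
Wrapped in a function, the same chain might look like the sketch below. The name EntityDecode is made up for illustration; it covers only the five predefined XML entities, so things like &rdquo; would still need their own replace call.

```brightscript
' Hypothetical wrapper around the replace chain; handles only the
' five predefined XML entities
Function EntityDecode(s As String) As String
    s = s.Replace("&quot;", Chr(34))   ' Chr(34) is the double-quote character
    s = s.Replace("&apos;", "'")
    s = s.Replace("&lt;", "<")
    s = s.Replace("&gt;", ">")
    ' Decode &amp; last so that "&amp;lt;" becomes "&lt;" rather than "<"
    return s.Replace("&amp;", "&")
End Function

Sub Main()
    print EntityDecode("&quot;pretty rough&quot;")   ' "pretty rough"
    print EntityDecode("&amp;lt;")                   ' &lt;
End Sub
```

The order matters: if &amp; were decoded first, a double-encoded string such as "&amp;lt;" would be decoded twice and come out as "<".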
0 Kudos