Roku Developer Program

philotas
Level 7

Re: ifString.GetEntityDecode() ?

"EnTerr" wrote:
The strings already have getEntityEncode() method. The opposite is a one-liner: 
PRINT (myStringExpression).replace("&quot;", """").replace("&apos;", "'").replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")


Correct. This is what I want to do, but without having to write this one-liner by hand for every entity out there, since there are many more than the ones in your example.

Update: I found that the XmlDecode function mentioned above does indeed work, but a quick test showed that it does not handle named entities:
For example, &#8364; works but &euro; does not.
EnTerr
Level 9

Re: ifString.GetEntityDecode() ?

"philotas" wrote:
Update: I found that the XmlDecode function mentioned above does indeed work, but a quick test showed that it does not handle named entities: For example, &#8364; works but &euro; does not.

That must be because XML has only 5 predefined character entities, whereas HTML has a couple of hundred of them.

Again - can you explain your use case?
How and why are you getting text encoded as HTML?
Why not receive the data as JSON or XML (or plain old text in UTF-8, for that matter)?
I could sketch you a function to tackle that, were I persuaded it's necessary.
philotas
Level 7

Re: ifString.GetEntityDecode() ?

I receive (simple) HTML-formatted text from a server via JSON and want to display it in a Label.
I strip out some tags, but there could be some HTML Entities which I want to convert.
EnTerr
Level 9

Re: ifString.GetEntityDecode() ?

"philotas" wrote:
I receive (simple) HTML-formatted text from a server via JSON and want to display it in a Label.
I strip out some tags, but there could be some HTML Entities which I want to convert.

Do you have the leeway to change that, as in simply not sending HTML?
That will make your life notably easier.

The reason is that you don't need HTML on the Roku side (it is not rendered), nor do you need it for transport: JSON can already carry every Unicode character, so just make sure the HTTP encoding is UTF-8 (it should be the default). The only remaining concern is having a font with the more exotic glyphs, which has to be tackled either way.
philotas
Level 7

Re: ifString.GetEntityDecode() ?

"EnTerr" wrote:
"philotas" wrote:
I receive (simple) HTML-formatted text from a server via JSON and want to display it in a Label.
I strip out some tags, but there could be some HTML Entities which I want to convert.

Do you have the leeway to change that, as in simply not sending HTML?

Unfortunately not for the time being.
EnTerr
Level 9

Re: ifString.GetEntityDecode() ?

I have to say, I have no respect for the way parsing fails outright on an unknown character entity reference. Even if that's mandated by a spec, the pragmatic thing would have been to leave unrecognized items alone:
Brightscript Debugger> xml = CreateObject("roXmlElement")
Brightscript Debugger> ? xml.parse("<x>&#8364; and &euro;</x>"), xml.getText()
false          

Brightscript Debugger> ? xml.parse("<x>just &#8364;</x>"), xml.getText()
true            just €


Here is a $64k question: does roXmlElement.parse() support DTDs? Because if it does... piece of cake, we can just do something like this and the entities will get defined and handled:

<?xml version="1.0" encoding="utf-8"?>
<!DOCTYPE x PUBLIC "-//W3C//ENTITIES Special for XHTML//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml-special.ent">
<x> ...
EnTerr
Level 9

Re: ifString.GetEntityDecode() ?

Affirmative on DTD support!
Brightscript Debugger> dtd2 = "<!DOCTYPE x [ <!ENTITY euro ""&#8364;""> ]>"
Brightscript Debugger> ? xml.parse(dtd2 + "<x>mwa&euro;hahaha</x>"), xml.getText()
true            mwa€hahaha
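
Building on the DTD result above, here is a minimal sketch of how the trick could be wrapped into a reusable function. The name decodeHtmlEntities and the entity table are assumptions made up for illustration, not a built-in API, and only the entities you list get decoded. It also assumes the tags have already been stripped, since leftover markup or a bare "&" would likely make parse() fail or change what getText() returns.

```brightscript
' Hypothetical helper, not a built-in: decodes named entities by declaring
' them in an internal DTD subset and letting roXMLElement expand them.
function decodeHtmlEntities(text as String) as String
    ' Illustrative table only; extend it with the entities your server sends.
    entities = {
        euro: "&#8364;",
        nbsp: "&#160;",
        copy: "&#169;",
        ndash: "&#8211;",
        mdash: "&#8212;",
        hellip: "&#8230;"
    }

    ' Build <!DOCTYPE x [ <!ENTITY euro "&#8364;"> ... ]>
    dtd = "<!DOCTYPE x ["
    for each key in entities
        dtd = dtd + "<!ENTITY " + key + " " + Chr(34) + entities[key] + Chr(34) + ">"
    end for
    dtd = dtd + "]>"

    xml = CreateObject("roXMLElement")
    if xml.parse(dtd + "<x>" + text + "</x>")
        return xml.getText()
    end if

    ' parse() returned false (e.g. an undeclared entity): hand back the input unchanged.
    return text
end function
```

Usage would look like `label.text = decodeHtmlEntities(strippedText)`, where label and strippedText are placeholders. Returning the input unchanged on failure is a deliberate choice here: it gives the "leave unrecognized items alone" behavior for the whole string, at the cost of decoding nothing when a single entity is unknown.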

Victory is mine: https://www.youtube.com/watch?v=x3rWRLOZX6U
belltown
Level 7

Re: ifString.GetEntityDecode() ?

"philotas" wrote:
I receive (simple) HTML-formatted text from a server via JSON and want to display it in a Label.
I strip out some tags, but there could be some HTML Entities which I want to convert.

Even if you are able to convert html-formatted JSON text to display in a Label, bear in mind that you may still need to convert some characters to characters that your Roku font can handle. This applies regardless of whether the character is represented as an entity reference or even as a raw Unicode codepoint.

For instance, while the Euro character (8364) may render correctly, a character such as a hyphen (8208) will not, so you'd have to do something like:

text = text.Replace(Chr(8208), "-")
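
If more than one character needs this treatment, the same Replace call can be driven from a small table. This is only a sketch: the function name replaceUnsupportedChars is hypothetical, and apart from 8208 the code points listed are guesses at what a given font might lack, so check them against the font you actually ship.

```brightscript
' Hypothetical helper: swaps code points the font may not render for plain substitutes.
function replaceUnsupportedChars(text as String) as String
    ' Code point -> replacement. Only 8208 comes from the post above;
    ' 8209 (non-breaking hyphen) and 8239 (narrow no-break space) are assumed examples.
    substitutes = [
        { code: 8208, plain: "-" },
        { code: 8209, plain: "-" },
        { code: 8239, plain: " " }
    ]
    for each item in substitutes
        text = text.Replace(Chr(item.code), item.plain)
    end for
    return text
end function
```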
https://github.com/belltown/