Roku Developer Program

RENJITHVR4
Visitor

How to parse HTML tags by using Brightscript?

Our API returns text that contains HTML tags; it is our privacy policy content. Is it possible to show this content without the HTML tags while keeping the right styling, such as font size and weight? Can the HTML tags be converted to an equivalent format for this? Please suggest the best approach.

For example:

<ol>\r\n\t<li>We use Personal Data to allow you to participate in the features on the Site, to process your registration, and to provide you with other requested content related to our content and other offerings. Click here to learn more&nbsp;</li></ol>
7 REPLIES
venkatareddy
Visitor

Re: How to parse HTML tags by using Brightscript?

Hi,
I am also looking for a solution to the same issue. If you have found one, please give me an update. Thanks in advance; I hope to hear back from you.
speechles
Roku Guru

Re: How to parse HTML tags by using Brightscript?

Brightscript Debugger> html = "<tag>hi there<another tag/><tag2> <TAG3>MORE</tag3>"

Brightscript Debugger> ? html
<tag>hi there<another tag/><tag2> <TAG3>MORE</tag3>

Brightscript Debugger> r = CreateObject("roRegex", "<.*?>", "") : ? r.ReplaceAll(html, "")
hi there MORE

Brightscript Debugger> html = "\r\n\tHELLO \r\r\rHOW ARE YOU?"

Brightscript Debugger> ? html
\r\n\tHELLO \r\r\rHOW ARE YOU?

Brightscript Debugger> r = CreateObject("roRegex", "(\\r|\\t|\\v|\\n)", "") : ? r.ReplaceAll(html, "")
HELLO HOW ARE YOU?


Brightscript Debugger> html = "<ol>\r\n\t<li>We use Personal Data to allow you to participate in the features on the Site, to process your registration, and to provide you with other requested content related to our content and other offerings. Click here to learn more&nbsp;</li></ol>"

' strip html tags
Brightscript Debugger> r = CreateObject("roRegex", "<.*?>", "") : html = r.ReplaceAll(html, "")

' strip carriage return, tab, vertical tab, newline
Brightscript Debugger> r = CreateObject("roRegex", "(\\r|\\t|\\v|\\n)", "") : html = r.ReplaceAll(html, "")

Brightscript Debugger> ?html
We use Personal Data to allow you to participate in the features on the Site, to process your registration, and to provide you with other requested content related to our content and other offerings. Click here to learn more&nbsp;

' strip non breaking space entity
Brightscript Debugger> r = CreateObject("roRegex", "&nbsp;", "") : ? r.ReplaceAll(html, "")
We use Personal Data to allow you to participate in the features on the Site, to process your registration, and to provide you with other requested content related to our content and other offerings. Click here to learn more
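
The debugger steps above can be gathered into one function. This is only a minimal sketch: the name StripHtml is hypothetical, and it assumes, as in the example string, that \r, \n, \t and \v arrive as literal two-character backslash sequences rather than as real control characters.

```brightscript
' Hypothetical helper (the name is illustrative, not part of any Roku API).
' Applies the three steps from the debugger session in order.
function StripHtml(html as String) as String
    ' 1. Strip anything that looks like a tag: "<", the shortest possible run, then ">"
    tagRegex = CreateObject("roRegex", "<.*?>", "")
    text = tagRegex.ReplaceAll(html, "")

    ' 2. Strip literal backslash sequences \r, \t, \v, \n
    '    (BrightScript string literals do not process escapes, so "\\r" in the
    '    pattern reaches the regex engine as \\r, i.e. a backslash followed by r)
    escapeRegex = CreateObject("roRegex", "(\\r|\\t|\\v|\\n)", "")
    text = escapeRegex.ReplaceAll(text, "")

    ' 3. Strip the non-breaking-space entity
    nbspRegex = CreateObject("roRegex", "&nbsp;", "")
    text = nbspRegex.ReplaceAll(text, "")

    ' Trim whitespace left at either end
    return text.Trim()
end function
```

Note that this only removes markup; it does not carry over any styling. With empty flags, "." does not match a real newline, so a tag that spans several lines would be left in place; passing "s" as the flags argument is one way to handle that.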
RokuNB
Roku Guru

Re: How to parse HTML tags by using Brightscript?

Don't use roRegex when a simple .Replace() would do; the latter is faster.
roXMLElement may help if the HTML in question is well-formed from the point of view of XML.
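
A rough sketch of the roXMLElement route, assuming the fragment is well-formed XML with a single root such as the <ol> in the question. The function name ExtractListItems is hypothetical. Because &nbsp; is not one of XML's predefined entities and a strict parser may reject it, the sketch swaps it for a plain space with .Replace() before parsing.

```brightscript
' Hypothetical helper: returns an array with the text of each <li>
' directly under the root element, or an empty array if parsing fails.
function ExtractListItems(html as String) as Object
    items = []

    ' Plain literal substitution, so .Replace() is enough here (no regex needed)
    cleaned = html.Replace("&nbsp;", " ")

    xml = CreateObject("roXMLElement")
    if not xml.Parse(cleaned) then return items

    ' GetChildElements() returns invalid when the root has no child elements
    children = xml.GetChildElements()
    if children = invalid then return items

    for each child in children
        if LCase(child.GetName()) = "li" then
            items.Push(child.GetText().Trim())
        end if
    end for
    return items
end function
```

Keeping the items separate, instead of flattening everything into one string, leaves room to render each entry in its own label with whatever font size and weight the screen needs.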
speechles
Roku Guru

Re: How to parse HTML tags by using Brightscript?

Replace doesn't do globbing or grouping, does it?

So we would still need regex to strip the HTML tags and possibly the grouped \r \n \t \v. You are right, though: the last part, where it strips off the &nbsp;, could have used Replace.
RokuNB
Roku Guru

Re: How to parse HTML tags by using Brightscript?

I doubt the actual string would have backslash literals; that's neither here (HTML) nor there (C source) encoding. In Roku-speak, \r\n\t would have been chr(13)+chr(10)+chr(8)
speechles
Roku Guru

Re: How to parse HTML tags by using Brightscript?

chr(8) is backspace. chr(9) = \t = horizontal tab and chr(11) = \v = vertical tab. You silly rabbit.

Brightscript Debugger> ? "no"+chr(8)+chr(8)+"yes"
yes
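
If the text does arrive with real control characters instead of literal backslash sequences, the character codes above can be used directly. A minimal sketch, with a hypothetical function name:

```brightscript
' Hypothetical helper: removes real CR (13), LF (10), horizontal tab (9)
' and vertical tab (11) characters from a string.
function StripControlChars(text as String) as String
    for each code in [13, 10, 9, 11]
        ' Each target is a single literal character, so .Replace() suffices
        text = text.Replace(Chr(code), "")
    end for
    return text
end function
```

For example, StripControlChars("HELLO" + Chr(13) + Chr(10) + Chr(9) + "WORLD") returns "HELLOWORLD". Replacing with " " instead of "" would keep words separated where a line break was the only thing between them.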
RokuNB
Roku Guru

Re: How to parse HTML tags by using Brightscript?

"speechles" wrote:
chr(8) is backspace. chr(9) = \t = horizontal tab and chr(11) = \v = vertical tab. You silly rabbit.

i stand corrected.