"EnTerr" wrote:"RokuKC" wrote:
I think it is unlikely that Roku would add support for embedded chr(0) characters in BrightScript strings.
Hm, let me correct myself: I implied that the only way to allow U+0000 is to re-implement the String type with a length counter field. I was wrong. It turns out there is a wickedly clever way to represent \0 in UTF-8, as the two octets 0xC0 0x80, so that the dreaded 0x00 octet is never used. See Modified UTF-8.
However, if I try to do that through roByteArray, bizarre things happen:
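To make the 0xC0 0x80 trick concrete, here is a minimal sketch in Python (not BrightScript). It covers only the NUL rule of Modified UTF-8; the surrogate-pair handling of supplementary characters is left out, and the two helper names are hypothetical.

```python
def encode_modified_utf8_nul(text: str) -> bytes:
    """Encode text as UTF-8, but write U+0000 as the overlong pair C0 80.

    Hypothetical helper: only the NUL rule of Modified UTF-8 is sketched.
    In well-formed UTF-8 the byte 00 appears only as the encoding of U+0000,
    so a plain byte replacement is sufficient.
    """
    return text.encode("utf-8").replace(b"\x00", b"\xc0\x80")


def decode_modified_utf8_nul(data: bytes) -> str:
    """Map C0 80 back to 00, then decode as strict UTF-8.

    C0 never occurs in well-formed UTF-8, so the pair C0 80 is unambiguous.
    """
    return data.replace(b"\xc0\x80", b"\x00").decode("utf-8")


encoded = encode_modified_utf8_nul("a\x00b")
print(encoded.hex())                                   # 61c08062
print(b"\x00" in encoded)                              # False
print(decode_modified_utf8_nul(encoded) == "a\x00b")   # True
```

The encoded bytes contain no 0x00 octet, so they survive NUL-terminated string handling, at the cost of no longer being standard UTF-8.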
Brightscript Debugger> ba = createObject("roByteArray"): ba.fromHexString("c080")
Brightscript Debugger> s = ba.toAsciiString(): ? len(s), s, asc(s)
1 ?? 63
What's going on here? Seems like a bug - len() is correct but the rest is whacked?
"belltown" wrote:
C080, or any 2-byte sequence starting with C0 or C1, is just not a valid UTF-8 sequence. The official standard doesn't allow "overlong" encodings (representing a character with a 2-byte encoding when that character can be encoded in a single byte). BrightScript is just implementing the official standard.
Oh, please! Don't tell me BrightScript is "holier-than-thou Java, Android and TCL" in implementing the UTF-8 standard - that would be a ridiculous stance to take. :roll:
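For reference, the overlong-encoding rule can be checked against any strict UTF-8 decoder. The sketch below uses Python's built-in decoder and the hypothetical helper `is_valid_utf8`. It shows what a standards-conforming decoder does with C0 80; it says nothing about how roByteArray is implemented internally.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if data decodes as strict, standard UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


overlong_nul = bytes.fromhex("c080")   # the sequence from the debugger session

print(is_valid_utf8(overlong_nul))     # False: C0 is never a valid lead byte
print(is_valid_utf8(b"\x00"))          # True: the one-byte form of U+0000

# A lenient decode substitutes replacement characters instead of raising.
lenient = overlong_nul.decode("utf-8", errors="replace")
print([hex(ord(ch)) for ch in lenient])
```

A decoder that substitutes a placeholder for invalid input, rather than failing, would be consistent with asc(s) returning 63, the code for "?", though that reading of the debugger output is an assumption.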
Regarding your earlier point: "As far as i can tell the \0 arcane quirk is not even documented." -- it's in the roByteArray description in the Component Reference.
"Beware of the leopard!" - you are looking in the wrong place. NUL is a legitimate character in ASCII, Unicode and BASIC. Imagine you never knew C in your life. What's the expected outcome from the following?
for i = 0 to 10: ? len(chr(i)), : next
Having Chr(0) return an empty string is 'as designed'.
BrightScript is not BASIC.
The BrightScript Chr() behavior of returning an empty string for 0 and other invalid code points is intentional and was done with forethought.
I'm not sure why you are attributing sarcasm or non-genuineness to my attempt to provide information. 😞
BrightScript does not support embedded NUL characters in strings.