"renojim" wrote:
Now how all that chooses a character from the font I'm using, I have no idea, but then that's exactly why I wanted to test this in the first place.
Any Unicode "character" can be identified by a number, called a "code point".
The Euro sign has been assigned the code point 8364 in decimal (hex 20AC).
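In BrightScript, the Chr() global function builds a one-character string from a code point; a minimal sketch:

    euro = Chr(8364)    ' build a one-character string from the code point
    print euro          ' shows the Euro sign, font permitting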
The internal representation of a character is determined by its "encoding". One such encoding is UTF-8, which can represent any character as a sequence of one to four 8-bit bytes. The UTF-8 encoding for the Euro sign is the three-byte sequence: E2 82 AC (hex).
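From BrightScript you can inspect those bytes with an roByteArray; a small sketch (note that despite its name, FromAsciiString() copies the string's raw internal bytes):

    ba = CreateObject("roByteArray")
    ba.FromAsciiString(Chr(8364))   ' copy the Euro sign's internal bytes
    print ba.ToHexString()          ' E282AC
    print ba.Count()                ' 3 bytes for one character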
The external representation of a character is called a "glyph". Which glyph is used to represent a particular character depends on the "font" used to display it: a font maps code points to glyphs, and the font must contain a glyph for a character in order for that character to display correctly. Which glyphs a font contains is up to the font designer; there's no uniform standard for that, and some fonts contain more glyphs than others.
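Purely as a toy model (real font files are far more involved than this), you can picture a font as a lookup table from code points to glyphs, with a fallback when no glyph exists:

    ' toy model of a font: code point -> glyph, nothing like a real font file
    font = {}
    font["8364"] = "euro glyph"
    font["65"] = "letter A glyph"
    n = 9731                        ' snowman code point, not in our toy font
    cp = n.ToStr()
    if font[cp] <> invalid
        print "draw " + font[cp]
    else
        print "no glyph - draw a fallback box"
    end if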
To display a character correctly, then, you need both the correct encoding to interpret the character's internal byte representation AND a font that contains a glyph for that character.
The reason your Windows 7 console displayed gibberish is that it wasn't aware the text was encoded in UTF-8, i.e. it didn't know the 3-byte sequence E2 82 AC represented the Euro character. It displayed each byte as a separate character, because that's what its default encoding called for. If the console was using the common OEM code page 437, for example, those three bytes would come out as the three characters "Γé¼".
BrightScript stores characters internally using the UTF-8 encoding. 'Asc' returns the character's Unicode code point, not its first UTF-8 byte (regardless of what the documentation says). 'Len' returns the number of characters, not bytes (on this point the documentation is correct).
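A quick way to check that from BrightScript code (a minimal sketch; the values in the comments are what the behavior described above implies):

    euro = Chr(8364)    ' the Euro sign, built from its code point
    print Asc(euro)     ' 8364 - the code point, even though it's stored as 3 bytes
    print Len(euro)     ' 1 - counts characters, not bytes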