String Len() and UTF8

According to the documentation, the string function len() works with ASCII.
But if a string contains multi-byte UTF-8 sequences, len() counts each sequence as one character, not as the number of bytes it occupies.

x = CreateObject("roByteArray")
x[0] = 236
x[1] = 151
x[2] = 176

? "array count = " x.Count()
str = x.ToAsciiString()
? "string length = " Len(str)

Output:

array count = 3
string length = 1

This bit me for a day while I tried to figure out why I kept losing a connection that uses chunked encoding.

It would be nice if you could do a Count() on a string!

Is there a better way than this?

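' Returns the number of bytes in the string, not the number of characters.
Function Length( in as string ) as integer
    ' Copy the string's bytes into a byte array and count them.
    a = CreateObject("roByteArray")
    a.FromAsciiString( in )
    return a.Count()
End Function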
Re: String Len() and UTF8

I doubt it is any consolation, but here `len()` is doing the right thing: the UTF-8 byte sequence you gave (ec 97 b0) represents a single Unicode character, U+C5F0, hence the string is 1 character long.

(And yes, I understand your plight and have no better solution for it.)
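
For anyone who wants to check the arithmetic, here is a minimal sketch of decoding that three-byte sequence by hand. It assumes the standard three-byte UTF-8 layout (1110xxxx 10xxxxxx 10xxxxxx); the variable names are only illustrative.

```brightscript
' Decode the three-byte UTF-8 sequence ec 97 b0 by hand.
' Layout assumed: 1110xxxx 10xxxxxx 10xxxxxx (16 payload bits in total).
b0 = &hEC
b1 = &h97
b2 = &hB0

' Mask off the marker bits, then shift each part into place by multiplying.
codePoint = (b0 and &h0F) * 4096 + (b1 and &h3F) * 64 + (b2 and &h3F)

? "code point = "; codePoint   ' 50672 decimal, which is C5F0 in hex
```

Three bytes go in and one code point comes out, which is why the character count is 1.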
Re: String Len() and UTF8

Yes, this was more of a warning to other devs who might hit it.
Maybe this behavior of the function should be better documented in the SDK.
Re: String Len() and UTF8

As you discovered, Len returns the number of characters in the string, not bytes. I've added a note to the documentation to clarify this. Just out of curiosity, why do you need to know the number of bytes?

Re: String Len() and UTF8

I have random log content that I'm sending to a browser window using chunked encoding.
For each chunk, the first thing you have to send is the number of bytes, followed by the string.
I was using len() to get the byte count, but it gives the wrong number when the string contains multi-byte UTF-8 characters.
It works fine now using my Length() function.
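
In case it helps anyone else, here is a rough sketch of how one chunk could be framed with the byte count rather than the character count. It assumes the usual HTTP/1.1 chunk layout (size in hexadecimal, CRLF, data, CRLF). ToHex() and MakeChunk() are hypothetical helper names made up for this example, not SDK functions.

```brightscript
' Hypothetical helper: convert a non-negative integer to a hex string.
Function ToHex(n as Integer) as String
    digits = "0123456789ABCDEF"
    if n = 0 then return "0"
    result = ""
    while n > 0
        ' Mid() is 1-based, so add 1 to the remainder.
        result = Mid(digits, (n mod 16) + 1, 1) + result
        n = n \ 16
    end while
    return result
End Function

' Hypothetical helper: frame one chunk using the BYTE length of the body.
Function MakeChunk(body as String) as String
    crlf = Chr(13) + Chr(10)
    bytes = CreateObject("roByteArray")
    bytes.FromAsciiString(body)   ' same trick as Length() above
    return ToHex(bytes.Count()) + crlf + body + crlf
End Function
```

After the last data chunk, the stream is ended with a zero-size chunk, that is "0" followed by two CRLF pairs.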
