"TheEndless" wrote:
I suspect they expanded support, in general, in 5.4 to include additional encodings, as well as Unicode (per the observation with the stick here)
Yeah, the thing is - it is tricky.
First, there is no "Unicode encoding" as such. In Unicode each character is represented by a "code point", a number in the range 0 - 0x10FFFF. That means we would need at least 21 bits per character, which is unwieldy (it does not fit neatly into octets) - so instead Unicode text is stored in some encoding, like UTF-8, which for all practical purposes is THE way to store/transfer Unicode text. Each Unicode character gets stored in 1, 2, 3 or 4 bytes: the ASCII characters \0 - \x7F stay exactly the same single byte in UTF-8, and all other characters become sequences of 2 (or more) bytes >= \x80.

Consequently, if I give you a file where all bytes are < \x80, it is valid both as UTF-8 and as ASCII/ANSI. If there are bytes > \x7F though, it may be either UTF-8 or a 1-byte ISO-8859-X encoding (where X, and hence the language, is unknown)... or something else entirely.
So I am curious how they try to guess the encoding of the SRT. There is one cheap way - more of a hack, really - which is to check whether the file starts with \xEF \xBB \xBF (the UTF-8 BOM) and, if so, treat it as UTF-8. The problem is that if it does not start with a BOM, it can still be either UTF-8 or ISO/ANSI.
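For what it's worth, here is a rough Python sketch of the kind of guessing I have in mind: the BOM check, plus a strict UTF-8 trial decode as a fallback heuristic. This is just my guess at one possible approach, not a claim about what the firmware actually does, and the file name is made up.

```python
# A sketch of one possible detection approach (illustration only,
# no claim this is what the firmware does).

def guess_srt_encoding(data: bytes) -> str:
    # Cheap hack: a UTF-8 BOM at the start is a dead giveaway.
    if data.startswith(b"\xEF\xBB\xBF"):
        return "utf-8-sig"
    # No BOM: try a strict UTF-8 decode. Valid UTF-8 multi-byte
    # sequences rarely occur by accident in single-byte ISO/ANSI text.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Bytes >= 0x80 that don't form valid UTF-8 sequences:
        # some single-byte encoding, but which ISO-8859-X it is
        # cannot be determined from the bytes alone.
        return "iso-8859-?"

with open("subtitles.srt", "rb") as f:   # hypothetical file name
    print(guess_srt_encoding(f.read()))
```

The trial-decode fallback works because text in a single-byte encoding that happens to contain bytes >= \x80 almost never forms valid UTF-8 multi-byte sequences by accident - but it is still a heuristic, not a proof, and it says nothing about which ISO-8859-X variant you actually have.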