Forum Discussion

quartern
Visitor
10 years ago

Does roByteArray:FromBase64String() not like UTF8?

Hi

I'm trying to use some data that can be retrieved online as base64-encoded JSON.
I'm a little confused, because yesterday a version of this seemed to work and now it does not.
So I took a snapshot of the data that doesn't work and stored it as a gist (to have a repeatable test).

I KNOW I won't be able to render the international characters and will live with that (for now),
but the base64 decode seems to have some issue (is it the non-ASCII characters, or the overall size? I don't know).

Using an online base64 test tool (https://www.base64decode.org/) and a JSON validation tool (https://jsonformatter.curiousconcept.com/),
I surmise that the data itself is actually fine. I even took a decoded version
and served it up to this Roku test (bypassing the base64 decode),
and ParseJSON() didn't seem to have any issues with the international characters.


Function test()
    request = CreateObject("roUrlTransfer")
    request.SetCertificatesFile("common:/certs/ca-bundle.crt")
    request.AddHeader("X-Roku-Reserved-Dev-Id", "")
    request.InitClientCertificates()
    request.SetUrl("https://gist.githubusercontent.com/quartern/c4bb656c005ded4b3c3bead99dd5b71a/raw/e35ae64009457d050e24cbdf0ebf22fc8d4aecd8/gistfile1.txt")

    tmpfile="tmp:/quartern_resp.txt"
    request.GetToFile(tmpfile)
    ba = CreateObject("roByteArray")
    if not ba.ReadFile(tmpfile)
        print "ReadFile failed"
    else if ba.count() < 1
        print "Empty response"
    else
        ' we know we get these with quotes - strip them
        txt=ba.ToAsciiString()
        print "B4 STRIP >>>"+txt.left(10)+"..."+txt.right(10)+"<<<"
        ba.pop()
        ba.shift()
        txt=ba.ToAsciiString()
        print "AFTER STRIP >>>"+txt.left(10)+"..."+txt.right(10)+"<<<, count=",ba.count()
        ' decode and parse
        ba.FromBase64String(ba.ToAsciiString())
        txt=ba.ToAsciiString()
        print "AFTER DECODE >>>"+txt.left(10)+"..."+txt.right(10)+"<<<, count=",ba.count()
        theData=ParseJSON(ba.ToAsciiString())
    endif

    DeleteFile(tmpfile)
    theData.x=1 ' just force debugger on fail
    return theData
end Function




Here is the relevant output in the debugger:
B4 STRIP >>>"eyJ3Y2ZJc...lNiI6W119"<<<
AFTER STRIP >>>eyJ3Y2ZJcC...xlNiI6W119<<<, count= 271071
AFTER DECODE >>>{"wcfIp":[...??? ??? ??<<<, count= 203301
BRIGHTSCRIPT: ERROR: ParseJSON: Unterminated string: ...


The size ratio seems reasonable, but the end of the decoded string should be something like
,"Table6":[]}

so the question marks do not make sense

Are there any size or character limitations on the base64 encoded data?

Thanks
--QuarterN

4 Replies

  • I think I found it.

    The input included a few backslash characters, each escaping a forward slash. (Forward slash is a valid base64 character,
    so why does the server escape it within a string? Maybe a JavaScript thing, IDK.)
    15k\/INek16jXpyAxIiwibWVkaWFfY29kZSI6Ijg4OTIzNyIsIm1lZGlhX2Rlc2MiOiLXkNeZ16TXlCD


    It would be great if the Base64 decoder would print the value and offset of the character on which it choked

    The only way for me to locate this was to implement/port base64 decoder (included here for "fun")

    ' based on https://en.wikibooks.org/wiki/Algorithm_Implementation/Miscellaneous/Base64
    Function Decode64FromFile(fname as String) as String
        ' Sentinel values stored in the lookup table for non-data characters
        WHITESPACE_CHR=64
        EQUALS_CHR=65
        INVALID_CHR=66

        ' Lookup table indexed by byte value (0-255): base64 digit value or sentinel
        d = [
        66,66,66,66,66,66,66,66,66,66,64,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,62,66,66,66,63,52,53,
        54,55,56,57,58,59,60,61,66,66,66,65,66,66,66, 0, 1, 2, 3, 4, 5, 6, 7, 8, 9,
        10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,66,66,66,66,66,66,26,27,28,
        29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,66,
        66,66,66,66,66,66
        ]

        iter = 0
        buf = 0
        currLen = 0

        baIn = CreateObject("roByteArray")
        baOut = CreateObject("roByteArray")
        baIn.ReadFile(fname)

        inCount=0
        for each inByte in baIn
            c = d[inByte]
            inCount+=1

            if c=INVALID_CHR
                ' report the offending byte value and its zero-based offset, then skip it
                print "Invalid character (byte value, position):", inByte, inCount-1
            else if c=WHITESPACE_CHR
                ' ignore
                print "ignore whitespace"
            else if c=EQUALS_CHR
                ' padding marks the end of the data
                exit for
            else
                buf = (buf<<6) or c
                iter+=1 ' increment the number of iterations
                ' If the buffer is full, split it into bytes
                if iter = 4
                    currLen += 3
                    baOut.push((buf >> 16) and 255)
                    baOut.push((buf >> 8) and 255)
                    baOut.push(buf and 255)
                    buf = 0
                    iter = 0
                end if
            end if
        end for

        ' Flush a trailing partial group (3 digits = 2 bytes, 2 digits = 1 byte)
        if iter = 3
            currLen += 2
            baOut.push((buf >> 10) and 255)
            baOut.push((buf >> 2) and 255)
        else if iter = 2
            currLen += 1
            baOut.push((buf >> 4) and 255)
        end if

        return baOut.ToAsciiString()
    end function
  • amazing, works better than the builtin FromBase64String

    Thank you so much !
  • quartern -
    this is a VERY interesting puzzle you ran into!

    You were using PHP^, yes? :oops: Its json_encode() escapes slashes.

    It actually took a "double whammy" for this to happen - your second mistake was not running ParseJSON() on the downloaded JSON but instead using .shift() & .pop() shenanigans to get rid of the quotes. That left the \/ inside, something the parser would have taken care of.

    That's why i am amused by the case, it's a "twofer" that popped up in a completely unrelated place (base64 decoding).

    (^) Friends don't let friends use PHP!
    On the other hand, were i in this trade, i would actively encourage my competition to use PHP and MySql...
  • "Arthy74" wrote:
    amazing, works better than the builtin FromBase64String

    Thank you so much !


    U R Welcome - it's good for debugging - but for me it was not faster (though I only timed a modified version that does it in memory, not via a file; I should re-test with the above)
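
For anyone landing here later, the fix suggested above (let the JSON parser undo the \/ escapes instead of stripping the quotes by hand) might look roughly like the sketch below. DecodeWrappedBase64Json is a made-up helper name, not something from the thread or the Roku SDK. I have not checked whether ParseJson accepts a bare top-level string on every firmware version, so the sketch wraps the quoted body in [ ] to make it an array first; treat that as an assumption.

```brightscript
' Hypothetical helper (name and shape are assumptions, not from the thread):
' takes the raw response body, i.e. a JSON string literal such as "eyJ3Y2ZJc...",
' and returns the parsed inner JSON, or invalid on failure.
Function DecodeWrappedBase64Json(body as String) as Dynamic
    ' Wrap the quoted string in [ ] so the top-level JSON value is an array.
    ' ParseJson then removes the quotes and turns \/ back into /.
    wrapped = ParseJson("[" + body + "]")
    if wrapped = invalid then return invalid
    if wrapped.Count() <> 1 then return invalid

    b64 = wrapped[0]
    ' Make sure the single element really is a string
    if GetInterface(b64, "ifString") = invalid then return invalid

    ' The string is now clean base64, so the built-in decoder can be used
    ba = CreateObject("roByteArray")
    ba.FromBase64String(b64)

    ' Same decode-then-parse step as in the original test()
    return ParseJson(ba.ToAsciiString())
End Function
```

In the original test() this would replace the pop()/shift() lines and the decode step: read the file into ba as before, then call theData = DecodeWrappedBase64Json(ba.ToAsciiString()). Stripping the quotes by hand leaves every \/ in place, which is presumably what the built-in decoder choked on.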