Roku Developer Program

Developers and content creators—a complete solution for growing an audience directly.
destruk
Level 10

Parsing JSON v XML

Parsing the content as JSON appears to be twice as fast as parsing XML.  Switching existing feeds over to JSON is a big change to the server code, but the payoff is worth it.  I'll need to run some profiling now to quantify the gain, which will require two sets of server code to compare against two versions of channel code.

[spun off viewtopic.php?f=34&t=101573 --RokuNB]
Roku Employee

Re: ObserveField

"destruk" wrote:
Parsing the content as JSON appears to be twice as fast as parsing XML.  Switching existing feeds over to JSON is a big change to the server code, but the payoff is worth it.  I'll need to run some profiling now to quantify the gain, which will require two sets of server code to compare against two versions of channel code.

Sharing your "field experience" switching between XML and JSON would be appreciated!
For example: a comparison of the feed sizes in both formats, parse time on a (low-end?) player, maybe download time, etc.
destruk
Level 10

Re: Parsing JSON v XML

The profiler results, using the same feed for both methods:
JSON
 structuredata2()
51641 35612 87253 0.2950 0.1045 0.3996 1
 strreplace()
35612 0 35612 0.1045 0.0000 0.1045 417

XML
 structuredata()
77657 35542 113199 1.3676 0.1073 1.4749 1
 strreplace()
35542 0 35542 0.1073 0.0000 0.1073 416

Parsing 208 content items: XML takes 1.4749, JSON takes 0.3996, so JSON is about 4 times faster.  In actual operation it's the difference between an execution timeout at about 330 items and successfully parsing 500+ items without a timeout (2 seconds vs 5+ seconds).

XML parse code:
Function StructureData(Source As String) As Object
    ' Build a ContentNode tree (root -> row -> items) from the XML feed.
    content=createObject("roSGNode","ContentNode")
    contentxml=createObject("roXMLElement")
    contentxml.parse(Source)

    ' Only the first category in the feed is used.
    categories=contentxml.GetChildElements()
    categorynames=[]
    categorynames.Push(categories[0]@name)
    row=CreateObject("RoSGNode","ContentNode")
    row.Title=categorynames[0]
    If (categories[0].GetChildElements())<>invalid
        categoryitems=categories[0].GetChildElements()
        y=categoryitems.count()-1
        For z=0 To y
            item=CreateObject("RoSGNode","ContentNode")
            item.ContentID=categories[0].entry[z].ContentID[0].GetBody()
            ' Turn the file name into a display title.
            item.Title=strreplace(categories[0].entry[z].Title[0].GetBody(),"_"," ")
            item.Title=strreplace(item.Title,".mp4","")
            item.Description=categories[0].entry[z].Description[0].GetBody()
            item.Rating=categories[0].entry[z].Rating[0].GetBody()
            item.Length=categories[0].entry[z].Length[0].GetBody()
            item.ReleaseDate=categories[0].entry[z].ReleaseDate[0].GetBody()
            item.Directors=categories[0].entry[z].Director[0].GetBody()
            temp=[]
            temp.Push(categories[0].entry[z].Actor1[0].GetBody())
            temp.Push(categories[0].entry[z].Actor2[0].GetBody())
            temp.Push(categories[0].entry[z].Actor3[0].GetBody())
            item.Actors=temp
            item.SDPosterURL=categories[0].entry[z].ThumbSD[0].GetBody()
            item.HDPosterURL=categories[0].entry[z].ThumbHD[0].GetBody()
            temp=categories[0].entry[z].StreamURL[0].GetBody()

            ' Relative stream paths get the server prefix; cartoons live in a subfolder.
            addURL=""
            If left(temp,7)<>"http://"
                If Left(temp,10)="Animaniacs" addurl="Cartoons/"
                If Left(temp,9)="Bananaman" addurl="Cartoons/"
                If Left(temp,21)="Battle_of_the_Planets" addurl="Cartoons/"
                If Left(temp,21)="Dungeons_and_Dragons/" addurl="Cartoons/"
                If Left(temp,10)="Gatchaman/" addurl="Cartoons/"
                If Left(temp,11)="Gatchaman 2" addurl="Cartoons/"
                If Left(temp,11)="Gatchaman 3" addurl="Cartoons/"
                If Left(temp,9)="Iron_Man/" addurl="Cartoons/"
                If Left(temp,19)="Pinky_and_the_Brain" addurl="Cartoons/"
                If Left(temp,21)="Pirates_of_Dark_Water" addurl="Cartoons/"
                If Left(temp,6)="Shorts" addurl="Cartoons/"
                If Left(temp,11)="Speed_Racer" addurl="Cartoons/"
                If Left(temp,11)="Thundercats" addurl="Cartoons/"
                If Left(temp,6)="X-Men/" addurl="Cartoons/"
                ' Fixed: the original had a stray "." after serverprefix (syntax error).
                item.url="http://"+m.global.serverprefix+"/root/DVD/"+addURL+temp
            Else
                item.url=temp
            End If
            item.Genre=categories[0].entry[z].Genre[0].GetBody()
            item.EpisodeNumber=categories[0].entry[z].EpisodeNumber[0].GetBody()
            item.StreamFormat="mp4"
            ' Items without an episode number get bif (trick play) URLs.
            If Len(item.EpisodeNumber)=0
                ServerO=Left(item.url,LEN(item.url)-3)+"bif"
                ServerP=Left(item.url,LEN(item.url)-4)+"-SD.bif"
                ServerN="http://192.168.1.9/"
                item.HDBifURL=strreplace(ServerO,"http://"+m.global.serverprefix+"/root/",ServerN)
                item.SDBifURL=strreplace(ServerP,"http://"+m.global.serverprefix+"/root/",ServerN)
            End If
            item.Album=categories[0].entry[z].BookmarkBrian[0].GetBody() 'Brian's Bookmark As String
            item.Artist=categories[0].entry[z].BookmarkErin[0].GetBody() 'Erin's Bookmark As String
            row.AppendChild(item)
        Next
    End If
    content.AppendChild(row)
    Return content
End Function



JSON parse code:
Function StructureData2(Source As String) As Object
    ' Build the same ContentNode tree (root -> row -> items) from the JSON feed.
    content=createObject("roSGNode","ContentNode")
    ' The redundant roXMLElement creation was removed; the variable name is
    ' kept from the XML version but holds the parsed JSON object.
    contentxml=parseJSON(Source)

    row=CreateObject("RoSGNode","ContentNode")
    row.title=strreplace(contentxml["category name"],"_"," ")
    If contentxml.entry.count()>0
        ' entry[0] is the array of items for the first category.
        cd=contentxml.entry[0]
        y=cd.count()-1
        For z=0 To y
            item=CreateObject("RoSGNode","ContentNode")
            item.ContentID=cd[z].ContentID
            ' Turn the file name into a display title.
            item.Title=strreplace(cd[z].Title,"_"," ")
            item.Title=strreplace(item.Title,".mp4","")
            item.Description=cd[z].Description
            item.Rating=cd[z].Rating
            item.Length=cd[z].Length
            item.ReleaseDate=cd[z].ReleaseDate
            item.Directors=cd[z].Director
            temp=[]
            temp.Push(cd[z].Actor1)
            temp.Push(cd[z].Actor2)
            temp.Push(cd[z].Actor3)
            item.Actors=temp
            item.SDPosterURL=cd[z].ThumbSD
            item.HDPosterURL=cd[z].ThumbHD
            temp=cd[z].StreamURL

            ' Relative stream paths get the server prefix; cartoons live in a subfolder.
            addURL=""
            If left(temp,7)<>"http://"
                If Left(temp,10)="Animaniacs" addurl="Cartoons/"
                If Left(temp,9)="Bananaman" addurl="Cartoons/"
                If Left(temp,21)="Battle_of_the_Planets" addurl="Cartoons/"
                If Left(temp,21)="Dungeons_and_Dragons/" addurl="Cartoons/"
                If Left(temp,10)="Gatchaman/" addurl="Cartoons/"
                If Left(temp,11)="Gatchaman 2" addurl="Cartoons/"
                If Left(temp,11)="Gatchaman 3" addurl="Cartoons/"
                If Left(temp,9)="Iron_Man/" addurl="Cartoons/"
                If Left(temp,19)="Pinky_and_the_Brain" addurl="Cartoons/"
                If Left(temp,21)="Pirates_of_Dark_Water" addurl="Cartoons/"
                If Left(temp,6)="Shorts" addurl="Cartoons/"
                If Left(temp,11)="Speed_Racer" addurl="Cartoons/"
                If Left(temp,11)="Thundercats" addurl="Cartoons/"
                If Left(temp,6)="X-Men/" addurl="Cartoons/"
                ' Fixed: the original had a stray "." after serverprefix (syntax error).
                item.url="http://"+m.global.serverprefix+"/root/DVD/"+addURL+temp
            Else
                item.url=temp
            End If

            item.Genre=cd[z].Genre
            item.EpisodeNumber=cd[z].EpisodeNumber
            item.StreamFormat="mp4"
            item.Album=cd[z].BookmarkBrian 'Brian's Bookmark As String
            item.Artist=cd[z].BookmarkErin 'Erin's Bookmark As String
            row.AppendChild(item)
        Next
    End If
    content.AppendChild(row)
    Return content
End Function

destruk
Level 10

Re: Parsing JSON v XML

And yes, for the JSON version I didn't do the per-item work for the bif files, as we don't use bif files for these.  But that is a single If block which wouldn't have executed on the XML feed for the test content anyway, so it shouldn't have much impact on the results and the comparison should still be valid.
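
For anyone who wants a rough cross-check without the profiler, here is a minimal sketch that times only the raw parse calls with roTimespan. TimeRawParsers and the two tiny sample feeds are made up for illustration; they are not part of the channel code above. It does not measure the ContentNode building that StructureData and StructureData2 also do, so its numbers would not be directly comparable to the profiler totals.

```brightscript
' Hypothetical helper: times raw XML vs JSON parsing of two equivalent feeds.
Function TimeRawParsers(xmlSource As String, jsonSource As String) As Object
    timer = CreateObject("roTimespan")

    ' roXMLElement.Parse() returns true when the document parsed cleanly.
    xml = CreateObject("roXMLElement")
    timer.Mark()
    xmlOk = xml.Parse(xmlSource)
    xmlMs = timer.TotalMilliseconds()

    ' ParseJson() returns invalid when the input is malformed.
    timer.Mark()
    json = ParseJson(jsonSource)
    jsonMs = timer.TotalMilliseconds()

    Return {
        xmlOk: xmlOk,
        jsonOk: json <> invalid,
        xmlMs: xmlMs,
        jsonMs: jsonMs
    }
End Function

Sub Main()
    ' Tiny stand-in feeds (assumed shapes); a real test would load the full feeds.
    xmlFeed = "<feed><category name=""Cartoons""><entry><Title>Speed_Racer</Title></entry></category></feed>"
    jsonFeed = "{""category name"":""Cartoons"",""entry"":[[{""Title"":""Speed_Racer""}]]}"

    result = TimeRawParsers(xmlFeed, jsonFeed)
    Print "XML  ok="; result.xmlOk; " ms="; result.xmlMs
    Print "JSON ok="; result.jsonOk; " ms="; result.jsonMs
End Sub
```

With feeds this small both timings will likely round to 0 ms, so the sketch is only meaningful when fed the real 208-item documents.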
Roku Employee

Re: Parsing JSON v XML

"destruk" wrote:
Parsing 208 content items: XML takes 1.4749, JSON takes 0.3996, so JSON is about 4 times faster.  In actual operation it's the difference between an execution timeout at about 330 items and successfully parsing 500+ items without a timeout (2 seconds vs 5+ seconds).

I have to admit, the profiler numbers don't speak to me (no idea what they show), but I am happy with your summary. Any observations on the file sizes and transfer times?

Timeouts? Where are you getting timeouts... not in a Task, I hope? That should never time out.
destruk
Level 10

Re: Parsing JSON v XML

The second value from the right is the total time consumed. So, using the same feed for both methods:
JSON takes 0.3996
XML takes 1.4749

And no, it's not doing the parsing in a task, because the task execution priority is way too low: it takes 6+ seconds to do this same thing.  The task does the download, and thread 2 does the parsing.  The threads don't time out if you're parsing fewer than 350 items with XML, or fewer than 500 or so items with JSON, so that is sufficient for my needs.

File sizes for the feeds are 300-400 KB.  Transfer times are <1 second for either of those, so both are basically insignificant for my purposes; the profiler numbers were what I was interested in for accurate measurement.  What I mean is: since the download itself is in a task node, which does its job and then throws what it downloaded over to the parsing routine in a different thread, the transfer time and file size aren't what I was measuring.
Roku Employee

Re: Parsing JSON v XML

"destruk" wrote:
And no, it's not doing the parsing in a task, because the task execution priority is way too low: it takes 6+ seconds to do this same thing.  The task does the download, and thread 2 does the parsing.  The threads don't time out if you're parsing fewer than 350 items with XML, or fewer than 500 or so items with JSON, so that is sufficient for my needs.

Thanks for the other info.
I am at a loss on this though - what do you mean by "thread 2" or "throws what it downloaded over to the parsing routine in a different thread"? How can you create another thread?

From what i know, you have 3 kinds of threads:

  • one main() thread

  • one render thread (where you shouldn't be parsing anything, and I thought these functions were blocked there)

  • task threads

So if my "math" is right, you can only parse XML/JSON in the main thread or a task thread, and neither of these should be timing out. So there is something I am getting wrong. What is it?
destruk
Level 10

Re: Parsing JSON v XML

The Roku profiler page breaks everything down into multiple threads.
If you look at it, the threads are numbered in the left column:
https://github.com/rokudev/docs/blob/ma ... ctober2016

You see it has Thread Main, Thread 6, Thread 16, Thread 9, Thread, etc etc etc.
More than 3.

If your function takes more than 3 seconds in the main thread, you get an execution timeout and it crashes the channel.  For the task thread it is much slower but shouldn't have an execution timeout 'feature'.
destruk
Level 10

Re: Parsing JSON v XML

I would ask that if you are going to go in and chop out parsing options from the main render thread, then you're going to have to make the task thread priority and timeslice bigger.  Waiting around for 6+ seconds to parse a list is a stupid thing to enforce when the render thread isn't doing anything but drawing a progress animation on the screen waiting around for the parse to finish from the super slow task thread.
Roku Employee

Re: Parsing JSON v XML

"destruk" wrote:
On the profiler page for Roku [...] You see it has Thread Main, Thread 6, Thread 16, Thread 9, Thread, etc etc etc.
More than 3.

The player may start its own threads for internal purposes, but as app developers we deal with either the main thread, the render thread, or one of the task threads. So my question was in which of these your channel gets killed for timeouts; I know only of the render thread doing that. Please be very specific; short example code is appreciated.

"destruk" wrote:
If your function takes more than 3 seconds in the main thread, you get an execution timeout and it crashes the channel.  For the task thread it is much slower but shouldn't have an execution timeout 'feature'.

I would ask that if you are going to go in and chop out parsing options from the main render thread, then you're going to have to make the task thread priority and timeslice bigger.  Waiting around for 6+ seconds to parse a list is a stupid thing to enforce when the render thread isn't doing anything but drawing a progress animation on the screen waiting around for the parse to finish from the super slow task thread.

My understanding is that this shouldn't be happening. Can you provide details: firmware (7.6? is it a regression from 7.5?), player model, etc.

PS. I just checked with a person in the know, and there are no watchdogs/timeouts on the main and task threads; your app should not be kicked for that. The render thread does time out if blocked for >3 sec side-loaded (or 10 sec in production), so don't do that :). Think of it as a hot potato: don't grab and hold it! It has a job to do, churning out 30-60 frames per second; compared to a single frame's time budget, a 3 sec timeout is roughly 90-180x longer.
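
To illustrate keeping both the download and the parse off the render thread, here is a minimal sketch of a Task-based loader. Everything named here is an assumption for illustration: the "FeedTask" component, its "feedUrl" and "content" interface fields (which would be declared in the component's XML), the "rowList" node ID, and the placeholder URL. The JSON shape is the one StructureData2 reads. It is not the channel's actual code, and it says nothing about how fast a task thread runs on a given player.

```brightscript
' --- FeedTask.brs: script for a hypothetical Task component "FeedTask" ---
' Its XML is assumed to extend Task and declare two interface fields:
'   feedUrl (string) and content (node).
Sub Init()
    m.top.functionName = "LoadFeed"   ' function executed on the task thread
End Sub

Sub LoadFeed()
    ' Blocking network I/O is acceptable here because this is a task thread.
    transfer = CreateObject("roUrlTransfer")
    transfer.SetUrl(m.top.feedUrl)
    body = transfer.GetToString()

    row = CreateObject("roSGNode", "ContentNode")
    feed = ParseJson(body)            ' invalid if the body is not valid JSON
    If feed <> invalid
        If feed.entry <> invalid
            If feed.entry.Count() > 0
                For Each entry In feed.entry[0]
                    item = row.CreateChild("ContentNode")
                    item.title = entry.Title
                    item.url = entry.StreamURL
                End For
            End If
        End If
    End If

    content = CreateObject("roSGNode", "ContentNode")
    content.AppendChild(row)
    m.top.content = content           ' fires any observer set by the scene
End Sub

' --- In the scene's own .brs (render thread) ---
Sub StartFeedLoad()
    m.feedTask = CreateObject("roSGNode", "FeedTask")
    m.feedTask.ObserveField("content", "OnFeedReady")
    m.feedTask.feedUrl = "http://example.com/feed.json"   ' placeholder URL
    m.feedTask.control = "RUN"
End Sub

Sub OnFeedReady()
    ' Runs on the render thread: only hands the finished tree to the UI.
    m.top.FindNode("rowList").content = m.feedTask.content
End Sub
```

The render thread only starts the task and receives the finished content tree, so it is never blocked long enough to hit the timeout described above.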