Forum Discussion

cool_dude
Visitor
13 years ago

Scraping Videos off some random website - Example? Tutorial

Hello Guys,

I played a little bit with the video player today and learned some stuff, but I would be grateful to anyone who can help me with scraping. I want to scrape videos from some public websites into my channel (live streams and/or recorded videos). Does anyone have a tutorial or example code that does what I am looking for? Thanks in advance.

Regards,
Cool_Dude

4 Replies

  • There are a few issues here. First off, Roku can't be seen as supporting piracy, thanks to the DMCA. So they cannot comment to help you if they're aware that you would be using copyrighted material in a manner that isn't approved by the copyright holder. And it isn't possible to give you specific, helpful answers without knowing which specific sites and pages you want to scrape videos from. Nearly every site on the internet is structured differently from every other one, so the specific method of scraping the video will be different too.

    If the sites you want to use provide public domain or otherwise Free content (e.g. content licensed under a Creative Commons license), then you could share the specific sites here and people could potentially give you specific, helpful answers. It's much too broad a question otherwise.

    I know this is probably not the answer you wanted.
  • RokuChris already gave you some pointers (that you responded to) in the other thread you created where you asked the exact same question: viewtopic.php?f=34&t=54174
    roUrlTransfer is the component you'd need to grab the HTML from the site you want to scrape. Extracting the data from that HTML, as gonzotek mentioned, is entirely dependent on the site itself.

    That being said, the Roku is very limited in the video encodings that it supports, particularly when it comes to live streams. Generally speaking, if it plays on an iOS device, it'll play on the Roku in most cases. Otherwise, you may not have much luck.
  • The easiest way I've found to scrape web sites is to do the scraping in a script (e.g. PHP) on a server, extract the data you're interested in and return it as XML in response to a query from the Roku.

    For example, PHP's DOMDocument::loadHTML method can parse most HTML (even if not well-formed) into a DOMDocument object, which you can examine using DOMDocument methods to extract your data. You can then create the XML to send back to your Roku using the XMLWriter class, for example.

    I like this method because the Roku is very good at parsing XML data, and other scripting languages make it easier to parse the HTML. Also, if the web site you are scraping changes the way its pages are laid out, you can make the corresponding change in your PHP script, which takes effect immediately, rather than having to update your Roku channel, which takes some time to propagate to your users.
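
To make the server-side approach above more concrete, here is a minimal sketch of the same idea. It is written in Python rather than PHP, using only the standard library: `html.parser` stands in for DOMDocument and `xml.etree.ElementTree` stands in for XMLWriter. Everything site-specific is an assumption for illustration: the page is assumed to link to its videos with plain `<a href="...">` tags ending in a known extension, and the `<videos>/<video>/<title>/<url>` layout is just a hypothetical format a channel could agree on with the script. A real site will almost certainly need different extraction logic.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin
import xml.etree.ElementTree as ET

# Assumption: the target page links directly to files with these extensions.
VIDEO_EXTENSIONS = (".mp4", ".m4v", ".m3u8")


class VideoLinkParser(HTMLParser):
    """Collects (title, url) pairs from <a> tags whose href looks like a video."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.videos = []
        self._href = None   # URL of the video link currently open, if any
        self._text = []     # text fragments seen inside that link

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href") or ""
            if href.lower().endswith(VIDEO_EXTENSIONS):
                # Resolve relative links against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Fall back to the URL when the link has no visible text.
            title = "".join(self._text).strip() or self._href
            self.videos.append((title, self._href))
            self._href = None


def scrape_to_xml(html, base_url):
    """Return an XML string listing the video links found in the HTML."""
    parser = VideoLinkParser(base_url)
    parser.feed(html)
    parser.close()

    root = ET.Element("videos")  # hypothetical feed layout
    for title, url in parser.videos:
        item = ET.SubElement(root, "video")
        ET.SubElement(item, "title").text = title
        ET.SubElement(item, "url").text = url
    return ET.tostring(root, encoding="unicode")
```

On the server, a small web handler would fetch the page, pass its HTML and URL to `scrape_to_xml`, and return the result to the Roku, which can request it with roUrlTransfer and parse the XML on the device. Note that this sketch skips links with query strings after the extension and links that are never closed; handling those is left out to keep it short.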