Roku Developer Program

mainmanc
Level 7

roRegex question

Hi,

I've only worked with regex in PHP (and very little at that), so I was wondering how I would do this using roRegex:

regex = CreateObject("roRegex", "!<form.*?action="(.*?)"!ms", "i")


I keep getting an error thrown on launch (code &h02). Any suggestions, or resources I could browse, on how I could use this expression with roRegex?

Thanks in advance.
15 Replies
TheEndless
Level 7

Re: roRegex question

Try this...

regex = CreateObject("roRegex", "!<form.*?action=" + Chr(34) + "(.*?)" + Chr(34) + "!ms", "i")

The quotes in your string were causing the syntax error.
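
As a small illustration of the quoting fix: a double quote inside a BrightScript string literal ends the literal, so the character is concatenated in with Chr(34) instead. The variable names q and fragment below are only illustrative.

```brightscript
' Chr(34) returns the double-quote character
q = Chr(34)

' Build the attribute part of the pattern piece by piece
fragment = "action=" + q + "(.*?)" + q

print fragment   ' prints: action="(.*?)"
```

Printing the assembled string before handing it to CreateObject is a quick way to check that the quotes landed where intended.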
My Channels: http://roku.permanence.com - Twitter: @TheEndlessDev
Instant Watch Browser (NetflixIWB), Aquarium Screensaver (AQUARIUM), Clever Clocks Screensaver (CLEVERCLOCKS), iTunes Podcasts (ITPC), My Channels (MYCHANNELS)
mainmanc
Level 7

Re: roRegex question

Many thanks!
mainmanc
Level 7

Re: roRegex question

Hi,

Thanks again for the help. It certainly solved the first issue.

I am still struggling to get roRegex to work for my needs. The expression is based on one I previously used in PHP, so that may be the issue, but I wanted to double check that this is not a bug, because no matter how I approach it, I get no results.

Here is the "string" I am loading in. I am certain the data is reaching the code, as this is a copy and paste from the actual print statement:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//WAPFORUM//DTD XHTML Mobile 1.0//EN" "http://www.wapforum.org/DTD/xhtml-mobile10.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta name="viewport" content="width=device-width,minimum-scale=1.0,maximum-scale=1.0" />
<title>Untitled</title>
</head>
<body style="font-size:small;font-weight:normal;">
<div id="gb" class="body">
<form id="glf"
action=
"https://www.mysite.com/accounts/loginauth"
method="post"
>
<input type="hidden" name="atmpb"
value="665" />
<div align="left">
<font color="red">
</font>
</div>
</body>
</html>


Here is the code responsible for parsing the "string" returned from an AsyncGetToString call:

strResult = msg.GetString()

'printing the string result works just fine
print strResult

match_exp = "!<form.*?action=" + Chr(34) + "(.*?)" + Chr(34) + "!"
regex = CreateObject("roRegex", match_exp, "ims")
arr = regex.Match(strResult)

'always contains invalid at any position
print "ismatch->";arr[1]


The result is always "invalid", though it should be the value of the "action" attribute. This is the case for any other array slot as well.

Admittedly, I am not very well versed in regular expressions. But as mentioned, it does work in my existing PHP script.

Thanks in advance for any assistance.

Cheers.
TheEndless
Level 7

Re: roRegex question

This works for me:

match_exp = "<form.*?action=.*?" + Chr(34) + "(.*?)" + Chr(34)

I removed the PHP-style ! delimiters and added a lazy any-character match (.*?) after the "action=", since the opening quote is on the next line in your string.
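
For reference, a minimal sketch of that pattern applied to a shortened stand-in for the page above. The html string is assembled by hand here (with Chr(10) for the line feeds) rather than fetched, and the variable names are illustrative. The "s" flag is assumed to be what lets "." cross line feeds.

```brightscript
q = Chr(34)
nl = Chr(10)

' Shortened stand-in for the fetched page: the quote follows "action=" on a new line
html = "<form id=" + q + "glf" + q + nl
html = html + "action=" + nl
html = html + q + "https://www.mysite.com/accounts/loginauth" + q + nl
html = html + "method=" + q + "post" + q + nl + ">"

' No delimiters inside the pattern; flags go in the third argument
regex = CreateObject("roRegex", "<form.*?action=.*?" + q + "(.*?)" + q, "ims")
matches = regex.Match(html)

' matches[0] is the whole match, matches[1] the first capture group
if matches.Count() > 1 then
    print matches[1]
else
    print "no match"
end if
```

With this input the capture group should hold the action URL. Checking Count() before indexing avoids reading invalid when nothing matched.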
mainmanc
Level 7

Re: roRegex question

Ahh ok. So I assume it needed the extra .*? because of the line feeds in the string?

You've been a great help. I hope to offer the same to others as well.

Cheers.
RokuMarkn
Level 7

Re: roRegex question

You might want to make it match only spaces, carriage returns and line feeds, rather than any characters, between the = and the quote. Otherwise you might end up matching a bunch of unrelated text between the = and the quote, which isn't what you want.
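
A sketch of that whitespace-only variant, assuming the PCRE class \s (spaces, tabs, carriage returns, line feeds) is what is wanted between the = and the quote. BrightScript string literals do not treat the backslash as an escape, so "\s" reaches the regex engine unchanged. The sample string and variable names are illustrative.

```brightscript
q = Chr(34)
nl = Chr(10)

' Stand-in input: only a line feed separates "action=" from the opening quote
html = "<form id=" + q + "glf" + q + nl + "action=" + nl + q + "https://www.mysite.com/accounts/loginauth" + q

' \s* allows only whitespace after the equals sign, nothing else
regex = CreateObject("roRegex", "<form.*?action=\s*" + q + "(.*?)" + q, "is")
matches = regex.Match(html)

if matches.Count() > 1 then
    print matches[1]
end if
```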

Also, I haven't tried this but it doesn't seem like the question marks should be necessary. ".*" already matches zero or more characters, so the following question mark doesn't do anything.

--Mark
mainmanc
Level 7

Re: roRegex question

Thank you for the tips. Again, I am not very familiar with regex (but learning quickly now) so every bit helps.

Since we are on the subject of roRegex, I was trying to figure out an approach to emulating PHP's preg_match_all.

I was hoping someone might point me in the right direction. I don't mind trying to figure it out if there isn't a quick answer, but I'm certainly scratching my head on the best starting point. I am assuming a mixture of Parse and Match, with possibly an Array to "tear" pieces of the string out of. I'll certainly share whatever I find, if I can at least get headed in the right direction.

Thanks again! I'm hoping that this information might be useful in the future to others, as well as myself.

Cheers.
TheEndless
Level 7

Re: roRegex question

"RokuMarkn" wrote:
".*" already matches zero or more characters, so the following question mark doesn't do anything.

The question mark makes the match "lazy", so it doesn't go crazy and keep matching characters past the first double quote it sees. It may or may not be necessary in this case, but it seems safer.
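
A small sketch of the difference, using a made-up one-line input with two quoted values. The variable names are illustrative.

```brightscript
q = Chr(34)

' Two quoted values on one line
text = "action=" + q + "first" + q + " method=" + q + "post" + q

lazyRegex = CreateObject("roRegex", q + "(.*?)" + q, "")
greedyRegex = CreateObject("roRegex", q + "(.*)" + q, "")

lazyMatch = lazyRegex.Match(text)
greedyMatch = greedyRegex.Match(text)

print lazyMatch[1]     ' first
print greedyMatch[1]   ' first" method="post
```

The lazy form stops at the first closing quote, while the greedy form runs to the last quote on the line.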
TheEndless
Level 7

Re: roRegex question

"mainmanc" wrote:
Since we are on the subject of roRegex, I was trying to figure out an approach to emulating PHP's preg_match_all.

Assuming you just want to capture every matched value, there may be a simpler way, but this is the brute force method I used... this does modify the original string, so you may want to use a copy instead.

values = []
matches = regex.Match( response )
iLoop = 0
While matches.Count() > 1
    values.Push( matches[ 1 ] )

    ' remove this instance, so we can get the next match
    response = regex.Replace( response, "" )
    matches = regex.Match( response )

    ' if we've looped more than 500 times, then we're
    ' probably stuck, so exit
    iLoop = iLoop + 1
    If iLoop > 500 Then
        Exit While
    End If
End While
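
If modifying the original string is a concern, one possible variation is to work on a copy and skip past each match instead of replacing it. This is only a sketch: CollectMatches is a hypothetical helper name, and it assumes the text of matches[0] first occurs at the position where the regex matched, which may not hold for patterns using anchors or lookbehind.

```brightscript
' Hypothetical helper: collects capture group 1 of every match without
' altering the caller's string
function CollectMatches(regex as Object, text as String) as Object
    values = []
    remaining = text
    iLoop = 0
    matches = regex.Match(remaining)

    ' Same 500-iteration safety limit as the loop above
    while matches.Count() > 1 and iLoop < 500
        values.Push(matches[1])

        ' Locate the full match and continue searching after it
        pos = Instr(1, remaining, matches[0])
        if pos = 0 or Len(matches[0]) = 0 then exit while
        remaining = Mid(remaining, pos + Len(matches[0]))

        matches = regex.Match(remaining)
        iLoop = iLoop + 1
    end while

    return values
end function
```

It would be called as values = CollectMatches(regex, response). Newer Roku OS versions also document a MatchAll method on roRegex, which may make a loop like this unnecessary; check the reference for your target firmware before relying on it.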