jbrave
15 years ago
Channel Surfer
html parsing with Regex
I'm attempting to use this:
"/<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:\".*?\"|'.*?'|[^'\">\s]+))?)+\s*|\s*)\/?>/i"
which, with the quotes built from chr(34) and chr(39), looks like this in BrightScript:
"/<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:\"+chr(34)+".*?\"+chr(34)+"|"+chr(39)+".*?"+chr(39)+"|[^"+chr(39)+"\"+chr(34)+">\s]+))?)+\s*|\s*)\/?>/i"
to parse html. Now as I understand it, using:
test=createobject("roregex","/<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:\"+chr(34)+".*?\"+chr(34)+"|"+chr(39)+".*?"+chr(39)+"|[^"+chr(39)+"\"+chr(34)+">\s]+))?)+\s*|\s*)\/?>/i","m")
result=test.match(html)
where html is the text of an HTML page, should give me an array of all the tags in the HTML, but I'm getting nothing back.
- Joel
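
A possible explanation, offered as a sketch rather than a confirmed fix: roRegex appears to take the bare pattern, with flags such as "i" passed as the second CreateObject argument. If so, the leading "/" and the trailing "/i" in the string above are treated as literal characters, which the HTML never contains in that position. Also, Match() returns only the first match (the whole match at index 0, followed by its capture groups), not every tag on the page. The helper below, FindAllTags, is a hypothetical name. It assumes Match() returns an empty array when nothing matches, and it walks the string with Instr() and Mid() to collect each tag in turn.

```brightscript
' Sketch only. Assumptions: roRegex wants the pattern without /.../ delimiters,
' flags go in the second CreateObject argument, and Match() returns an empty
' array when there is no match. FindAllTags is a made-up helper name.
function FindAllTags(html as String) as Object
    dq = chr(34)  ' double quote
    sq = chr(39)  ' single quote
    ' Same tag pattern as in the post, minus the leading "/" and trailing "/i".
    pattern = "<\/?\w+((\s+(\w|\w[\w-]*\w)(\s*=\s*(?:" + dq + ".*?" + dq + "|" + sq + ".*?" + sq + "|[^" + sq + dq + ">\s]+))?)+\s*|\s*)\/?>"
    re = CreateObject("roRegex", pattern, "i")

    tags = []
    rest = html
    while true
        m = re.Match(rest)               ' first match only; m[0] is the whole tag
        if m.Count() = 0 then exit while
        tags.Push(m[0])
        pos = Instr(1, rest, m[0])       ' 1-based position of that tag in rest
        rest = Mid(rest, pos + Len(m[0]))  ' continue searching after the tag
    end while
    return tags
end function
```

With this in place, tags = FindAllTags(html) followed by a "for each t in tags" loop would print the tags one by one. Newer firmware may also offer a MatchAll() method on roRegex, which could replace the loop where it is available.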