Posted by Rik on 03/01/07 23:20
Damo <cormacdebarra@gmail.com> wrote:
>> First, do you really need the whitespace around (.+)?
> Does the presence/absence of whitespace make a difference... As I said
> I'm new to regex
Yes, whitespace in the pattern is matched literally unless the /x modifier is set.
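
A small sketch of the difference, assuming PHP's preg_match(); the pattern and sample strings here are made up for illustration and are not the ones from the original question:

```php
<?php
// Hypothetical sample pattern, not the original poster's.
// Without /x, the spaces around (.+) are literal and must be present in the subject.
preg_match('/a (.+) b/', 'a foo b', $m1);       // matches, $m1[1] is 'foo'
$noSpaces = preg_match('/a (.+) b/', 'afoob');  // 0: the subject has no spaces

// With /x, whitespace inside the pattern is ignored.
preg_match('/a (.+) b/x', 'afoob', $m2);        // matches, $m2[1] is 'foo'
```

So spaces that are only there for readability can silently stop a pattern from matching.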
>> Second, $document must be a string, not a handle on the file.
> $document is a handle on a URL that I was reading in, so yes, it was
> just a string
>> Third, your regular expression as written is greedy; is this
>> intentional?
> There was no ? at the end of my regular expression it was just (.+)
Yes, so it's greedy. It will match as much as possible, up to the last
place where the rest of the pattern can still match.
Consider:
'<a>foo</a>bar<a>baz</a>foz'
'|<a>.+</a>|' will match '<a>foo</a>bar<a>baz</a>'
'|<a>.+?</a>|' will match '<a>foo</a>'
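
The same comparison as a runnable sketch, assuming preg_match() and the example string above (the variable names are just for illustration):

```php
<?php
$subject = '<a>foo</a>bar<a>baz</a>foz';

// Greedy: .+ grabs as much as it can, then backs off to the last </a>.
preg_match('|<a>.+</a>|', $subject, $greedy);

// Lazy: .+? grabs as little as it can, stopping at the first </a>.
preg_match('|<a>.+?</a>|', $subject, $lazy);

echo $greedy[0], "\n"; // <a>foo</a>bar<a>baz</a>
echo $lazy[0], "\n";   // <a>foo</a>
```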
For a lot of info about regular expressions:
<http://www.regularexpressions.info>
In your case, I'd possibly use:
$regexp = "%<table[^>]*>(.+?)<img%si";
(the /s modifier will make the dot match linebreaks, which is possibly the
breaking point for your regex; the /i modifier makes the match
case-insensitive).
Whether this will work depends heavily on the actual markup, though...
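
A rough sketch of how that pattern behaves, assuming preg_match() and a made-up snippet of markup (the real page will differ, so treat this only as an illustration). In PCRE the s modifier lets the dot match line breaks and the i modifier makes the match case-insensitive:

```php
<?php
// Hypothetical markup, invented for this example.
$document = "<TABLE border=\"1\">\n<tr><td>cell</td></tr>\n<img src=\"x.png\">";

// With s and i: the dot crosses line breaks and <table matches <TABLE.
$regexp = "%<table[^>]*>(.+?)<img%si";
$found = preg_match($regexp, $document, $matches);
$captured = $found ? $matches[1] : null;

// Without s: the dot stops at the first line break, so nothing matches here.
$withoutS = preg_match("%<table[^>]*>(.+?)<img%i", $document);

var_dump($captured);  // everything between the <TABLE ...> tag and <img
var_dump($withoutS);  // int(0)
```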
--
Rik Wasmus