Posted by Erwin Moller on 10/26/07 09:31
Steve wrote:
> "Erwin Moller"
> <Since_humans_read_this_I_am_spammed_too_much@spamyourself.com> wrote in
> message news:47205dcb$0$234$e4fe514c@news.xs4all.nl...
>> Erwin Moller wrote:
>>> Steve wrote:
>>> So where gives
>>> /a[^b]*b/
>>> gives a different result than
>>> /a[^b]*?/
>>>
>> should be:
>> Can you give an example where
>> /a[^b]*b/
>> differs from:
>> /a[^b]*?b/
>> of course.
>>
>> Sloppy typing. Still the same coffee problem. ;-)
>
> ahhh...the damned coffee. :)
>
> it is interpreted differently in different engines. in preg, however, what
> you think you're saying is NOT what you're saying.
>
> 'aabb' may be a string. your pattern should return three matches. 1) aab, 2)
> ab and 3) aabb. this is because of greed inherent in your statement - which
> is what the op wanted to know about anyway. using this:
>
> /a[^b]*?b/
>
> keeps the greed at bay. essentially, find an 'a' and any character until you
> hit ONE 'b'. so, the above would have two matches...'aab' and 'ab'. it's all
> about setting the marker in the preg engine. from that spot, the next set of
> matching will begin. you're throwing yours down the street, when all you
> needed to do was slide the mug down the bar counter.
>
> no big deal? try it with preg_replace. ;^)
Hi Steve,
Sorry, I hope you can keep this regex class running a little while
longer, because I still don't get it.
I DID try it in preg_replace, and got exactly the result I expected.
Let's look at your example: aabb
You talk about 3 matches, being:
1) aab
2) ab
3) aabb
I do not get that.
Look at the following example. It uses my own homebrew version of a
nongreedy *, namely [^b]*b
$str="aabb";
$str=preg_replace("/a[^b]*b/", "peter", $str);
echo htmlentities($str);
That produces:
peterb
as expected (by me at least), because the * effectively behaves nongreedy:
[^b]* can never run past the first b.
As far as I can tell, matching against the string proceeds as follows
(based on what I learned from the book 'Mastering Regular Expressions'):
- a matches right away
- next another a: matches with the non-b class [^b] (one time)
- Third character is a b
This DOESN'T match the non-b class, so the engine moves on to see whether
the character fits the next item in the pattern, a plain b in this example.
It matches (of course). So now we have a match (aab) and that gets
replaced by 'peter'.
- Then comes the last 'b', which doesn't match /a[^b]*b/ (no 'a' to start from)
- end of string, and replacements.
So the result we have now is: peterb.
So I don't see your point about the 3 matches (aab,ab,aabb).
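To make the walk-through above checkable, here is a minimal sketch that lists every match both patterns find in 'aabb'. It only uses preg_match_all and preg_replace; the variable names ($greedy, $lazy, $r1, $r2) are my own picks for illustration.

```php
<?php
// Compare the negated-class pattern with its explicitly lazy variant on 'aabb'.
$str = "aabb";

// preg_match_all puts every full match into index 0 of the result array,
// scanning the string from left to right.
preg_match_all("/a[^b]*b/", $str, $greedy);
preg_match_all("/a[^b]*?b/", $str, $lazy);

print_r($greedy[0]); // one match: 'aab'
print_r($lazy[0]);   // the same single match: 'aab'

// With identical matches, preg_replace gives identical results.
$r1 = preg_replace("/a[^b]*b/", "peter", $str);
$r2 = preg_replace("/a[^b]*?b/", "peter", $str);
echo $r1, "\n", $r2, "\n"; // peterb on both lines
```

On this input, at least, I find one match (aab) for either pattern, not three.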
I can imagine my approach is hugely inefficient for other
cases/strings (I am not sure), but it does work OK.
Is that the case maybe? Efficiency?
As far as I can tell the /[^b]*b/ approach creates a nongreedy version
of /.*b/
I must still be missing something.
Could you give me an example where the results differ?
Or tell me WHAT I am missing?
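For what it's worth, the only differences I can produce myself appear when the dot is used instead of the negated class. The sketch below is my own guess at where the confusion might come from, not something taken from Steve's post. It assumes the default modifiers (no /s), and the variable names are made up for the example.

```php
<?php
// Case 1: the dot instead of the negated class. Here greed DOES matter.
$str = "aabb";
preg_match("/a.*b/", $str, $g);    // greedy dot runs on to the last b
preg_match("/a.*?b/", $str, $l);   // lazy dot stops at the first b
preg_match("/a[^b]*b/", $str, $n); // negated class cannot pass a b at all
echo $g[0], " ", $l[0], " ", $n[0], "\n"; // aabb aab aab

// Case 2: a newline between the a and the b. Without the /s modifier
// the dot does not match "\n", but [^b] does.
$multi = "a\nb";
$dotHits   = preg_match("/a.*?b/", $multi);   // 0: no match
$classHits = preg_match("/a[^b]*b/", $multi); // 1: matches "a\nb"
echo $dotHits, " ", $classHits, "\n"; // 0 1
```

So /a.*b/ and /a.*?b/ differ, and /a.*?b/ and /a[^b]*b/ differ once newlines are involved, but I still see no input where /a[^b]*b/ and /a[^b]*?b/ disagree.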
As you can tell, I haven't finished my 'Mastering Regular Expressions' book
yet. ;-)
>
> as for my mood? it's pretty consistent. okham's razor would have it that
> more likely, two days ago, there were several people saying stupid things.
> being consistent, i correct stupid things being said. but, you own your
> perspective. see things how you will.
That's fine. I don't always behave nicely on usenet either. ;-)
Regards,
Erwin