Re: php regular expression doesn't match

Posted by Erwin Moller on 10/26/07 09:31

Steve wrote:
> "Erwin Moller"
> <Since_humans_read_this_I_am_spammed_too_much@spamyourself.com> wrote in
> message news:47205dcb$0$234$e4fe514c@news.xs4all.nl...
>> Erwin Moller wrote:
>>> Steve wrote:
>>> So where gives
>>> /a[^b]*b/
>>> gives a different result than
>>> /a[^b]*?/
>>>
>> should be:
>> Can you give an example where
>> /a[^b]*b/
>> differs from:
>> /a[^b]*?b/
>> of course.
>>
>> Sloppy typing. Still the same coffee problem. ;-)
>
> ahhh...the damned coffee. :)
>
> it is interpreted differently in different engines. in preg, however, what
> you think you're saying is NOT what you're saying.
>
> 'aabb' may be a string. your pattern should return three matches. 1) aab, 2)
> ab and 3) aabb. this is because of greed inherent in your statement - which
> is what the op wanted to know about anyway. using this:
>
> /a[^b]*?b/
>
> keeps the greed at bay. essentially, find an 'a' and any character until you
> hit ONE 'b'. so, the above would have two matches...'aab' and 'ab'. it's all
> about setting the marker in the preg engine. from that spot, the next set of
> matching will begin. you're throwing yours down the street, when all you
> needed to do was slide the mug down the bar counter.
>
> no big deal? try it with preg_replace. ;^)

Hi Steve,

Sorry, I hope you can keep this regex class running a little while
longer, because I still don't get it.
I DID try it in preg_replace, and got exactly the result I expected.

Let's look at your example: aabb
You talk about 3 matches, being:
1) aab
2) ab
3) aabb

I do not get that.
Look at the following example. It uses my own homebrew version of a
nongreedy *, namely [^b]*b

$str="aabb";
$str=preg_replace("/a[^b]*b/", "peter", $str);
echo htmlentities($str);

That produces:
peterb

as expected (by me at least), because the * cannot run past the first b
and so behaves as if it were nongreedy.


As far as I can tell, the input string is matched as follows
(based on what I learned from the book 'Mastering Regular Expressions'):

- The first a matches right away.
- Next comes another a: it matches the non-b class [^b] (one time).
- The third character is a b.
This DOESN'T match the non-b class, so the engine moves on to see whether
it fits the next part of the pattern, which is a plain b in this example.
It matches (of course). So now we have a match (aab), and that gets
replaced by 'peter'.
- Then comes a 'b', which doesn't match /a[^b]*b/ because there is no a.
- End of string, and end of the replacements.

So the result we have now is: peterb.
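
The trace above can be checked with a short sketch. It only assumes the string 'aabb' and the two patterns from this thread; the variable names ($patterns, $results) are made up for illustration. It lists every match together with its offset, so it is visible where the engine picks up again after a match.

```php
<?php
// Sketch: list every match of each pattern in "aabb", with its byte offset.
$str = "aabb";
$patterns = array("/a[^b]*b/", "/a[^b]*?b/");
$results = array();

foreach ($patterns as $pattern) {
    // PREG_OFFSET_CAPTURE turns each match into array(text, offset).
    preg_match_all($pattern, $str, $m, PREG_OFFSET_CAPTURE);
    $results[$pattern] = $m[0];

    foreach ($m[0] as $match) {
        echo $pattern . " matched '" . $match[0] . "' at offset " . $match[1] . "\n";
    }
}
// Prints one line per pattern, each reporting 'aab' at offset 0.
```

Both patterns report a single match, 'aab' at offset 0. The search then resumes at the last 'b', where no 'a' is left to start a new match.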

So I don't see your point about the 3 matches (aab,ab,aabb).

I can imagine my approach is hugely inefficient for other
cases/strings (I am not sure), but it does work OK.
Is that maybe the issue? Efficiency?

As far as I can tell, the /[^b]*b/ approach gives the same match as the
nongreedy /.*?b/

I must still be missing something.
Could you give me an example where the results differ?
Or tell me WHAT I am missing?
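
One way to look for such an example is a small comparison loop. This is only a sketch: the sample strings are invented for illustration and are not from the thread, and a handful of samples proves nothing in general. It compares both the matches and the preg_replace output of the two patterns, then shows for contrast a case where greediness does matter, namely when the class [^b] is swapped for a dot.

```php
<?php
// Illustrative inputs only; they are not taken from the thread.
$samples = array("aabb", "ab", "axxbxxb", "abab", "aaa");
$differing = array();

foreach ($samples as $s) {
    preg_match_all("/a[^b]*b/", $s, $plain);
    preg_match_all("/a[^b]*?b/", $s, $lazy);

    $sameMatches = ($plain[0] === $lazy[0]);
    $sameReplace = (preg_replace("/a[^b]*b/", "peter", $s)
                === preg_replace("/a[^b]*?b/", "peter", $s));

    if (!$sameMatches || !$sameReplace) {
        $differing[] = $s;
    }
}
echo count($differing) . " differing sample(s)\n"; // prints "0 differing sample(s)"

// Contrast: with a dot instead of [^b], the quantifier CAN run past a 'b',
// so the greedy and the nongreedy form part ways.
preg_match("/a.*b/", "aabb", $dotGreedy);
preg_match("/a.*?b/", "aabb", $dotLazy);
echo $dotGreedy[0] . " vs " . $dotLazy[0] . "\n"; // prints "aabb vs aab"
```

Since [^b] can never consume a 'b', both class-based patterns have to stop at the first 'b' after the 'a', which is why the loop finds no difference on these samples.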

As you can tell, I haven't finished my 'Mastering Regular Expressions'
book yet. ;-)


>
> as for my mood? it's pretty consistent. okham's razor would have it that
> more likely, two days ago, there were several people saying stupid things.
> being consistent, i correct stupid things being said. but, you own your
> perspective. see things how you will.

That's fine. I don't always behave nicely on usenet either. ;-)


Regards,
Erwin

 
