Re: Regular expression example on PHP.net

Posted by Rik Wasmus on 09/07/07 07:56

On Fri, 07 Sep 2007 08:02:07 +0200, Zenofobe <fake_email@fake_domain.com> wrote:

> Howdy folks,
>
> On this page at php.net
> http://www.php.net/features.http-auth
> there's a regular expression in Example 34.2. It's supposed to parse out
> the different values being passed in the header. I know what it's
> supposed to do, so I have a vague idea of what's being done in the RE,
> but I've been having a heck of a time figuring out what each part of the
> RE is actually doing. Here's what I have so far:
>
> preg_match_all('@(\w+)=(?:([\'"])([^\2]+)\2|([^\s,]+))@', $txt, $matches,
> PREG_SET_ORDER);
>
> //'@
> //(\w+) Any word character (letter/digit/_), 1 or more
> //= Equal sign
> //(?: This submatch will not be captured (still available for
> later matching)
> //([\'"]) A single or double quote
> //([^\2]+) Not start of text (STX)?, 1 or more
> //\2|
> //([^\s,]+) Not whitespace or comma, 1 or more
> //)
> //@'

Quick tip for starting with regexes: use the x modifier, so you can
comment the pattern inside the regex itself for later reference.

preg_match_all('@ #starting delimiter
(\w+) #one or more word characters, in match 1
= #a literal equals sign
(?: #start of non-capturing subpattern
([\'"]) #either \' or " in match 2
([^\2]+) #one or more characters in match 3; note that inside [...]
 #\2 is an octal escape (character code 2), not a reference to match 2
\2 #match the same character as matched in 2
| #or
([^\s,]+) #one or more characters that are not whitespace or comma, in match 4
) #end of non-capturing subpattern
@x', $txt, $matches, PREG_SET_ORDER); // @ is the ending delimiter, x the modifier

> I'm unclear as to what the second \2 does,

It's a back reference to whatever was already captured in match 2: the
closing quote has to be the same character as the opening quote.
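
To see the back reference at work, here is a minimal sketch. The helper name quotedValue and the simplified pattern are mine, made up for illustration, not taken from the php.net example. It also shows that the same \2 written inside a character class is read by PCRE as an octal escape (the character with code 2) rather than as a reference.

```php
<?php
// Hypothetical helper: returns the quoted value, or null when the quotes do not pair up.
function quotedValue(string $input): ?string
{
    // Outside a character class, \2 must repeat whatever group 2 captured.
    if (preg_match('/^(\w+)=([\'"])(.*?)\2$/', $input, $m) === 1) {
        return $m[3];
    }
    return null;
}

var_dump(quotedValue('foo="bar"'));   // string(3) "bar"
var_dump(quotedValue("foo='bar'"));   // string(3) "bar"
var_dump(quotedValue('foo="bar\''));  // NULL: the closing ' does not repeat the opening "

// Inside [...] the sequence \2 is an octal escape, so [^\2] excludes only
// the control character with code 2, not the captured quote.
var_dump(preg_match('/^[^\2]+$/', '"\'')); // int(1): both quote characters are accepted
```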

> as well as which parts the OR
> applies to.

The pattern seems to try to capture name/value pairs, where the value is
either quoted with a ' or ", or consists of "characters not whitespace or
comma". So it will match "foo='bar'" & "foo=bar", but in "foo=bar baz"
still only 'bar' will be matched in 4, not 'bar baz'.
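
The three inputs above can be tried out with a short sketch like the one below. The helper parsePairs and the variable names are hypothetical, and the last case is my own addition: because [^\2] does not exclude the quote character, the greedy + can run across several quoted pairs on one line. The variant with +? is one possible adjustment, not the pattern from the php.net page.

```php
<?php
$pattern = '@(\w+)=(?:([\'"])([^\2]+)\2|([^\s,]+))@';

// Hypothetical helper: name => value for each pair the pattern finds.
function parsePairs(string $pattern, string $txt): array
{
    preg_match_all($pattern, $txt, $matches, PREG_SET_ORDER);
    $pairs = [];
    foreach ($matches as $m) {
        // Quoted values land in group 3, unquoted ones in group 4.
        $pairs[$m[1]] = ($m[3] ?? '') !== '' ? $m[3] : ($m[4] ?? '');
    }
    return $pairs;
}

print_r(parsePairs($pattern, "foo='bar'"));   // foo => bar (from group 3)
print_r(parsePairs($pattern, 'foo=bar'));     // foo => bar (from group 4)
print_r(parsePairs($pattern, 'foo=bar baz')); // foo => bar, " baz" is left out

// Two quoted pairs: the greedy [^\2]+ swallows everything up to the LAST quote.
print_r(parsePairs($pattern, 'a="x", b="y"')); // a => x", b="y

// Assumed adjustment: a lazy +? stops at the first closing quote instead.
$lazy = '@(\w+)=(?:([\'"])([^\2]+?)\2|([^\s,]+))@';
print_r(parsePairs($lazy, 'a="x", b="y"'));    // a => x, b => y
```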

> And what are the @s for?

(Almost) any non-alphanumeric character can be used as the 'delimiter' of
the pattern, usually /, but it's @ here. Being able to choose the delimiter
helps you avoid having to escape an often-matched character that would
otherwise double as the delimiter. Any characters following the second
delimiter (x in mine) will be considered modifiers to the pattern.
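
As a small illustration of why the choice of delimiter matters, the sketch below matches the host part of a URL twice, once with / and once with @ as the delimiter. The variable names are made up for the example, and the i modifier is used only to show where modifiers go.

```php
<?php
$url = 'http://www.php.net/features.http-auth';

// With / as the delimiter, every literal / in the pattern must be escaped.
preg_match('/^http:\/\/([^\/]+)\//', $url, $a);

// With @ as the delimiter, the same pattern needs no escaping.
preg_match('@^http://([^/]+)/@', $url, $b);

var_dump($a[1] === $b[1]); // bool(true): both capture "www.php.net"

// Anything after the closing delimiter is a modifier;
// i makes the match case-insensitive.
var_dump(preg_match('@^HTTP://@i', $url)); // int(1)
```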
--
Rik Wasmus

 
