Posted by gosha bine on 09/07/07 14:03
On 07.09.2007 15:29 Rik Wasmus wrote:
> On Fri, 07 Sep 2007 15:07:52 +0200, gosha bine <stereofrog@gmail.com>
> wrote:
>
>> On 07.09.2007 14:55 Rik Wasmus wrote:
>>> On Fri, 07 Sep 2007 11:47:58 +0200, gosha bine <stereofrog@gmail.com>
>>> wrote:
>>>
>>>> On 07.09.2007 09:56 Rik Wasmus wrote:
>>>>
>>>>> ([^\2]+) #match one or more characters in match 3 that are
>>>>> NOT in match 2
>>>>
>>>> [^\2] doesn't mean "negate group 2" as you and the manual people
>>>> seem to think. It means "any character except that with ascii code 2".
>>> Hmmz, a quick check indicates you're right, mea culpa.
>>> The manual is quite confusing at this point though:
>>> "Inside a character class, or if the decimal number is greater than 9
>>> and there have not been that many capturing subpatterns, PCRE
>>> re-reads up to three octal digits following the backslash, and
>>> generates a single byte from the least significant 8 bits of the
>>> value. Any subsequent digits stand for themselves. For example:
>>> ....
>>> \7
>>> is always a back reference
>>> \11
>>> might be a back reference, or another way of writing a tab"
>>
>> Well, it's clear enough: "Inside a character class..."
>
> Yes, it states "inside a character class IF THE NUMBER IS GREATER THAN 9"
> And continues on saying that inside a character class \7 should still be
> a backreference..
No, just read it again. "Inside a character class OR if the decimal
number etc". Can everybody see OR? ;)))
The wording is unambiguous, but I agree it might be confusing.
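
For anyone who wants to check this themselves, here is a minimal sketch. The subjects and variable names are made up for illustration. If [^\2] meant "not what group 2 captured", the first subject could not match, because everything after "ab" is more of group 2's "b".

```php
<?php
// Inside a character class, \2 is read as an octal escape (the byte with
// code 2), not as a reference to capture group 2.
// '\\2' in the PHP literal reaches PCRE as \2.
$pattern = '/(a)(b)([^\\2]+)/';

// Group 2 captures "b", yet group 3 happily consumes more "b" characters.
$matchesB = preg_match($pattern, 'abbb', $m);
$groupThree = $matchesB ? $m[3] : null;

// The only character the class rejects is the byte with code 2.
$matchesCtrl = preg_match($pattern, "ab\x02");
```

With these inputs the first call should match with "bb" in group 3, and the second should not match at all.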
>
>>> According to this, I'd expect it to be a back reference. Which
>>> brings me to the question: what is the way to get a backreference
>>> into a negated character class, if there is one?
>>>
>>
>> Character classes are... hm... classes of _characters_, there's no way
>> to put references (which are _strings_) there.
>
> Although single characters can, and that's all we're after: we're not
> matching 'a specific string', just 'not any collection of characters',
> but I see your point. It should be done with something like
> '/=(\'|").*?\1/', although escaped (by \) quoting characters require
> some more care (always 'ignore' a single character after '\')
I'd use a character class for the quotes: (\\w+)=([\'"])(.*?)\\2
(the quote is captured by the second group, so the backreference is \2).
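
A rough sketch of how that could be used. The function name extract_attributes and the sample input are hypothetical, and the backreference is \2 because the quote class is the second group here (name is group 1, value is group 3). It does not handle backslash-escaped quotes inside the value.

```php
<?php
// Hypothetical helper (not from the thread): collect name => value pairs
// from attribute-like text such as  title="it's" alt='say "hi"'.
function extract_attributes(string $text): array
{
    // Group 1: name, group 2: opening quote, group 3: value (lazy).
    // '\\2' in the PHP literal reaches PCRE as \2: "the same quote again".
    $pattern = '/(\\w+)=([\'"])(.*?)\\2/';

    $result = [];
    if (preg_match_all($pattern, $text, $matches, PREG_SET_ORDER)) {
        foreach ($matches as $m) {
            $result[$m[1]] = $m[3];
        }
    }
    return $result;
}

// A value may contain the *other* kind of quote without ending the match.
$attrs = extract_attributes('title="it\'s" alt=\'say "hi"\'');
```

Because the closing quote must equal whatever group 2 captured, a single quote inside a double-quoted value (and vice versa) is left alone.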
As for escaping, it's practical not to rely on (fuzzy) PHP string-escaping
rules and to double every PCRE-specific backslash.
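
To illustrate why doubling is the safer habit, a small sketch (variable names are just for the example):

```php
<?php
// In a single-quoted PHP string only \\ and \' are escapes, so both
// spellings below yield the same two characters: a backslash and "1".
$single  = '\1';
$doubled = '\\1';

// In a double-quoted string, "\1" is an octal escape: one byte, code 1.
// PCRE would never see a backreference here.
$octal = "\1";

// The doubled form survives a change of quote style unchanged.
$doubledInDouble = "\\1";
```

So the undoubled form only works by accident of the quote style, while the doubled one means the same thing in both.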
--
gosha bine
makrell ~ http://www.tagarga.com/blok/makrell
php done right ;) http://code.google.com/p/pihipi