 Posted by McHenry on 06/25/06 08:05 
"Rik" <luiheidsgoeroe@hotmail.com> wrote in message  
news:aac61$449e3a5b$8259c69c$14679@news2.tudelft.nl... 
> McHenry wrote: 
>>> This works great however when I try to view the contents of the 
>>> array I am only presented with a single element: 
> 
>>> Here is the code I am using: 
>>> 
>>> $pattern='%<div[^>]*?class="overview"[^>]*?>                 #start 
>>> of overview '; 
>>> $pattern=$pattern.'.*? 
> 
> The comment runs from the # to the next newline. As you concatenate
> everything instead of just putting newlines inside the quotes, the
> expression breaks. Why do you concatenate, by the way?
 
I thought this was the way I had to do it... (new to PHP, new to Linux, new 
to many things) 
Now I understand. I thought the comments were part of the regex and couldn't 
understand how it worked... :) 
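
A minimal sketch of the difference Rik describes, assuming the pattern is compiled with the x (extended) modifier. The sample HTML and the group name "body" are made up for illustration and are not from the original pattern:

```php
<?php
// Hypothetical sample input, for illustration only.
$html = '<div class="overview">hello</div>';

// Newlines typed inside the quotes: each # comment stops at its newline,
// so the rest of the pattern is still compiled.
$multiline = '%<div[^>]*?class="overview"[^>]*?>   # start of overview
    (?P<body>.*?)                                  # lazy body
    </div>%xs';

// Pieces concatenated without a newline: everything after the first #
// is one long comment, so the named group is never compiled.
$concatenated = '%<div[^>]*?class="overview"[^>]*?>   # start of overview '
    . '(?P<body>.*?)</div>'
    . '%xs';

preg_match($multiline, $html, $withNewlines);
preg_match($concatenated, $html, $withoutNewlines);

var_dump(isset($withNewlines['body']));    // named group is present
var_dump(isset($withoutNewlines['body'])); // only element 0 exists
```

The second call still matches, but its result array holds only the whole match, which is the "single element" symptom from earlier in the thread.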
 
> 
>> Maybe it should have been obvious, but I missed it. Anyway, I removed the
>> comments from inside the pattern string and it now works.
>>
>> I love the concept of the named match, which makes it very easy to
>> reference in an array. Very powerful.
>> 
>> Within the header I have a field I would like to capture between
>> <h1>field_here</h1>. I suspected I could achieve this by replacing:
>> (?P<header>.*?(?:<div[^>]*?>.*?</div>.*?)*) 
>> 
>> with 
>> 
>> (?P<header>.*?(?:<h2[^>]*?>.*?</h2>.*?)*) 
>> 
>> however, nothing changed when I printed the array value of 'header'.
> 
> That's correct behaviour, (:? means a NON capturing pattern. 
 
Your original solution used (?: and not (:? as here. Is there a difference, 
or is this a typo? 
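
For reference, the two spellings do behave differently in PCRE: (?: opens a non-capturing group, while (:? opens an ordinary capturing group whose first token is an optional literal colon. A small sketch with made-up input:

```php
<?php
// (?:ab) groups without capturing, so only the (c) group is reported.
preg_match('/(?:ab)+(c)/', 'ababc', $nonCapturing);

// (:?ab) is a normal capturing group: an optional ":" followed by "ab".
preg_match('/(:?ab)+(c)/', 'ababc', $capturing);

print_r($nonCapturing); // elements: 'ababc', 'c'
print_r($capturing);    // elements: 'ababc', 'ab', 'c'

// The stray ":?" also lets a literal colon through.
$colonOk = preg_match('/^(:?ab)$/', ':ab');   // 1
$colonNo = preg_match('/^(?:ab)$/', ':ab');   // 0
```

So a pattern written with (:? still matches most of the time, but it adds an extra numbered element to the result array.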
 
> 
> If you only want the <h1> field from the header-div:
> 
>   <div[^>]*?class="header"[^>]*> 
>     .*?(:?<div[^>]*>.*?</div>.*?)*? 
>     <h1>(?P<header>.*?)</h1> 
>     .*?(:?<div[^>]*>.*?</div>.*?)*? 
>   </div> 
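
A sketch of how the pattern above might be run from PHP. It assumes the x and s modifiers, writes the groups as (?: rather than (:?, and uses an invented HTML sample, so treat it as an illustration rather than the exact code from the thread:

```php
<?php
// Hypothetical header markup, for illustration only.
$html = <<<'HTML'
<div class="header">
  <div class="logo">logo</div>
  <h1>My Title</h1>
  <div class="nav">nav</div>
</div>
HTML;

// x: whitespace and # comments in the pattern are ignored.
// s: the dot also matches newlines.
$pattern = '%<div[^>]*?class="header"[^>]*>
    .*?(?:<div[^>]*>.*?</div>.*?)*?   # skip nested divs before the h1
    <h1>(?P<header>.*?)</h1>          # capture only the h1 text
    .*?(?:<div[^>]*>.*?</div>.*?)*?   # skip nested divs after the h1
  </div>%xs';

if (preg_match($pattern, $html, $match)) {
    echo $match['header'], "\n"; // prints "My Title"
}
```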
 
Why do you use a ? after a *? I would have thought the usage of these would 
be mutually exclusive. For example, my understanding of 
<div[^>]*?class="header"[^>]*> is: 
 
match the pattern <div 
match any character other than > 
match 0 or more of the previous expression 
match 0 or 1 of the previous expression 
match the pattern class="header" 
match any character other than > 
match 0 or more of the previous expression 
match the pattern > 
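
For what it's worth, in PCRE a ? placed directly after * does not mean "0 or 1 of the previous expression". It makes the * lazy, so it matches as few characters as possible instead of as many as possible. A short sketch with made-up input:

```php
<?php
// Hypothetical input with two <b> elements.
$html = '<b>one</b> and <b>two</b>';

// Greedy: .* grabs as much as it can, then backs off to the LAST </b>.
preg_match('%<b>(.*)</b>%', $html, $greedy);

// Lazy: .*? grabs as little as it can, stopping at the FIRST </b>.
preg_match('%<b>(.*?)</b>%', $html, $lazy);

echo $greedy[1], "\n"; // one</b> and <b>two
echo $lazy[1], "\n";   // one
```

Read this way, [^>]*? in the pattern means "as few non-> characters as needed before class="header" can match".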
 
I appreciate your assistance... 
 
> 
> 
> If you want the whole header-div and the h2-field again in a separate div:
>   <div[^>]*?class="header"[^>]*> 
>     (?P<header>.*?(:?<div[^>]*>.*?</div>.*?)*? 
>     <h1>(?P<h1>.*?)</h1> 
>     .*?(:?<div[^>]*>.*?</div>.*?)*?) 
>   </div> 
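
A sketch of the nested variant, again assuming the x and s modifiers, (?: for the non-capturing groups, and an invented HTML sample. The point is only that a named group inside another named group is reported separately:

```php
<?php
// Hypothetical header markup, for illustration only.
$html = <<<'HTML'
<div class="header">
  <div class="logo">logo</div>
  <h1>My Title</h1>
</div>
HTML;

$pattern = '%<div[^>]*?class="header"[^>]*>
    (?P<header>.*?(?:<div[^>]*>.*?</div>.*?)*?   # outer group: whole header body
    <h1>(?P<h1>.*?)</h1>                         # inner group: just the h1 text
    .*?(?:<div[^>]*>.*?</div>.*?)*?)
  </div>%xs';

preg_match($pattern, $html, $match);

echo $match['h1'], "\n";     // the h1 text only
echo $match['header'], "\n"; // the header body, including the <h1> element
```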
> 
> Grtz, 
> --  
> Rik Wasmus 
> 
>
 
  