 Posted by McHenry on 06/26/06 12:56 
"Rik" <luiheidsgoeroe@hotmail.com> wrote in message  
news:d309d$449fd595$8259c69c$5932@news1.tudelft.nl... 
> McHenry wrote: 
>> <h2>Field1</h2> 
>> 
>> <h3> 
>> 
>>  $123,456.78 - $987,654.32 
>> 
>>  </h3> 
>> 
>> I would like to capture Field1 and the first numeric value only. 
>> I have created the following that works somewhat: 
>> $pattern='%<h2>(?P<field1>.*?)</h2>
>>           .*?
>>           <h3>.*?\$(?P<field2>.*?)\s.*?</h3> %six';
>>
>> However I would like to improve field2's capture to be the first series
>> of numbers after <h3>, excluding the thousand separator, and stop the
>> capture as soon as a non-numeric character other than the decimal point
>> is encountered. I cannot depend on the dollar sign always being present,
>> so in this case I'd capture 123456.78
>> 
>> Thanks in advance... 
> 
 
Rik, I started a new thread because I had already asked you enough and didn't
want to push your generosity. Having said that, I am glad you responded, thanks.
 
> simple one, capture at least 1 number, followed by numbers, decimal- or
> thousand-separator:
> <h3>.*?(?P<field2>[0-9]+[0-9\.,]*).*?</h3> 
 
I'll stick to this one as the ones below are over my head... 
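
For reference, a minimal sketch of how the simple pattern could be combined with the field1 capture from the original pattern and run through preg_match(). The sample markup is copied from the question. The function name capture_fields() and the str_replace() step that drops the thousand separators are assumptions added for illustration, not something taken from Rik's reply.

```php
<?php
// Hypothetical helper: returns array(field1, field2 without commas), or null if nothing matched.
function capture_fields($html)
{
    // Same structure as the original pattern, with the simple field2 capture.
    // Flags: s = dot matches newlines, i = case-insensitive, x = ignore whitespace in the pattern.
    $pattern = '%<h2>(?P<field1>.*?)</h2>
                .*?
                <h3>.*?(?P<field2>[0-9]+[0-9\.,]*).*?</h3>%six';

    if (preg_match($pattern, $html, $m) !== 1) {
        return null;
    }

    // Drop the thousand separators so "123,456.78" becomes "123456.78".
    return array($m['field1'], str_replace(',', '', $m['field2']));
}

// Sample markup from the question (nowdoc, so the dollar signs stay literal).
$html = <<<'HTML'
<h2>Field1</h2>

<h3>

 $123,456.78 - $987,654.32

 </h3>
HTML;

$result = capture_fields($html);
if ($result !== null) {
    echo $result[0], ' => ', $result[1], "\n";
}
```

Note that [0-9\.,]* would also swallow a comma or full stop that directly follows the number, so a stricter pattern may be needed if the amount can end a sentence.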
 
Why could we not simply have used the following? This is what I tried, and it
didn't work:
<h3>.*?(?P<field2>[0-9\.,]*).*?</h3> 
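
A likely explanation, offered as a sketch rather than a definitive answer: * means "zero or more", so [0-9\.,]* is satisfied by an empty string. The lazy .*? in front of it first tries to consume nothing, the group then matches nothing at that position, and the trailing .*?</h3> picks up the rest, so the overall match succeeds with an empty field2. Requiring [0-9]+ first forces at least one digit, which makes the leading .*? expand until a digit is found. The variable names below are made up for illustration.

```php
<?php
// Single-quoted string, so the dollar signs are not treated as variables.
$html = '<h3> $123,456.78 - $987,654.32 </h3>';

// Variant from the question: the capture group may match zero characters.
$star = '%<h3>.*?(?P<field2>[0-9\.,]*).*?</h3>%si';

// Variant with a mandatory leading digit.
$plus = '%<h3>.*?(?P<field2>[0-9]+[0-9\.,]*).*?</h3>%si';

preg_match($star, $html, $a);
preg_match($plus, $html, $b);

var_dump($a['field2']); // empty string: the group matched right after <h3>
var_dump($b['field2']); // "123,456.78"
```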
 
 
> 
> advanced, will validate currency format: 
> <h3>.*?(?P<field2>(?:[1-9][0-9]{0,2}(?:,[0-9]{3})*|0)(?:\.[0-9]{2})?).*?</h3>
> 
> allow for unexpected html tags/attributes, where we don't want to match the
> '10' in a '<span margin="10px">' for instance:
> <h3[^>]*>(?:[^<]*?(?:<[^>]*>)?)*?(?P<field2>(?:[1-9][0-9]{0,2}(?:,[0-9]{3})*|0)(?:\.[0-9]{2})?).*?</h3>
> 
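
To illustrate the difference between the simple and the tag-aware pattern, here is a small sketch. The sample markup is made up around the '<span margin="10px">' example mentioned above, and the variable names are hypothetical.

```php
<?php
// Made-up markup: the number inside the attribute comes before the amount.
$html = '<h3><span margin="10px">$123,456.78 - $987,654.32</span></h3>';

// Simple pattern: takes the first run of digits anywhere after <h3>.
$simple = '%<h3>.*?(?P<field2>[0-9]+[0-9\.,]*).*?</h3>%si';

// Tag-aware pattern: skips whole tags, so digits inside attributes are never considered.
$tagAware = '%<h3[^>]*>(?:[^<]*?(?:<[^>]*>)?)*?'
          . '(?P<field2>(?:[1-9][0-9]{0,2}(?:,[0-9]{3})*|0)(?:\.[0-9]{2})?)'
          . '.*?</h3>%si';

preg_match($simple, $html, $s);
preg_match($tagAware, $html, $t);

var_dump($s['field2']); // "10", picked up from the margin attribute
var_dump($t['field2']); // "123,456.78"
```

The nested quantifiers in the tag-aware pattern can backtrack heavily on long input that does not match, so it is probably best applied to small fragments.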
> Of course, if you're naming your captures 'field1' & 'field2', you might as
> well not name them at all.
 
This was simply to help illustrate where the fields were in the regex.
 
> 
> Grtz, 
> --  
> Rik Wasmus 
> 
>
 
  