Posted by McHenry on 06/26/06 10:59
I would like a regex for the following html snippet:
<h2>Field1</h2>
<h3>
$123,456.78 - $987,654.32
</h3>
I would like to capture Field1 and the first numeric value only.
I have created the following that works somewhat:
$pattern='%<h2>(?P<field1>.*?)</h2>
.*?
<h3>.*?\$(?P<field2>.*?)\s.*?</h3>
%six';
However, I would like to improve field2's capture so that it is the first series of
numbers after <h3>, excluding the thousands separator, and stops as soon as a
non-numeric character other than the decimal point is encountered. I cannot
depend on the dollar sign always being present. So in this case I'd capture
123456.78
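For illustration, here is a rough sketch of one way the requirement above might be met. It assumes the thousands separator is always a comma and that it is acceptable to strip it after the match with str_replace, since a single capture group cannot skip characters in the middle of what it captures. The variable names $html and $m are only placeholders for this example.

```php
<?php
// Sample input taken from the snippet in this post.
$html = "<h2>Field1</h2>\n<h3>\n\$123,456.78 - \$987,654.32\n</h3>";

// field1: text inside <h2>.
// [^\d<]* skips anything before the first digit (an optional "$", whitespace)
// without leaving the <h3> element.
// field2: digits and commas, then an optional decimal part; the match stops
// at the first character that is none of these.
$pattern = '%<h2>(?P<field1>.*?)</h2>.*?<h3>[^\d<]*(?P<field2>\d[\d,]*(?:\.\d+)?)%si';

if (preg_match($pattern, $html, $m)) {
    $field1 = trim($m['field1']);
    // Assumption: commas are the only thousands separator to remove.
    $field2 = str_replace(',', '', $m['field2']);
    echo $field1, ' ', $field2, "\n";
}
```

The x modifier is dropped here because the pattern is written on one line; the s modifier is kept so that .*? can cross the line break between </h2> and <h3>.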
Thanks in advance...