Posted by suzanne.boyle on 11/25/07 21:48
Hi
I have an HTML file with headings, each followed by one or more
paragraphs, like this:
<h2>blah blah 1</h2>
<p>more blah blah blah</p>
<h2>blah blah 2</h2>
<p>more blah blah blah</p>
<p>even more blah blah blah</p>
I'd like to extract the text of the headings and the related
paragraphs and insert them into a database. So far I've managed to
get the heading text but can't figure out how to get the associated
paragraphs. I've been using regular expressions; here is the
expression I have so far: <h2[.]*>(.+?)</h2>(.+?)
This gets the text of the headings but not the paragraphs, and now
I'm basically stumped.
Any help would be appreciated.
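
For reference, here is a minimal sketch of one way the expression could be adjusted. It is written in Python only as an assumption, since the post does not name a language, and the names SECTION_RE, PARA_RE and extract_sections are hypothetical. Two things in the original pattern likely get in the way: [.]* matches literal dots rather than tag attributes, and a trailing lazy (.+?) is satisfied by a single character. The sketch instead captures everything up to the next <h2 (or the end of the input) and then pulls the paragraphs out of that chunk.

```python
import re

# Sample input copied from the example in the post.
html = """<h2>blah blah 1</h2>
<p>more blah blah blah</p>
<h2>blah blah 2</h2>
<p>more blah blah blah</p>
<p>even more blah blah blah</p>"""

# [^>]* skips any attributes on the tag. The lookahead ends each section
# at the next <h2 or at the end of the string. DOTALL lets "." cross newlines.
SECTION_RE = re.compile(
    r"<h2\b[^>]*>(.+?)</h2>(.*?)(?=<h2\b|\Z)",
    re.DOTALL | re.IGNORECASE,
)
# Matches one <p>...</p> pair inside a section body.
PARA_RE = re.compile(r"<p\b[^>]*>(.*?)</p>", re.DOTALL | re.IGNORECASE)


def extract_sections(text):
    """Return a list of (heading, [paragraphs]) tuples, in document order."""
    sections = []
    for heading, body in SECTION_RE.findall(text):
        paragraphs = [p.strip() for p in PARA_RE.findall(body)]
        sections.append((heading.strip(), paragraphs))
    return sections


sections = extract_sections(html)
for heading, paragraphs in sections:
    print(heading, paragraphs)
```

Each tuple could then be passed to whatever database insert is in use. This only holds up for simple, regular markup like the sample; nested tags or unclosed paragraphs would probably call for a real HTML parser instead of regular expressions.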