Posted by Christoph Burschka on 04/28/07 12:14
Captain Paralytic wrote:
 > On 26 Apr, 21:50, Christoph Burschka <christoph.bursc...@rwth-aachen.de> wrote:
 >> Blagovist wrote:
 >>> Virginner wrote:
 >>>> "Blagovist" <b...@ovist.com> wrote in message
 >>>> news:462f0f0f_3@x-privat.org...
 >>>>> Hi.
 >>>>> Is there an easy way to "lift" data from HTML tables and enter that
 >>>>> into my database? I'm a total novice and so far my searches have
 >>>>> yielded little. I see Navicat has an import option, but that appears
 >>>>> to be for well structured data like Word, Excel or PDF...
 >>>>> Thanks,
 >>>>> Blago
 >>>> If you've got Excel, then you can "bounce" a table via that (copy /
 >>>> paste) then use that to import via Navicat....
 >>>> D.
 >>> I found something called easywebsave (an IE add-on) that looks
 >>> promising. But still a long way from being automated.
 >>> Blago
 >> The following code relies heavily on your input html table being well-formatted
 >> XHTML:
 >>
 >> $text = "<table> [your table here] </table>";
 >>
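 >> /* first, strip the first and last tr tags, along with the first opening
 >>    and last closing td tag. The s modifier lets . match newlines. */
 >> preg_match('/<tr[^>]*>\s*<td[^>]*>(.+)<\/td>\s*<\/tr>/s', $text, $match);
 >> $to_split = $match[1];
 >>
 >> /* now split wherever a row is closed, then opened. Only whitespace is
 >>    allowed between the tags, so the other cells of a row are not eaten. */
 >> $rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/', $to_split);
 >>
 >> $cells = array();
 >> foreach ($rows as $row)
 >> {
 >>   // now split the rows into cells.
 >>   $cells[] = preg_split('/<\/td>.*?<td[^>]*>/s', $row);
 >> }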
 >>
 >> Your data is now split into a two-dimensional array ($cells[row][column]).
 >> Putting it into a database is pretty trivial after that.
 >>
 >> --
 >> cb
 >
 > But what if that data had individual formatting. The data in one cell
 > could have a superscript or be in bold. All those tags would be
 > included.
 >
 
 Hopefully, that information is in the style attribute of the cell tag (and will
 get split away, since <td[^>]*> matches a complete tag with all attributes). But
 if there's markup inside the cell itself, calling strip_tags() on each cell value will remove it.
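
 For what it's worth, here is a rough sketch of that cleanup step. It assumes
 $cells is the two-dimensional array built by the code quoted above; the sample
 values and the $clean variable are made up for illustration only.

```php
<?php
// Hypothetical sample of what $cells might hold after the splitting step.
// The markup inside the cells is invented for this example.
$cells = array(
    array('<b>Name</b>', 'Price<sup>1</sup>'),
    array(' Widget ', '<i>9.99</i>'),
);

// Strip leftover markup, then surrounding whitespace, from every cell.
$clean = array();
foreach ($cells as $row) {
    $clean[] = array_map('trim', array_map('strip_tags', $row));
}

print_r($clean);
```

 Note that strip_tags() only drops the tags and keeps their contents, so the
 superscript in the second cell ends up glued to the text before it.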
 
 --
 cb