Posted by Tom on 04/27/07 17:34
On 27 Apr 2007 04:10:50 -0700, Captain Paralytic wrote...
>
>On 26 Apr, 21:50, Christoph Burschka <christoph.bursc...@rwth-aachen.de> wrote:
>> Blagovist wrote:
>> > Virginner wrote:
>> >> "Blagovist" <b...@ovist.com> wrote in message
>> >>news:462f0f0f_3@x-privat.org...
>> >>> Hi.
>> >>> Is there an easy way to "lift" data from HTML tables and enter that
>> >>> into my database? I'm a total novice and so far my searches have
>> >>> yielded little. I see Navicat has an import option, but that appears
>> >>> to be for well structured data like Word, Excel or PDF...
>>
>> >>> Thanks,
>>
>> >>> Blago
>>
>> >> If you've got Excel, then you can "bounce" a table via that (copy /
>> >> paste) then use that to import via Navicat....
>>
>> >> D.
>>
>> > I found something called easywebsave (an IE add-on) that looks
>> > promising. But still a long way from being automated.
>>
>> > Blago
>>
>> The following code relies heavily on your input HTML table being
>> well-formed XHTML:
>>
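>> $text = "<table> [your table here] </table>";
>>
>> /* first, strip the first and last tr tags, plus the first <td> and last </td>.
>>    The "s" modifier lets "." match newlines inside the table. */
>> preg_match('/<tr[^>]*>\s*<td[^>]*>(.+)<\/td>\s*<\/tr>/s', $text, $match);
>> $to_split = $match[1];
>>
>> /* now split wherever a row is closed, then opened. Only whitespace is
>>    allowed between the tags, so the cells of a row are not swallowed. */
>> $rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/', $to_split);
>>
>> $cells = array();
>> foreach ($rows as $row)
>> {
>>     // now split the rows into cells.
>>     $cells[] = preg_split('/<\/td>\s*<td[^>]*>/', $row);
>> }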
>>
>> Your data is now split into a two-dimensional array. Putting it into a
>> database is pretty trivial after that.
>>
>> --
>> cb
>
>But what if the data had individual formatting? The data in one cell
>could have a superscript or be in bold. All those tags would be
>included.
>
Exactly. All kinds of HTML tags can be sandwiched between the TD tags. After
clearing the TD tags, it might be worth checking for <font>, <b>, <a href>,
etc. and clearing those out too.
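
Something along these lines might do for the cleanup step. It is only a rough sketch: the sample $cells array and the clean_cell() helper name are made up for illustration, and it assumes you already have the two-dimensional array from the splitting code quoted above. It leans on PHP's built-in strip_tags() rather than hunting for each tag by name.

```php
<?php
// Hypothetical sample data: the kind of two-dimensional array the
// splitting code would produce, with inline markup left in the cells.
$cells = array(
    array('<b>Name</b>', 'Area (km<sup>2</sup>)'),
    array('<a href="http://example.com/">Foo</a>', '<font color="red">12</font>'),
);

// clean_cell() is an illustrative helper name, not part of any library.
function clean_cell($cell)
{
    // strip_tags() removes every tag but keeps the text between them.
    $text = strip_tags($cell);
    // Turn entities such as &amp; back into plain characters.
    $text = html_entity_decode($text, ENT_QUOTES, 'UTF-8');
    // Collapse runs of whitespace left behind by the markup, then trim.
    return trim(preg_replace('/\s+/', ' ', $text));
}

// Clean every cell of every row.
$clean = array();
foreach ($cells as $row) {
    $clean[] = array_map('clean_cell', $row);
}

print_r($clean);
```

Note that this throws the formatting away entirely, so a superscript like the one above just becomes a plain "2". If that matters for your data, you would need to handle those cells separately before stripping.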
Tom