Posted by Christoph Burschka on 04/28/07 12:14
Captain Paralytic wrote:
> On 26 Apr, 21:50, Christoph Burschka <christoph.bursc...@rwth-aachen.de> wrote:
>> Blagovist wrote:
>>> Virginner wrote:
>>>> "Blagovist" <b...@ovist.com> wrote in message
>>>> news:462f0f0f_3@x-privat.org...
>>>>> Hi.
>>>>> Is there an easy way to "lift" data from HTML tables and enter that
>>>>> into my database? I'm a total novice and so far my searches have
>>>>> yielded little. I see Navicat has an import option, but that appears
>>>>> to be for well structured data like Word, Excel or PDF...
>>>>> Thanks,
>>>>> Blago
>>>> If you've got Excel, then you can "bounce" a table via that (copy /
>>>> paste) then use that to import via Navicat....
>>>> D.
>>> I found something called easywebsave (an IE add-on) that looks
>>> promising. But still a long way from being automated.
>>> Blago
>> The following code relies heavily on your input HTML table being well-formed
>> XHTML:
>>
>> $text = "<table> [your table here] </table>";
>>
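>> /* First, strip the opening <tr><td> and the closing </td></tr>.
>>    The s modifier lets . match newlines, so multi-line tables work. */
>> preg_match('/<tr[^>]*>\s*<td[^>]*>(.+)<\/td>\s*<\/tr>/s', $text, $match);
>> $to_split = $match[1];
>>
>> /* Now split wherever a row is closed, then opened. Only whitespace is
>>    allowed between the tags, so no cells get swallowed by the split. */
>> $rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/', $to_split);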
>>
>> foreach ($rows as $row)
>> {
>> // now split the rows into cells.
>> $cells[]=preg_split('/<\/td>.*?<td[^>]*>/',$row);
>>
>> }
>>
>> Your data is now split in a two-dimensional array. Putting it into a database is
>> pretty trivial after that.
>>
>> --
>> cb
>
> But what if that data had individual formatting? The data in one cell
> could have a superscript or be in bold. All those tags would be
> included.
>
Hopefully, that information is in the style attribute of the cell tag (and will
get split away, since <td[^>]*> matches a complete tag with all attributes). But
if there's markup inside the cell, running strip_tags() on each cell will remove
the tags and keep the text.
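
Here is a rough sketch of what I mean, not a tested solution. The sample table
is made up, and the patterns are a variant of the ones quoted above that assumes
only whitespace sits between adjacent tags. It shows a style attribute being
split away with the cell tag, and strip_tags() cleaning up markup inside cells:

```php
<?php
// Hypothetical sample input: one cell has a style attribute,
// two others contain inline markup (<sup> and <b>).
$text = '<table>
<tr><td style="font-weight: bold">Name</td><td>Area (km<sup>2</sup>)</td></tr>
<tr><td><b>Aachen</b></td><td>160.85</td></tr>
</table>';

// Strip the opening <tr><td ...> and the closing </td></tr>.
// The s modifier lets . match across newlines.
preg_match('/<tr[^>]*>\s*<td[^>]*>(.+)<\/td>\s*<\/tr>/s', $text, $match);

// Split wherever a row is closed, then opened.
$rows = preg_split('/<\/td>\s*<\/tr>\s*<tr[^>]*>\s*<td[^>]*>/', $match[1]);

$cells = array();
foreach ($rows as $row) {
    // Split the row into cells; attributes go away with the <td ...> tag.
    $raw = preg_split('/<\/td>\s*<td[^>]*>/', $row);
    // strip_tags() drops tags left inside a cell but keeps their text;
    // trim() removes stray whitespace around the value.
    $cells[] = array_map('trim', array_map('strip_tags', $raw));
}

print_r($cells);
```

Note that strip_tags() only removes the tags themselves, so the superscript in
the sample ends up as plain text ("km2"). If that distinction matters for your
data, you would have to handle those cells before stripping.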
--
cb