Posted by Petr Smith on 10/10/05 11:10
>>I'm writing a template
>>library based on XML. But it's not very efficient to create a new
>>DomDocument, load the XML template, process it and show it on every
>>page hit. XML parsing is not very fast, and because I'm parsing XHTML
>>with entities, all the DTDs are parsed too. I thought about something
>>similar to Java: there I can have a servlet which lives as long as
>>the server lives. It can load and parse the XML only the first time
>>and hand the DOM objects to other servlets.
>>I need something similar in PHP. Can it be done?
>
>
> I think you might want to avoid trying to do it the Java way in PHP.
>
> PHP is shared-nothing by architectural design, not accident, so that you
> can scale up by throwing as much cheap/stock hardware at it as you can
> afford instead of being forced to buy a single bigger hardware box in
> the center for the shared data.
>
> It would probably make a lot more sense to store whatever you use to
> uniquely identify your XML source and the results in a database or
> filesystem, and then compare time-stamps in some simple business logic
> to decide to re-parse or serve from cache.
>
> Yes, that does just foist off the shared-data to the database, or
> file-system -- but those systems are specifically designed to handle
> this task for a long time now with a lot of heavily tested and
> optimized code. PHP and even Java can't really match that level of
> testing/optimization yet simply due to relative ages.
>
> If db and filesystem are "too slow" or you already have too many
> machines running this code-base, you could write your own PHP "XML
> cache server" that takes an XML id and either gets it from the db/file
> cache, or parses the true original, and set up your own "server" for
> this express purpose and really make it scream on speed... That's
> quite a bit of work, though, and for the simplicity of the code
> involved, you may be better off writing it as a C application... Or
> out-sourcing that bit of code to be written with specific timing
> targets for the employee to meet/beat to get their just due $$$.
>
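For reference, the timestamp comparison suggested in the quoted reply could look roughly like the sketch below. It is only an illustration under assumptions: the function name render_cached(), the cache directory layout, and the $process callback (which turns the parsed DOM into the final output string) are all hypothetical, not part of any library.

```php
<?php
// Hypothetical helper: returns the processed result for $templatePath and
// re-parses the template only when it is newer than the cached copy.
function render_cached(string $templatePath, string $cacheDir, callable $process): string
{
    // One cache file per template, named after a hash of its path (assumption).
    $cacheFile = $cacheDir . '/' . md5($templatePath) . '.cache';

    // Serve from cache if it exists and is at least as new as the template.
    if (is_file($cacheFile) && filemtime($cacheFile) >= filemtime($templatePath)) {
        return file_get_contents($cacheFile);
    }

    // Cache miss or stale cache: do the slow parse.
    $doc = new DOMDocument();
    $doc->resolveExternals = true; // needed for XHTML entities, as in the post
    if (!$doc->load($templatePath)) {
        throw new RuntimeException("Cannot parse template: $templatePath");
    }

    // $process is a caller-supplied callback returning the final string.
    $result = $process($doc);
    file_put_contents($cacheFile, $result, LOCK_EX);

    return $result;
}
```

This caches only the final string output, so it helps when the processed result can be reused as-is; it does not keep the DOM tree itself alive between requests.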
Thanks for all the ideas. You are probably right that I'm trying to think
about it the "Java way".
But my main bottleneck is the XML parsing, so I was trying to avoid
it somehow. It is even slower because my XML is not "normal" XML but
an XHTML file, so I need resolveExternals=true to parse files
with XHTML entities ( < etc.)
So I cannot cache my final objects in files or a database, because that
involves some sort of serialization and later (when accessing the cache)
some unserialization (the slow parse part).
That was the reason I thought about caching in memory.
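To make the problem concrete, here is a minimal sketch of what storing and restoring a DOM tree amounts to. The helper names dom_to_string() and dom_from_string() are made up for this example; the point is only that getting the tree back means running the parser again.

```php
<?php
// Hypothetical helpers illustrating the round trip through a cache.

// "Serialization": turn the DOM tree back into an XML string.
function dom_to_string(DOMDocument $doc): string
{
    return $doc->saveXML();
}

// "Unserialization": this is a full XML parse all over again.
function dom_from_string(string $xml): DOMDocument
{
    $doc = new DOMDocument();
    if (!$doc->loadXML($xml)) {
        throw new RuntimeException('Cannot parse cached XML');
    }
    return $doc;
}

$original = new DOMDocument();
$original->loadXML('<page><title>Hello</title></page>');

$stored   = dom_to_string($original);   // what would go into a file or database
$restored = dom_from_string($stored);   // the slow part happens here again
```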
I'm sure I can set up some "XML cache server", but again, how would I
exchange data with it? I cannot move whole object trees between servers
(the parsed XML can't be serialized).
As a last resort I could write a C extension, but then why use PHP? I
could write it all in C, along with my own HTTP server :)
I think there should be some way to have all objects (including PHP
internal ones) stored somewhere in memory and "living" for as long as the
web server lives. That would solve many kinds of problems.
Petr