Posted by The Natural Philosopher on 09/02/07 01:24
The Natural Philosopher wrote:
> Gary L. Burnore wrote:
 >> On Sat, 01 Sep 2007 21:26:46 +0200, "Rik Wasmus"
 >> <luiheidsgoeroe@hotmail.com> wrote:
 >>
 >>> On Sat, 01 Sep 2007 20:39:06 +0200, The Natural Philosopher <a@b.c>
 >>> wrote:
 >>>
 >>>> Rik Wasmus wrote:
 >>>>> On Sat, 01 Sep 2007 13:13:33 +0200, Rik Wasmus
 >>>>> <luiheidsgoeroe@hotmail.com> wrote:
 >>>>>
 >>>>>> On Sat, 01 Sep 2007 11:42:59 +0200, The Natural Philosopher
 >>>>>> <a@b.c>  wrote:
 >>>>>>>>  If you really want a chunked upload check out the user comments
 >>>>>>>> at  <http://nl3.php.net/manual/en/function.readfile.php>
 >>>>>>>>
 >>>>>>> Are you dyslexic, I want to DOWNLOAD. Upload I have done already.
 >>>>>>
 >>>>>> bah, 't was late last night... I meant to say download, allthough
 >>>>>> it  depends wether you think usercentric or usercentric which is
 >>>>>> which :P
 >>>>>  Hmmmz, haven't made a full recovery yet I see, "usercentric or
 >>>>> servercentric"...
 >>>>>
 >>>> its Saturday, Get a bottle of decent spirits and relax. ;-)
 >>>>
 >>>> Anyway I have enough onfo op spec out that part of e jobn. Muy
 >>>> necxt  problem is wheher
 >>> Hehe, taking you own advice? :P
 >>>
 >>>> its more efficient to have 5000 files all called 00001,
 >>>> 00002...05000 in  one directory, or whether to split them up over
 >>>> several..and whether to  keep their names and extensions intact, or
 >>>> just be lazy, knowing the   data base knows what they were called.
 >>> Hmm, depends on the file system, not really an expert there. Jerry
 >>> would  tell you to just serve them up straight from the database, and
 >>> forget  about the filesystem, I'm not so sure :). You can do a little
 >>> testing  offcourse, see what works best.
 >>
 >> The more files you store in one directory, the harder the OS has to
 >> work to list through them.
 > Ah, but the more subdirectories you have in a file system, the more work...
 >
 > ;-)
 >
 > I.e. I think that the case where each file is in its own subdirectory is
 > of similar order to no subdirs at all.
 >
 > I suspect the answer is that for n files, use the pth root of n as the
 > number of subdirs, where p is the depth of the subdirs...but a lot
 > depends on caching algorithms in the directories, AND the way the OS
 > searches them.
 >
 > I don't know what a directory entry is these days..256 bytes? Well,
 > 10,000 of them is only 2.56Mbytes or so. Should be well within cache
 > range. Let's say it has to do maybe 1000 machine cycles for every
 > byte...that's 500K bytes a second searched at 500MHz..4 seconds for a
 > linear search to find the last one. Mm. That's significant.
 >
 > Whereas a two-level hierarchy? 100 dirs in the first, and 100 files
 > in each? 80 milliseconds. Give or take...starting to get into disk
 > latency times anyway..
 >
 > well I have no idea how the OS searches the directory, but it seems to
 > me that a hundred dirs of a hundred files each has to be the way to go.
 >
 > Shouldn't be that hard either: just use an autoincrement on each tag
 > record, and every time it gets to modulo one hundred create a new
 > directory.
 >
 >
 >
 >
Further info: after some research, EXT3 filesystems can have 'dir_index'
set, which adds a hashed B-tree into the equation. Mine appear to have
this..i.e. a lot of what splitting the big directory into smaller ones
would do is already done by this mechanism.
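
For reference, splitting on the autoincrement id modulo one hundred would
only be a few lines anyway - something like this rough, untested sketch
(the base path, padding and function name are just made up here):

<?php
// Rough sketch: map an autoincrement id to a bucketed path, 100 files per
// directory. Base path and zero-padding are arbitrary choices for illustration.
function bucketed_path($base, $id)
{
    $dir = $base . '/' . sprintf('%03d', (int)($id / 100));   // e.g. /var/files/049
    if (!is_dir($dir)) {
        mkdir($dir, 0755, true);                               // create the bucket on first use
    }
    return $dir . '/' . sprintf('%05d', $id);                  // e.g. /var/files/049/04987
}

// e.g. move_uploaded_file($_FILES['userfile']['tmp_name'],
//                         bucketed_path('/var/files', $id));
?>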
 
I suspect this means that retrieval time using the database itself, and
an ext3 hashed file structure, are pretty similar.
 
Ergo the answers would seem to be:

- On a hashed, treed filesystem there is little advantage to going to
subdirs, provided you KNOW THE FILE YOU WANT.
- Access to a database LONGBLOB is probably no faster, and not much slower.
- However, the real issue is the file storage time into the database
itself, and how PHP might cope with super-large 'strings'.
 
 Here is an interesting fragment..complete with at least one error..they
 don't delete the temporary file..
 
 http://www.php-mysql-tutorial.com/php-mysql-upload.php
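
The basic pattern is easy enough to redo with the cleanup in - something
like this untested sketch (table and column names invented, and it assumes
an open mysql_connect() link). Strictly, PHP removes the upload temp file
itself when the request ends, but an explicit unlink() keeps things tidy:

<?php
// Sketch: read an uploaded file into a BLOB column, then clean up.
// The 'upload' table and its columns are invented; assumes an open mysql link.
if (isset($_FILES['userfile']) && is_uploaded_file($_FILES['userfile']['tmp_name'])) {
    $tmpName  = $_FILES['userfile']['tmp_name'];
    $fileName = mysql_real_escape_string($_FILES['userfile']['name']);

    $data = file_get_contents($tmpName);        // whole file in one string...
    $data = mysql_real_escape_string($data);    // ...then a second, escaped copy in RAM

    mysql_query("INSERT INTO upload (name, file_contents)
                 VALUES ('$fileName', '$data')")
        or die('Insert failed: ' . mysql_error());

    unlink($tmpName);                           // the step the tutorial leaves out
}
?>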
 
I shudder to think how long adding slashes to a 60Mbyte binary image
might take, or indeed how much virtual memory holding it in a 60Mbyte
PHP string might use..
 
Anyone ever put BLOB objects that large into a MySQL database?
 
 I guess it's test time...it WOULD be kinda nice to store it all in the
 database.
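
A quick and dirty test might settle it - time a plain filesystem read
against a SELECT of the same image. A rough sketch (path, table name and id
are placeholders; assumes the image is already in both places and an open
mysql link):

<?php
// Rough timing: filesystem read vs pulling the same image out of a LONGBLOB.
// The path, table name and id are placeholders; assumes an open mysql link.
$start    = microtime(true);
$fromDisk = file_get_contents('/var/files/00001');           // placeholder path
$diskTime = microtime(true) - $start;

$start  = microtime(true);
$res    = mysql_query('SELECT file_contents FROM upload WHERE id = 1');
$row    = mysql_fetch_assoc($res);
$dbTime = microtime(true) - $start;

printf("disk: %.3fs (%d bytes)  db: %.3fs (%d bytes)\n",
       $diskTime, strlen($fromDisk), $dbTime, strlen($row['file_contents']));
?>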
 
Let's see..what's an A0 image..48x24 inches - at 24-bit 300dpi? Hmm.
Uncompressed, about 60Mbyte..
 
 I guess I could set a limit, and force them to compress down to 10Mbyte
 or so..
 
PHP appears to set no intrinsic limit on string sizes (though memory_limit
in php.ini would need raising well past its default to hold 60Mbyte plus
the escaped copy), so reading the whole thing into a string is at least
feasible in theory. What worries me is the addslashes bit, all of which is
then removed by the MySQL API..yuk. So instead of a byte-for-byte copy,
there has to be a 'read file into RAM, create new slashed string and
deallocate memory from the first, give to SQL, create de-slashed BLOB in
RAM, write to ISAM file, deallocate memory, do garbage collection'..etc etc..
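
One way to blunt that would be to push the file up in pieces - create the
row with an empty blob, then append chunk by chunk with
UPDATE ... SET col = CONCAT(col, chunk), so neither PHP nor the packet ever
holds the whole 60Mbyte at once. Untested sketch, table/column names as
invented before:

<?php
// Sketch: stream a big file into a LONGBLOB a chunk at a time.
// Assumes an open mysql link, and that the row was created beforehand with
// file_contents = '' (CONCAT against NULL would just give NULL back).
$id    = (int)$id;
$chunk = 1024 * 1024;                      // 1Mbyte per query, under max_allowed_packet

$fp = fopen($tmpfilename, 'rb');           // the uploaded temp file
while (!feof($fp)) {
    $data = mysql_real_escape_string(fread($fp, $chunk));
    mysql_query("UPDATE file_table SET
                 file_contents = CONCAT(file_contents, '$data')
                 WHERE id = $id")
        or die('Chunk failed: ' . mysql_error());
}
fclose($fp);
?>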
 
 What would be better is to write to a BLOB from a file directly.
 
 LOAD DATA might work..not quite sure how..insert a record, and load data
 into just the blob?
 
The MySQL server also has restrictions on the maximum data size it can
accept over the socket (max_allowed_packet)..hmm. Mine's the latest and is
16Mbyte..so that's OK..configurable..upwards..
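
Easy enough to check from PHP before committing to anything, e.g.
(assumes an open mysql link):

<?php
// Check the server's max_allowed_packet before trying a big single insert.
$res = mysql_query("SHOW VARIABLES LIKE 'max_allowed_packet'");
$row = mysql_fetch_assoc($res);
echo 'max_allowed_packet is ', $row['Value'], " bytes\n";

// Raising it on the fly needs the SUPER privilege and only affects new
// connections; otherwise it goes in my.cnf and the server gets restarted:
// mysql_query('SET GLOBAL max_allowed_packet = 67108864');   // 64Mbyte
?>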
 
 Oh..I think this may work and avoid all this addslashing nonsense
 
 $query= ("UPDATE file_table SET
 file_contents=LOAD_FILE('".$tmpfilname."') where id=$id");
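
The caveats being that LOAD_FILE() runs on the server side: the file has to
sit on the MySQL server host, be readable by the mysqld user, be no bigger
than max_allowed_packet, and the connecting account needs the FILE
privilege - otherwise it just quietly returns NULL. The full two-step
version might look like this untested sketch (names invented as before):

<?php
// Sketch: create the row, then pull the file in server-side with LOAD_FILE().
// Table/column names, $origName and $serverPath are placeholders. The file at
// $serverPath must live on the MySQL server machine, be readable by mysqld,
// and fit in max_allowed_packet; the account needs the FILE privilege,
// otherwise LOAD_FILE() returns NULL.
$name = mysql_real_escape_string($origName);
mysql_query("INSERT INTO file_table (name, file_contents) VALUES ('$name', '')")
    or die(mysql_error());
$id = mysql_insert_id();

$path = mysql_real_escape_string($serverPath);
mysql_query("UPDATE file_table SET file_contents = LOAD_FILE('$path') WHERE id = $id")
    or die(mysql_error());
?>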
 
 Any comments?