Posted by Jim Michaels on 02/16/06 09:07
"Jasen Betts" <jasen@free.net.nz> wrote in message
news:3f72.43ef07a3.18168@clunker.homenet...
> On 2006-02-12, Jasen Betts <jasen@free.net.nz> wrote:
>> On 2006-02-11, MrBiggles <mrbiggles909@yahoo.com> wrote:
>>
>>> I read in a csv file with 60000 lines (20 fields per record), and store
>>> the data to a local array. The file is read in and stored just fine and
>>> pretty quick.
>>> Now, if I try to assign that array to a session variable it chokes.
>>> e.g. create array and load each element with a row from the file (btw,
>>> each row is an array as well, using fgetcsv()). When local array is
>>> loaded, I assign to session var as so:
>>> $_SESSION['mydata'] = $localArray;
>>
>> Here I'm limited to something like 3.2K; much more than that and the
>> session data is lost.
>
> Sorry, no, that was a Mozilla bug not displaying lines with that many
> non-space characters... I've tested sessions up to 2M here; that's
> about the limit for this old hardware...
>
>
> hmm...
>
> I hear (and see in my config file) that PHP gets (by default) 8M of RAM
> to play with.
>
> 60K x 20 is 1200K fields
> There's only room for about 8 bytes per field, four of which are
> probably going to be a pointer of some sort. There's probably another
> four needed for memory allocation control, or a reference counter,
> etc... Looks like you're doomed before you even start.
struct {
    union {
        long lval;
        double dval;
        struct {
            char *val;
            int len;
        } str;
        HashTable *ht;
        zend_object_value obj;
    } value;
    zend_uint refcount;
    zend_uchar type;
    zend_uchar is_ref;
} zval;
So the size is sizeof(double) + sizeof(unsigned int) + sizeof(unsigned
char) + sizeof(unsigned char).
I forgot my C!
I think it's 8+4+1+1 = 14 bytes to start with (before any padding the
compiler adds), plus sizeof(HashTable) when that's all done.
Not much detail there: the internal structure of a HashTable wasn't
given, sorry. But I suspect this struct is only needed once, not repeated
for every array element, because of the refcount. A refcount is only
needed by the parser/interpreter for certain purposes (like language
"references", and mainly keeping track of pointer counts under the hood).
>
> I did some testing:
>
> $a=array();
> for($c=200;$c<60000;$c+=200)
> {
>     for($b=$c-200;$b<$c;$b++)
>     {
>         $a[$b]=array('1','2','3','4','5','6','7','8','9','0',
>                      'a','b','c','d','e','f','g','h','i','j');
>         print('.');
>     }
>     print '<br>' .$c; flush();
> };
>
>
> I get about 5500 rows by 20 elements by 1 character each before I exceed
> that limit.
>
> If you bump the limit up to, say, 400M you might have some success
> (depending on record sizes).
>
> That means your server wants around 40G of RAM to be able to handle 100
> simultaneous requests... (is 100 a reasonable figure??)
>
> ISTM it might be time to re-evaluate the task and either do without the
> huge array, or store and process it using some other language (like a
> database using SQL or a custom app using C, or a combination of the two)
>
> Bye.
> Jasen