|
Posted by Norman Peelman on 12/12/06 15:15
"Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
news:Asednf6PSqV6X-PYnZ2dnUVZ_qjinZ2d@comcast.com...
> Norman Peelman wrote:
> > <Cleverbum@hotmail.com> wrote in message
> > news:1165846992.423101.215040@f1g2000cwa.googlegroups.com...
> >
> >>I currently have a list of md5 strings and need to check if a new
> >>string is in that list hundreds of thousands of times. I've found that
> >>the fastest way to do this is to have all the md5's stored in an array
> >>and use the php function in_array().
> >>my only problem now is that populating this array with data from my sql
> >>server is rather slow, I currently use the lines:
> >>
> >>$resone = mysql_query("SELECT * FROM logs_full");
> >>mysql_close();
> >>
> >>while ($row = mysql_fetch_array($resone)) {
> >> $md5array[$md5count]= $row['textmd5'];
> >> $md5count++;
> >>}
> >>
> >>to do this. does anyone have a faster method?
> >>
> >
> >
> > Maybe,
> >
> > I think you are going about this project all wrong:
> >
> > Assumption - your 'list' of md5's is actually the md5's in the database.
> > Assumption - you arecreating md5's from strings in text files.
> > Assumption - you need to check to make sure that the new value isn't
already
> > in the database.
> >
> > Problem - You are reading in the ENTIRE database.
> > Problem - you may be tempted to think in_array() would be faster, but it
has
> > to start at the beginning of the array for each new value every time.
> > Problem - big waste of time and resources. You are doing double work.
> >
> > Solution - Let MySQL do what it was designed to do. Since md5's are
meant to
> > be unique in their own right, simply make your 'textmd5' field the
PRIMARY
> > KEY and it will automatically be indexed. Now only do the operations you
> > actually require:
> >
>
> Actually, md5's are NOT unique. If they were, it would be the most
> impressive compression mechanism known to computers. Imagine - any size
> file compressed to 32 characters!
>
Well, I knew i'd get some flak for that. :) I was assuming they would be
unique enough for the situation.
> > $query = "SELECT textmd5 FROM logs_full WHERE textmd5 = 'search_md5';
> > $resone = mysql_query($query, $dbc);
> > if (!mysql_num_rows($resone))
> > {
> > // no match found
> > // insert new info into database
> > }
> > else
> > {
> > //match found
> > // no need to insert
> > }
> >
> > no arrays being used and I guarantee it will be WAY faster for any size
of
> > database. Now for the real question. Any particular reason you are
creating
> > an md5 database? It's already being done...
> >
> > Norm
>
Norm
--
FREE Avatar hosting at www.easyavatar.com
Navigation:
[Reply to this message]
|