Posted by Toby Inkster on 12/12/06 16:29
Erwin Moller wrote:
> You have a certain value that you transform to a md5 and check if it is in your
> db already, right?
No -- he has *hundreds of thousands* of "certain values" that he needs to
transform to an MD5 and check to see if it's in his DB already.
What you are suggesting is (pseudo-code abounds):
$needles = (hundreds of thousands of values);
foreach ($needles as $n)
{
    $nMD5 = md5($n);
    $r = sql_query("SELECT md5text FROM logsfull WHERE md5text='$nMD5';");
    if (sql_fetch_array($r))
        print "$nMD5 exists in database.\n";
    else
        print "$nMD5 does not exist in database.\n";
}
This will involve hundreds of thousands of SQL queries. Say the "logsfull"
table has zero rows (as it well might!), then that is hundreds of
thousands of useless calls to your RDBMS.
What I am suggesting is:
$needles = (hundreds of thousands of values);
// One query: fetch every stored hash, already sorted for binary search.
$r = sql_query("SELECT md5text FROM logsfull ORDER BY md5text;");
$haystack = array();
while (list($straw) = sql_fetch_array($r))
    $haystack[] = $straw;
foreach ($needles as $n)
{
    $nMD5 = md5($n);
    // Search for the hash, not the raw value: the haystack holds MD5s.
    if (bsearch($haystack, $nMD5))
        print "$nMD5 does exist in database\n";
    else
        print "$nMD5 does not exist in database\n";
}
A single call to our poor beleaguered RDBMS, and then an efficient binary
search for each needle. The bottleneck is likely to be the md5() function
here.
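PHP has no built-in bsearch(), so the call in the pseudo-code has to be a helper you write yourself. A minimal sketch of one follows. The function name and signature are hypothetical, and it assumes the haystack is a plain list of strings sorted in the same order strcmp() uses. For lowercase hex MD5 strings that will usually match the ORDER BY result, but it depends on the column's collation, so treat that as an assumption to check.

```php
<?php
// Hypothetical helper: the pseudo-code calls bsearch() without defining it.
// Assumes $haystack is a 0-indexed list of strings sorted in strcmp() order.
function bsearch(array $haystack, string $needle): bool
{
    $lo = 0;
    $hi = count($haystack) - 1;
    while ($lo <= $hi) {
        $mid = intdiv($lo + $hi, 2);
        $cmp = strcmp($haystack[$mid], $needle);
        if ($cmp === 0) {
            return true;        // exact match found
        }
        if ($cmp < 0) {
            $lo = $mid + 1;     // needle sorts after the midpoint
        } else {
            $hi = $mid - 1;     // needle sorts before the midpoint
        }
    }
    return false;               // range exhausted, needle is absent
}

// Stand-in for the sorted md5text column (made-up sample values).
$haystack = array(md5('alpha'), md5('beta'), md5('gamma'));
sort($haystack, SORT_STRING);   // force string comparison, not numeric

var_dump(bsearch($haystack, md5('beta')));   // bool(true)
var_dump(bsearch($haystack, md5('delta')));  // bool(false)
```

Each lookup takes about log2(N) string comparisons, so even a haystack of a million rows needs only around twenty per needle. SORT_STRING is used for the stand-in data so that hashes which happen to look numeric are still compared as strings.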
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact