Posted by petersprc on 09/28/07 12:01
Based on your description, I would assign a canonical name (punctuation
stripped, words sorted) to each product in both tables and then match
on that.
For items not matched in this step, you can try using the metaphone
algorithm on the canonical name. This can help detect some types of
spelling errors.
Any remaining items with no unique match could be handled manually.
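<?php
// Reduce a product name to a canonical form: strip punctuation,
// split on whitespace, and sort the words so their order no longer matters.
function canonicalName($name)
{
    // Keep only letters, digits, and whitespace.
    $name = preg_replace('/[^0-9a-z\s]/i', '', $name);
    // Split on runs of whitespace, dropping empty pieces.
    $words = preg_split('/\s+/', $name, -1, PREG_SPLIT_NO_EMPTY);
    sort($words);
    return join(' ', $words);
}

$cName = canonicalName("Barbie Funhouse");
$metaphone = metaphone($cName);
echo "cName = $cName, metaphone = $metaphone\n";

$cName = canonicalName("Funhouse Barbie!");
$metaphone = metaphone($cName);
echo "cName = $cName, metaphone = $metaphone\n";
?>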
The above outputs:
cName = Barbie Funhouse, metaphone = BRBFNHS
cName = Barbie Funhouse, metaphone = BRBFNHS
You can update each product with a cName and metaphone field.
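To make the three steps concrete, here is a rough sketch of how the matching could go once both product lists are loaded into arrays keyed by id. The names indexBy() and matchProducts(), the sample products, and the ids are all made up for illustration; canonicalName() is repeated only so the snippet runs on its own. It assumes a match counts only when exactly one candidate exists, and everything else is left for manual review.

```php
<?php
// Same normalization as canonicalName() in the post, repeated so this runs standalone.
function canonicalName($name)
{
    $name = preg_replace('/[^0-9a-z\s]/i', '', $name);
    $words = preg_split('/\s+/', $name, -1, PREG_SPLIT_NO_EMPTY);
    sort($words);
    return join(' ', $words);
}

// Hypothetical helper: group product ids by a key derived from the name.
function indexBy(array $products, $keyFn)
{
    $index = array();
    foreach ($products as $id => $name) {
        $index[call_user_func($keyFn, $name)][] = $id;
    }
    return $index;
}

// Hypothetical helper: match ids in $a to ids in $b.
// Step 1: exact canonical name. Step 2: metaphone of the canonical name.
// Anything without a unique candidate ends up in $unmatched.
function matchProducts(array $a, array $b)
{
    $byName = indexBy($b, 'canonicalName');
    $byMetaphone = indexBy($b, function ($n) {
        return metaphone(canonicalName($n));
    });

    $matched = array();
    $unmatched = array();
    foreach ($a as $idA => $name) {
        $cName = canonicalName($name);
        $mKey = metaphone($cName);
        if (isset($byName[$cName]) && count($byName[$cName]) === 1) {
            $matched[$idA] = $byName[$cName][0];
        } elseif (isset($byMetaphone[$mKey]) && count($byMetaphone[$mKey]) === 1) {
            $matched[$idA] = $byMetaphone[$mKey][0];
        } else {
            $unmatched[] = $idA;
        }
    }
    return array($matched, $unmatched);
}

// Made-up sample data standing in for the two tables.
$tableA = array(1 => "Barbie Funhouse", 2 => "Barbi Funhouse", 3 => "Lego Castle");
$tableB = array(10 => "Funhouse, Barbie!", 20 => "Hot Wheels Track");

list($matched, $unmatched) = matchProducts($tableA, $tableB);
print_r($matched);
print_r($unmatched);
```

Metaphone is fairly coarse, so it may be worth eyeballing the step 2 matches before trusting them.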
On Sep 28, 5:07 am, "tower....@gmail.com" <tower....@gmail.com> wrote:
> Hello.
>
> I have a problem.
> I have two tables of products in a MySQL database. I need to write a PHP
> script that will link products by name.
> But the names can differ somewhat (different word order, additional
> spaces, commas, hyphens, etc.)
>
> Can someone recommend a PHP class or show an example of how to do
> this?
>
> Thanks.