You are here: Re: [PHP] geographic search engine « PHP « IT news, forums, messages
Re: [PHP] geographic search engine

Posted by Richard Lynch on 10/04/65 11:05

Brian Dunning wrote:
> On Jan 11, 2005, at 12:39 PM, Greg Donald wrote:
>
>> This does not take into account the curve of the earth. In addition
>> you'll need a db with the latitude and longitude for each zip code.

If you only want "close" things to a given location, the curvature of the
earth is pretty insignificant.

I used to have that long-ass trigonometric formula in a user-defined
PostgreSQL function so I could implement geogrphical search here:
http://chatmusic.com/venues.htm#search

I realized one day that at a distance of a hundred miles or less, I just
didn't *CARE* about curvature of the earth, and replaced that trig with
your basic Cartesian distance.

Made a *HUGE* difference in performance.

It always amazes me that I was so foolish as to listen to all the
"experts" who told me I needed this really hairy complex thing when, in
reality, what I needed was something MUCH simpler.

Oh, and another Free Tip:

Once you load in the 65000+ zip codes for the US (plus whatever other
countries you need/find) you'll be doing a JOIN with your search table
with that.

65000 X (number of records in your table) ====> WAY TOO MANY RECORDS!!!

So it's time to break the cardinal rule of good database design.

Sort of.

Sanely.

Put a longitude and latitude column on your existing table, and default it
to NULL.

Write your business logic so that *ANY* time a zip code is changed, the
longitude/latitude is re-set to NULL. (Or use triggers or whatever you
want to make this happen. I don't care.)

Finally, write a cron job (scheduled task in Windoze) to find N records at
random in your table where the long/lat is NULL, and *copy* over the
long/lat from the zips table.

Now, your search only has to deal with however many records are in your
table. Not 65000 *times* that many.

You're breaking the "rules" of database design in a sensible, maintainable
way for a HUGE performance gain.

Before I figured this out, I must have brought my server to a crawl I
don't know how many times with a perfectly "reasonable" query... that
involved millions of intermediary tuples when I really only expected a
dozen to actually come out in the end.

PPS You can pay $$$ for the "complete" databases of zips, or use the TIGER
data for free just interpolate from existing entries to make up long/lats
for new zips. EG: If tomorrow the USPS creates zip code 60609, I can be
pretty damn sure it's "close enough" to 60601 through 60608 and just
average them to make up bogus long/lat. Sure, it's "wrong". It's also
FREE and "close enough" for what I (and almost for sure you) are doing.

--
Like Music?
http://l-i-e.com/artists.htm

 

Navigation:

[Reply to this message]


Удаленная работа для программистов  •  Как заработать на Google AdSense  •  England, UK  •  статьи на английском  •  PHP MySQL CMS Apache Oscommerce  •  Online Business Knowledge Base  •  DVD MP3 AVI MP4 players codecs conversion help
Home  •  Search  •  Site Map  •  Set as Homepage  •  Add to Favourites

Copyright © 2005-2006 Powered by Custom PHP Programming

Сайт изготовлен в Студии Валентина Петручека
изготовление и поддержка веб-сайтов, разработка программного обеспечения, поисковая оптимизация