Best way to extract URL from random string?

Posted by deko on 02/09/07 20:15

If I have random and unpredictable user agent strings containing URLs, what is
the best way to extract the URL?

For example, let's say the string looks like this:

registered NYSE 943 <a href="http://netforex.net"> Forex Trading Network
Organization </a> info@netforex.org

What's the best way to extract http://netforex.net ?
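
One possible approach, shown here only as a sketch: match the first run of characters that starts with http:// or https:// and stops at whitespace, a quote, or an angle bracket. The function name extract_first_url is hypothetical and not part of the original code, and the delimiter set is an assumption that fits the example string above.

```php
<?php
// Hypothetical helper (not from the original post): returns the first
// http:// or https:// URL found in $text, or null when there is none.
function extract_first_url(string $text): ?string
{
    // Stop at whitespace, quotes, and angle brackets, which commonly
    // delimit a URL embedded in HTML or free text (an assumption).
    if (preg_match('~https?://[^\s"\'<>]+~i', $text, $m)) {
        return $m[0];
    }
    return null;
}

$agent = 'registered NYSE 943 <a href="http://netforex.net"> Forex Trading '
       . 'Network Organization </a> info@netforex.org';

$url  = extract_first_url($agent);
// parse_url() is only called on the cleaned-up URL, not the raw string.
$host = $url !== null ? parse_url($url, PHP_URL_HOST) : null;
```

Running the match before parse_url() keeps trailing markup and text out of the host component.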

I have code that checks for identifiable browsers and bots, but when the agent
string has no identifiable information other than a URL, I want to grab the URL.

Here's a first crack at it:
..
..
..
[code omitted]
..
..
..
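elseif (stripos($agent, "http://") !== false)
{
    // keep everything from the first "http://" onward
    $agent = stristr($agent, "http://");
    $parts = parse_url($agent);
    $host = (is_array($parts) && isset($parts['host'])) ? $parts['host'] : "";
    // split the host into labels; the last label may carry trailing junk,
    // so only its first three characters are treated as the TLD
    $labels = explode(".", $host);
    $tld3 = substr(array_pop($labels), 0, 3);
    $refurl = "";
    if ($labels && preg_match('/^(com|net|org|edu|biz|gov)$/i', $tld3)) // common TLDs
    {
        // rejoin the remaining labels (subdomains included) in original order
        $refurl = "http://".implode(".", $labels).".".$tld3;
    }
    if ($refurl != "")
    {
        $safe = htmlspecialchars($refurl, ENT_QUOTES);
        $referrer = "<a href='".$safe."'>".$safe."</a>";
    }
    else
    {
        $referrer = "unknown";
    }
}
else
{
    $referrer = "unknown";
}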

Are there any PHP functions that will help here? How should I handle subdomains?
What about international domains?
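
For the subdomain and international-domain questions, a rough sketch under stated assumptions: base_domain and ascii_host are hypothetical helper names, and the short list of two-level suffixes is illustrative only (a complete list would have to come from something like the Public Suffix List). The IDN step relies on idn_to_ascii() from PHP's intl extension and is skipped when that extension is not loaded.

```php
<?php
// Hypothetical helper: reduces a host name to its registrable domain.
// $twoLevelSuffixes is an illustrative, incomplete list (an assumption).
function base_domain(string $host): string
{
    $twoLevelSuffixes = ['co.uk', 'org.uk', 'com.au', 'co.jp'];
    $labels = explode('.', strtolower(rtrim($host, '.')));
    $count  = count($labels);
    if ($count <= 2) {
        return implode('.', $labels);
    }
    $lastTwo = $labels[$count - 2] . '.' . $labels[$count - 1];
    // Keep three labels when the last two form a known two-level suffix.
    $keep = in_array($lastTwo, $twoLevelSuffixes, true) ? 3 : 2;
    return implode('.', array_slice($labels, -$keep));
}

// Hypothetical helper: converts an internationalized host to its ASCII
// (punycode) form when the intl extension is available.
function ascii_host(string $host): string
{
    if (function_exists('idn_to_ascii')) {
        $ascii = idn_to_ascii($host, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);
        return $ascii === false ? $host : $ascii;
    }
    return $host; // intl not loaded: leave the host unchanged
}

echo base_domain('www.netforex.net'), "\n"; // netforex.net
echo base_domain('news.bbc.co.uk'), "\n";   // bbc.co.uk
```

A fixed whitelist of TLDs, as in the code above, will miss country-code domains, which is why this sketch falls back to the last two labels.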

Thanks in advance.
