Posted by deciacco on 02/09/07 20:46
How about:
 if (preg_match('/\\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/i', $subject, $result)) {
     $url = $result[0];
 } else {
     $url = "";
 }
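 For what it's worth, here is a rough sketch of that pattern wrapped in a function and run against the sample string from the original question. The function name extract_first_url() is made up for illustration, and the parse_url() step for pulling out the host is an optional extra, not part of the snippet above.

```php
<?php
// Hypothetical helper: returns the first URL found in $subject, or "" if none.
function extract_first_url($subject)
{
    // Same pattern as in the snippet above. The final character class leaves
    // out trailing punctuation such as '.', ',' or ';'.
    $pattern = '/\\b(https?|ftp|file):\/\/[-A-Z0-9+&@#\/%?=~_|!:,.;]*[-A-Z0-9+&@#\/%=~_|]/i';

    if (preg_match($pattern, $subject, $result)) {
        return $result[0];
    }
    return "";
}

// Sample agent string taken from the original question.
$agent = 'registered NYSE 943 <a href="http://netforex.net"> Forex Trading Network Organization </a> info@netforex.org';

$url = extract_first_url($agent);

// parse_url() with PHP_URL_HOST returns only the host part of the URL.
$host = ($url !== "") ? parse_url($url, PHP_URL_HOST) : null;

echo $url, "\n";   // http://netforex.net
echo $host, "\n";  // netforex.net
```

 Note that this only grabs the first URL in the string. If an agent string may contain several, preg_match_all() with the same pattern would be the natural variation.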
 
 -----Original Message-----
 From: deko [mailto:deko@nospam.com]
 Posted At: Friday, February 09, 2007 2:15 PM
 Posted To: comp.lang.php
 Conversation: Best way to extract URL from random string?
 Subject: Best way to extract URL from random string?
 
 If I have random and unpredictable user agent strings containing URLs,
 what is the best way to extract the URL?
 
 For example, let's say the string looks like this:
 
 registered NYSE 943 <a href="http://netforex.net"> Forex Trading Network Organization </a> info@netforex.org
 
 What's the best way to extract http://netforex.net ?
 
 I have code that checks for identifiable browsers and bots, but when the
 agent string has no identifiable information other than a URL, I want to
 grab the URL.
 
 Here's a first crack at it:
 ..
 ..
 ..
 [code omitted]
 ..
 ..
 ..
 elseif (eregi("http://", $agent))
 {
     $agent = stristr($agent, "http://");
     $agent = parse_url($agent);
     $agent = $agent['host'];
     //check for subdomains
     $agent_a = explode(".", $agent);
     $agent_r = array_reverse($agent_a);
     $sub = count($agent_r) - 1;
     $tld3 = substr($agent_r[0], 0, 3);
     if (eregi("^(com|net|org|edu|biz|gov)$", $tld3)) //common tld's
     {
         while ($sub > 0)
         {
             $domain = $domain.$agent_r[$sub].".";
             $sub--;
         }
         $refurl = $domain.$tld3;
     }
     $referrer = "<a href='".$refurl."'>".$refurl."</a>";
 }
 else
 {
     $referrer = "unknown";
 }
 
 Are there any PHP functions that will help here? How should I handle
 subdomains? What about international domains?
 
 Thanks in advance.