Re: [PHP] Re: Filtering URLs problem..

Posted by Al on 12/23/05 17:06

Jochem Maas wrote:
> Al wrote:
>
>> I didn't fully test this; but it should get you started.
>
>
> fully? more like not at all.
>
> point 1:
>
> "%<a\040href\040*=['"]$types://((www.)*[\w/\.]+)['"]>.+</a>%i";
> ^-- double quotes are not escaped == parse error
>
> point 2:
>
> "%<a\040href\040*=['"]$types://((www.)*[\w/\.]+)['"]>.+</a>%i";
> ^-- this will inject the string 'Array' into the regexp
> string
>
>
> point 3:
>
> the regexp does not take into account that HTML tag attributes can
> occur in any order e.g:
>
> <a class="mine" id="abc123" target="_top" href="www.bla.com" >
> testing
> </a>
>
>
> point 4:
>
> what happens when the url does not have a protocol specified?
> granted the OP did not actually specify if strings like:
>
> "www.google.com"
>
> should also be considered as a url, so this is not really a valid point.
>
>>
>> $types= array('http', 'ftp', 'https', 'mms', 'irc');
>>
>> $pattern=
>> "%<a\040href\040*=['"]$types://((www.)*[\w/\.]+)['"]>.+</a>%i"; //
>> the "i" makes it non case sensitive
>>
>> if(preg_match($pattern, $URL_str, $match)){
>>
>> $URL= match[1];
>> }
>>
>> else{
>>
>> User did not enter a complete link; do the simple thing
>> }
>>
>>
>>
>> Anders Norrbring wrote:
>>
>>>
>>> I'm writing a filter/parsing function for texts entered by users, and
>>> I've run into a problem...
>>> What I'm trying to do is to parse URLs of different sorts, ftp, http,
>>> mms, irc etc and format them as links, that part was real easy..
>>>
>>> The hard part is when a user has already entered a complete link..
>>> In short:
>>>
>>> http://www.server.tld/page.html
>>> should be converted to:
>>> <a
>>> href='http://www.server.tld/page.html'>http://www.server.tld/page.html</a>
>>>
>>>
>>> That part works fine, but if the user enters:
>>>
>>> <a href='http://www.server.tld/page.html'>click here</a>
>>>
>>> it all becomes a mess... Can somebody please make a suggestion on this?
>>
>>
>>
Jochem's correct. I was in too big a hurry trying to help, since it was obvious that Anders was not getting much useful
help. Jochem's points 3 and 4 are valid; I did not address them because they require more work than I have time to devote.

Here is the corrected code. It works in the "Regex Coach"; I have not tried it in a PHP script.

$types = '(http|ftp|https|mms|irc)'; // must be a quoted string

// \040 is a space; the "i" makes it non case sensitive
$pattern = "%<a\040href\040*=['\"]$types://((www\.)*[\w/\.]+)['\"]>.+</a>%i";

if (preg_match($pattern, $URL_str, $match)) {
    // $match[1] is the protocol, $match[2] is the host and path after "://"
    $URL = $match[2];
}
else {
    // User did not enter a complete link; do the simple thing
}
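
Since the pattern above was only tried in the Regex Coach, here is a minimal sketch of how it might be exercised in an actual PHP script. The function name find_link_target() and the sample strings are made up for illustration. The pattern is loosened in one possible way to cover point 3 (other attributes before or after href), and it allows hyphens in the host and path; it still does nothing about point 4, so a link without a protocol is treated as "no complete link".

```php
<?php
// Hypothetical helper, not from the original post: returns the full URL of
// the first complete <a> link found in $text, or null if there is none.
function find_link_target($text)
{
    $types = '(http|ftp|https|mms|irc)';

    // (?:[^>]*\s)? lets other attributes come before href (point 3),
    // [^>]* after the closing quote lets attributes follow it,
    // .+? keeps the link text non-greedy, "s" lets it span several lines.
    $pattern = "%<a\s+(?:[^>]*\s)?href\s*=\s*['\"]$types://((www\.)*[\w/.-]+)['\"][^>]*>.+?</a>%is";

    if (preg_match($pattern, $text, $match)) {
        // $match[1] is the protocol, $match[2] the host and path
        return $match[1] . '://' . $match[2];
    }

    return null; // no complete link; the caller does the simple thing
}

// A complete link as in Anders' example
echo find_link_target("<a href='http://www.server.tld/page.html'>click here</a>"), "\n";
// http://www.server.tld/page.html

// Attributes in a different order, as in Jochem's point 3
echo find_link_target('<a class="mine" target="_top" href="ftp://files.server.tld/pub" >testing</a>'), "\n";
// ftp://files.server.tld/pub

// A bare URL is not a complete link
var_dump(find_link_target('http://www.server.tld/page.html'));
// NULL
```

Returning null for the "no complete link" case keeps the if/else from the code above: the caller can fall back to the simple URL-to-link conversion whenever nothing is returned. The character class is still narrow, so URLs with query strings or ports would need it widened.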

 
