Re: Regex help

Posted by Steve on 10/15/07 13:03

"Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
news:K-qdnTSkY4NaoI7anZ2dnUVZ_j6dnZ2d@comcast.com...
> Steve wrote:
>> "Jerry Stuckle" <jstucklex@attglobal.net> wrote in message
>> news:KaadnQnnGt0WT4_anZ2dnUVZ_tajnZ2d@comcast.com...
>>> OK, I give up here. I am DEFINITELY not a Regex expert, and have been
>>> working on this for hours with no luck.
>>>
>>> Basically I need to parse a page for certain information which will be
>>> fed back into CURL to post to a site. I need to find four types of tags
>>> on the page:
>>>
>>> <input type=hidden name=a1 value=b1>
>>> <input type=text name=a2>
>>> <input type=submit name=a3 value=b3>
>>> <select name=a4>
>>>
>>> I don't need any other tags.
>>>
>>> From the hidden and submit types, I need name and value. From the text
>>> and select types, I just need the name.
>>>
>>> I can assume the attributes will always show up in this order, but there
>>> may be other things between the < and > delimiters. Additionally, the
>>> actual type and name may have single or double quotes around them, or
>>> neither.
>>>
>>> Does anyone have some code for this? It doesn't have to be all one
>>> regex.
>>
>> alright, jer. let's see what we can do...
>>
>> here's an eyeballed attempt:
>>
>> <(select\s?[^>].*?)|(input\s[^t]*?type\s*?=\s?('|"|\s)(hidden|text|submit)\3[^>].*?)>
>>
>> to keep it easier, i'd use that just to get your general matches. then,
>> iterating through those, i'd apply another regex to break out the name,
>> type, and value. you could very well catch it all in the above, but it's
>> not as straightforward and hence not easily maintained. if you need
>> additional help on writing this, let me know. i'll pseudo-code the whole
>> enchilada if you want. this should be sufficient for getting only those
>> tags you listed above...which is a good start.
>>
>> btw, make the search case-INsensitive.
>
> Hi, Steve,
>
> Yep, it's a start. Some problems (output below), but I think it will get
> me a little farther.
>
> And you're right, I already gave up on getting everything in one pass. I
> was thinking of trying to just get everything for a single element type
> (i.e. all <input type=text ...> elements), but this gives me another idea,
> also.
>
> And the output from the first try:
>
> Array
> (
>     [0] => Array
>         (
>             [0] => <select n
>             [1] => <select n
>             [2] => <select n
>         )
>
>     [1] => Array
>         (
>             [0] => select n
>             [1] => select n
>             [2] => select n
>         )
>
>     [2] => Array
>         (
>             [0] =>
>             [1] =>
>             [2] =>
>         )
>
>     [3] => Array
>         (
>             [0] =>
>             [1] =>
>             [2] =>
>         )
>
>     [4] => Array
>         (
>             [0] =>
>             [1] =>
>             [2] =>
>         )
>
> )
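
That output is consistent with how the posted pattern groups. The top-level | splits it into two branches, so the leading < belongs only to the select branch and the trailing > only to the input branch. Nothing follows the lazy .*? in the select branch, so it matches zero characters and the match stops at "<select n". The input branch also insists on a quote or whitespace right after type=, so an unquoted type=hidden cannot match. A minimal reproduction, written in Python rather than PHP (an assumption that its re module treats this particular pattern the same way; the sample markup is made up from the tags in the question):

```python
import re

# The pattern exactly as posted, compiled case-insensitively.
posted = re.compile(
    r"""<(select\s?[^>].*?)|(input\s[^t]*?type\s*?=\s?('|"|\s)(hidden|text|submit)\3[^>].*?)>""",
    re.IGNORECASE,
)

# Hypothetical sample markup, modelled on the tag types listed in the question.
sample = (
    "<input type=hidden name=a1 value=b1>"
    "<input type=text name=a2>"
    "<select name=a4>"
)

# Collect the full text of every match.
hits = [m.group(0) for m in posted.finditer(sample)]
print(hits)  # ['<select n']
```

Only the select tag is found, and only its first few characters, which lines up with the array above.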

well, that's not so good a start! i'll break out the old regex ide and fix
that...if you want.
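
For reference, here is a rough sketch of the two-pass idea described earlier in the thread: one regex to pick out the input and select tags, then a second regex run over each tag's attribute text. It is written in Python for illustration, not PHP, and it is not the fix promised above. The names TAG_RE, ATTR_RE, and extract_fields are made up for this example. It assumes no attribute value contains a > character, which is the kind of thing a real HTML parser would handle and a regex will not.

```python
import re

# Pass 1: the tag name and the raw attribute text of every <input> or <select>.
TAG_RE = re.compile(r"<(input|select)\b([^>]*)>", re.IGNORECASE)

# Pass 2: one attribute at a time; the value may be double-quoted,
# single-quoted, or bare (unquoted).
ATTR_RE = re.compile(
    r"""([\w-]+)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'>]+))"""
)

WANTED_INPUT_TYPES = {"hidden", "text", "submit"}


def extract_fields(html):
    """Return one dict per wanted tag, with its name and (where needed) value."""
    fields = []
    for tag, attr_text in TAG_RE.findall(html):
        attrs = {}
        for key, dq, sq, bare in ATTR_RE.findall(attr_text):
            # At most one of the three value groups is non-empty.
            attrs[key.lower()] = dq or sq or bare

        if tag.lower() == "select":
            fields.append({"tag": "select", "name": attrs.get("name")})
            continue

        kind = attrs.get("type", "").lower()
        if kind not in WANTED_INPUT_TYPES:
            continue  # skip checkboxes, radios, inputs with no type, etc.

        field = {"tag": "input", "type": kind, "name": attrs.get("name")}
        if kind in ("hidden", "submit"):
            # Only hidden and submit inputs need their value posted back.
            field["value"] = attrs.get("value", "")
        fields.append(field)
    return fields


# Hypothetical markup mixing unquoted, single-quoted, and double-quoted attributes.
sample = """<form>
<input type=hidden name=a1 value=b1>
<input class="x" type="text" name='a2'>
<INPUT type='submit' name="a3" value="Go now">
<select name=a4 size=1>
</form>"""

for field in extract_fields(sample):
    print(field)
```

Splitting the work this way keeps each pattern short, and the quoting cases (single, double, none) are handled in one place instead of being threaded through the tag-matching regex.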
