Posted by WombatBoy on 10/17/73 11:25
I've been doing something similar myself, but wanted to avoid the chance of
getting an accidental early string match.
The strpos() function will let you locate a string within another string
(I'm assuming here that you've got the whole html page as a single string),
and, if required, you can specify a starting position.
So something like
$p1 = strpos($rec,"</head>");
would let you get beyond the html header, then
$p2 = strpos($rec," by ",$p1);
would let you find the first occurrence of " by " beyond position $p1 (or
maybe "by<", depending whether there's a space there or not)
then you can search for <b> and </b> in the same way, adjust your sums a
bit, and get
$author = substr($rec,$start,$length);
where, if $p1 and $p2 now hold the positions of <b> and </b>, $start is
$p1+3 (to step over the three characters of <b>) and $length is $p2-$p1-3.
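Put together, the steps above might look roughly like the sketch below. It is only a sketch under a few assumptions: find_author() is a name I've made up, the whole page is in one string, the html header closes with </head>, and stripos() (PHP 5 and later) is used so that "By" and "by" both match. Searching for a bare "by " can still hit a word such as "nearby ", so check it against your real files.

```php
<?php
// Hypothetical helper (not from the original script): return the author
// name that follows the first "by " in $rec, or null if there is none.
function find_author($rec)
{
    // Skip the html header if there is one. strpos() returns false when
    // nothing is found, so fall back to the start of the string.
    $p1 = strpos($rec, "</head>");
    if ($p1 === false) {
        $p1 = 0;
    }

    // First "by " or "By " beyond $p1 (stripos() ignores case).
    $p2 = stripos($rec, "by ", $p1);
    if ($p2 === false) {
        return null;
    }
    $start = $p2 + 3;   // 3 = strlen("by ")

    // The name runs to the end of the line, or to the end of the string.
    $end = strpos($rec, "\n", $start);
    if ($end === false) {
        $end = strlen($rec);
    }
    $author = trim(substr($rec, $start, $end - $start));

    // If the name is wrapped in <b>...</b>, keep only what is between the tags.
    $b1 = strpos($author, "<b>");
    $b2 = strpos($author, "</b>");
    if ($b1 !== false && $b2 !== false && $b2 > $b1) {
        $author = substr($author, $b1 + 3, $b2 - $b1 - 3);
    }
    return $author;
}
```

You would call it as $author = find_author($rec); and test the result against null before using it.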
Hope this helps. As an alternative you might try the explode function using
" by " as the string to split $rec on, and then check each array element.
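The explode idea could be sketched along these lines. Again it rests on assumptions: authors_by_explode() is a made-up name, and I've used strip_tags() to drop the <b> tags rather than searching for them. Bear in mind that explode() is case-sensitive and that " by " needs a space in front of it, so a "by" sitting at the very start of a line won't be split on.

```php
<?php
// Hypothetical helper: split $rec on " by " and return the first line of
// every piece that followed a " by ", with any html tags removed.
function authors_by_explode($rec)
{
    $parts = explode(" by ", $rec);
    array_shift($parts);    // the first piece is the text before any " by "

    $candidates = array();
    foreach ($parts as $part) {
        // Only the text up to the next new line can be the name.
        $lines = explode("\n", $part);
        $name = trim(strip_tags($lines[0]));
        if ($name !== "") {
            $candidates[] = $name;
        }
    }
    return $candidates;
}
```

Each element of the returned array is only a candidate, so you would still need to check which one is really the author.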
"Epetruk" <nobody@blackhole.com> wrote in message
news:3njvqpF1sm7fU1@individual.net...
> Hi,
>
> I'm having to modify a PHP script even though I have little knowledge of
> PHP itself. The script extracts specific strings from an html file, and I
> need it to extract some further information.
>
> Specifically, each file represents an article written by an author. The
> author's name is typically preceded by a 'By' or a 'by', then it goes on
> till there's a carriage return.
>
> So for example, the file might contain something like this:
>
>
> The Need For Regeneration
>
> by <b>John Smith</b>
>
> We have seen the waste that has been produced....
>
> (rest of article)
>
>
> or
>
>
> How To Make Lots and Lots of Money Writing PHP
>
> by The Supreme Coder
>
> The first thing you need to know about making money is...
>
> (rest of article)
>
>
> So I need code that will start searching the file from the beginning for
> the words 'by ' or 'By ', then grab everything that follows that until it
> gets to a new line and assign that to a variable. In the examples I have
> given above, it would grab '<b>John Smith</b>' and 'The Supreme Coder'.
> I've seen a function called preg_match which might do the job, but it uses
> regular expressions which I have little knowledge of.
>
> Would any person be so kind as to post what arguments I would need to call
> this function with?
>
> TIA,
>
> --
> Akin
>
> aknak at aksoto dot idps dot co dot uk
>
>