Posted by Tim on 06/09/06 21:21
vito wrote:
> for an array element $fields[$j] containing:
>
> gb:AF367205.1 DB_XREF=gi:17225988 TID=Os.1005.1 CNT=36 FEA=FLmRNA
> TIER=FL+Stack STK=9 UG=Os.1005 DEF=Oryza sativa 1-deoxy-D-xylulose
> 5-phosphate reductoisomerase precursor, mRNA, complete cds; nuclear gene for
> plastid product. PROD=1-deoxy-D-xylulose 5-phosphate
> reductoisomeraseprecursor FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1
> REP_ORG=O. sativa
>
> i try to extract useful content by:
>
> if (preg_match("/PROD=(.+)\s{2}/", $fields[$j], $match ) )
> $fields[$j] = $match[1];
> else if (preg_match("/UG_TITLE=(.+)\s{2}/", $fields[$j], $match ) )
> $fields[$j] = $match[1];
> else if (preg_match("/DEF=(.+)\s{2}/", $fields[$j], $match ) )
> $fields[$j] = $match[1];
>
>
> i have confirmed it is 2 spaces (i.e. not tab, linefeed, new line). i just
> don't know why sometimes it gives me:
>
> PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor
> FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1
>
> or more (i.e. run-on matching). i don't know if it deals anything with
> matching as much as possible.
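Yeah, that's it.
(.+)\s{2}
is a greedy match: it will match up to the last two spaces it finds, not the first.
Adding a ? after the + (inside the brackets) makes it a non-greedy match:
(.+?)\s{2}
Note that putting the ? after the closing bracket, as in (.+)?\s{2}, only makes the group optional; the .+ inside is still greedy.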
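To see the difference side by side, here is a minimal sketch. The sample string is a shortened, hypothetical version of the record above, and it assumes the fields really are separated by exactly two spaces (the quoted text shows single spaces, which may just be the page collapsing them).

```php
<?php
// Hypothetical sample record; double spaces between fields are an assumption.
$field = "DEF=Oryza sativa mRNA.  PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor  FL=gb:AK059692.1 gb:AK099702.1  REP_ORG=O. sativa";

// Greedy: .+ grabs everything to the end of the string, then backtracks
// only as far as the LAST pair of whitespace characters.
preg_match('/PROD=(.+)\s{2}/', $field, $greedy);

// Non-greedy: .+? expands one character at a time and stops at the
// FIRST pair of whitespace characters after PROD=.
preg_match('/PROD=(.+?)\s{2}/', $field, $lazy);

echo "greedy: ", $greedy[1], "\n";
echo "lazy:   ", $lazy[1], "\n";
```

With this input the greedy capture should run on through the FL= field, while the non-greedy one should end at "reductoisomeraseprecursor".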
Tim