Posted by vito on 06/09/06 12:08
for an array element $fields[$j] containing:
gb:AF367205.1 DB_XREF=gi:17225988 TID=Os.1005.1 CNT=36 FEA=FLmRNA
TIER=FL+Stack STK=9 UG=Os.1005 DEF=Oryza sativa 1-deoxy-D-xylulose
5-phosphate reductoisomerase precursor, mRNA, complete cds; nuclear gene for
plastid product. PROD=1-deoxy-D-xylulose 5-phosphate
reductoisomeraseprecursor FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1
REP_ORG=O. sativa
i try to extract useful content by:
if (preg_match("/PROD=(.+)\s{2}/", $fields[$j], $match ) )
    $fields[$j] = $match[1];
else if (preg_match("/UG_TITLE=(.+)\s{2}/", $fields[$j], $match ) )
    $fields[$j] = $match[1];
else if (preg_match("/DEF=(.+)\s{2}/", $fields[$j], $match ) )
    $fields[$j] = $match[1];
i have confirmed the separator is 2 spaces (i.e. not a tab, linefeed, or
newline). i just don't know why it sometimes gives me:
PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor
FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1
or more (i.e. run-on matching). i don't know if it has anything to do with
the regex matching as much as possible. i also tried:
"/DEF=(.+)\s{2}[A-Z].*/" but it still doesn't work. BTW, because i know
that sometimes what i want is at the end, i also used "/DEF=(.+)$|\s{2}/",
but that doesn't work either.
i would really appreciate it if anybody could help with this. i feel really
desperate, as i find it difficult to find relevant documentation that
explains why this special case fails. thanks a lot
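
A possible explanation, offered as a sketch rather than a confirmed diagnosis: in PCRE the quantifier in "(.+)" is greedy, so it first runs to the end of the string and then backtracks only as far as the last pair of whitespace characters, which would give exactly this kind of run-on match. Separately, in "/DEF=(.+)$|\s{2}/" the "|" splits the whole pattern, so it reads as "DEF=(.+)$" or "\s{2}" rather than "two spaces or end of string". The example below assumes the KEY=value pairs are separated by exactly two spaces; the sample string is abridged from the data above, and the helper name extract_value is made up for illustration.

```php
<?php
// Sample field, assuming two spaces between KEY=value pairs (abridged data).
$field = implode("  ", [
    "UG=Os.1005",
    "DEF=Oryza sativa 1-deoxy-D-xylulose 5-phosphate reductoisomerase precursor",
    "PROD=1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor",
    "FL=gb:AK059692.1 gb:AK099702.1 gb:AF367205.1",
    "REP_ORG=O. sativa",
]);

// Greedy version from the question: (.+) stops at the LAST double space.
preg_match('/PROD=(.+)\s{2}/', $field, $greedy);
echo $greedy[1], "\n";  // runs on through the FL=... part

// Hypothetical helper: lazy (.+?) stops at the FIRST double space, and the
// grouped alternation (?:\s{2}|$) also accepts the end of the string.
function extract_value(string $key, string $field): ?string
{
    $pattern = '/\b' . preg_quote($key, '/') . '=(.+?)(?:\s{2}|$)/';
    return preg_match($pattern, $field, $m) ? $m[1] : null;
}

// Same fallback order as the original if / else if chain.
$value = extract_value('PROD', $field)
    ?? extract_value('UG_TITLE', $field)
    ?? extract_value('DEF', $field);

echo $value, "\n";                            // 1-deoxy-D-xylulose 5-phosphate reductoisomeraseprecursor
echo extract_value('REP_ORG', $field), "\n";  // O. sativa
```

If the real separator turns out to be something other than two whitespace characters, only the "(?:\s{2}|$)" part should need to change.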