Seemingly weird regex problem

Posted by Tim Boring on 01/20/05 19:52

Hello! I'm having an odd regex problem. Here's a summary of what I'm
trying to accomplish:

I've got a report file generated by our business management system
(Progress 4GL), one fixed-width record per line. I have a PHP script
that reads the raw file one line at a time and strips out any
unwanted lines (mostly repeated column headings).

I'm stripping out unwanted lines by looking at the beginning of each
line and doing the following:
1. If the line begins with a non-word character (\W+), discard it;
2. If the line begins with the word "Vendor", discard it;
3. If the line begins with "Loc", discard it;
4. If the line begins with a dash, discard it;
5. Else keep the line and write it to an output file.

I've implemented this with the code snippet below. The problem I'm
encountering, however, is that any line that begins with a word, such
as "AKRN", matches rule #1 and is discarded. This is not what I want,
but I'm having difficulty spotting my mistake.

To help spot the issue, I added the if (preg_match("/^\W+/", $line))
check ahead of the switch. The weird thing is that this check doesn't
print lines beginning with things like "AKRN", yet those same lines are
caught by rule #1 in the switch statement and discarded.

Any suggestions?

while (!feof($input_handle))
{
    $line = fgets($input_handle);

    if (preg_match("/^\W+/", $line))
    {
        echo "$line\n";
    }

    switch ($line)
    {
        case ($total_counter <= 5):
            fwrite($output_handle, $line);
            $counter++;
            $total_counter++;
            break;
        // Rule #1: non-word character
        case preg_match("/^\W+/", $line):
            array_push($tossed_lines, $line);
            echo "Rule #1 violation\n";
            $tossed_counter++;
            $total_counter++;
            break;
        // Rule #2: "Vendor" at beginning of line
        case preg_match("/^Vendor/i", $line):
            array_push($tossed_lines, $line);
            echo "Rule #2 violation\n";
            $tossed_counter++;
            $total_counter++;
            break;
        // Rule #3: "Loc" at beginning of line
        case preg_match("/^Loc/i", $line):
            array_push($tossed_lines, $line);
            echo "Rule #3 violation\n";
            $tossed_counter++;
            $total_counter++;
            break;
        // Rule #4: dash character at beginning of line
        case preg_match("/^\-/", $line):
            array_push($tossed_lines, $line);
            echo "Rule #4 violation\n";
            $tossed_counter++;
            $total_counter++;
            break;
        default:
            fwrite($output_handle, $line);
            $counter++;
            $total_counter++;
            break;
    }
}

--
Tim Boring
IT Department, Automotive Distributors
Toll Free: 800-421-5556 x3007
Direct: 614-532-4240
E-mail: tboring@adw1.com
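
A likely cause, offered as a sketch rather than a confirmed diagnosis:
switch ($line) loosely compares $line (with ==) against the value of each
case expression. preg_match() returns the integer 0 when there is no
match, and in PHP versions before 8 a non-numeric string such as
"AKRN ..." is converted to 0 when compared with an integer. So
"AKRN ..." == 0 is true, and the rule #1 branch runs exactly when the
pattern does not match. One common workaround is to switch on true and
make every case a strict boolean. The helper name discard_rule() below is
hypothetical (it is not in the original post), and the type declarations
assume PHP 7 or later.

```php
<?php
// Hypothetical helper, not from the original post: returns the number of
// the first discard rule a line matches, or 0 if the line should be kept.
function discard_rule(string $line): int
{
    // switch (true) compares true against each case value, so each case
    // is written as a real boolean; preg_match() returns int 1 or 0,
    // hence the "=== 1".
    switch (true) {
        case preg_match('/^\W/', $line) === 1:       // Rule #1: non-word character
            return 1;
        case preg_match('/^Vendor/i', $line) === 1:  // Rule #2: "Vendor"
            return 2;
        case preg_match('/^Loc/i', $line) === 1:     // Rule #3: "Loc"
            return 3;
        case preg_match('/^-/', $line) === 1:        // Rule #4: dash; never reached,
            return 4;                                // because "-" already matches \W
        default:
            return 0;                                // keep the line
    }
}

// Example: a line starting with a word is kept, a heading line is not.
var_dump(discard_rule("AKRN  12345\n"));   // int(0)
var_dump(discard_rule("Vendor  Name\n"));  // int(2)
```

In the original loop, the ($total_counter <= 5) check is already a
boolean, so it should work unchanged as the first case of switch (true).
A plain if / elseif chain would express the same rules without relying
on switch's loose comparison at all.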

 
