Reply to Re: 'grep' a file - output to browser not working

Posted by Jerry Stuckle on 11/02/07 02:02

Phil wrote:
> On Nov 1, 2:02 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>> Phil wrote:
>>> On Nov 1, 5:30 am, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>>>> Phil wrote:
>>>>> On Oct 31, 9:58 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
>>>>>> Phil wrote:
>>>>>>> I should point out that I know this has something to do with the file-
>>>>>>> size. With a small file it works OK. With a large file it fails to the
>>>>>>> browser. In this case, a large file is up to 3.5GB of uncompressed
>>>>>>> ASCII text. There may be up to 100 files to search - I can do that
>>>>>>> part I think, once I get it to actually search the file! :-) Any help
>>>>>>> GREATLY appreciated.
>>>>>>> Probably I should mention that I am not a full-time developer ... I'm
>>>>>>> quite good with shell scripts, REGEX and procedural languages ... I'm
>>>>>>> a hacker at best with PHP :-) I'm more of a systems person.
>>>>>>> On Oct 31, 5:45 pm, Phil <phillip.corch...@gmail.com> wrote:
>>>>>>>> I cannot figure why this works fine at the command-line of the linux
>>>>>>>> server, but will not output anything to the browser, and no errors in
>>>>>>>> the error_log (syslog). The goal here is to have a page to enter a
>>>>>>>> search term and grep or zgrep pattern matches in the file (will be a
>>>>>>>> log file). Probably there is a better way to do this using php, but
>>>>>>>> this is what I was able to come up with using my limited php-
>>>>>>>> knowledge. Can anyone help me debug this?
>>>>>> (Top posting fixed)
>>>>>> Well, some code would help. But are you possibly running out of memory
>>>>>> and/or execution time? Those are the two main things which cause
>>>>>> problems with large amounts of data but not small.
>>>>>> Anything in your PHP error log?
>>>>> there is squat from logs :(
>>>>> This really seems to be an issue of file size.
>>>>> Even this simple read/echo fails for a file of only 9200 lines, to say
>>>>> nothing of my other files of over 2 million lines.
>>>>> I have max_execution_time set to 3000 sec, so I don't think that is it.
>>>>> I don't see too much in php.ini that controls memory handling.
>>>>> memory_limit is 32MB and post_max_size is 8M (default).
>>>>> This is Apache/2.0.52 and PHP 4.3.9 by the way on a dual dual-core P4
>>>>> 3GHz with 4GB RAM and CentOS 4.4 (32bit) with a 2.6.9-42.0.3.ELsmp
>>>>> kernel
>>>>> $file="/var/log/httpd/access_log";
>>>>> $handle = @fopen("$file", "r");
>>>>> if ($handle) {
>>>>> echo "<pre>";
>>>>> while (!feof($handle)) {
>>>>> $buffer = fgets($handle, 2048);
>>>>> print_r ($buffer);
>>>>> ob_flush();
>>>>> }
>>>>> echo "</pre>";
>>>>> fclose($handle);
>>>>> }
>>>>> Sorry for Top Post. See GP for original code snip.
>>>> OK, in your php.ini file, ensure you have
>>>> error_reporting=E_ALL
>>>> display_errors=On
>>>> for testing purposes.
>>>> If this is a production system, I don't recommend having
>>>> display_errors=On. Rather, I use log_errors=On and set a log file.
>>>> P.S. Don't mind the trolls here. Someone is starved for attention.
>>> I already have those options set.
>>> Actually, I partly found the answer, and it was a beginner's
>>> mistake ... file permissions, as the logs are root-owned and apache/
>>> php runs as apache - a little 'is_readable' helped me out there. Now I
>>> have to decide if I want to have apache run as root, or modify system
>>> log files. It is a 'production' system, but it's an enterprise internal
>>> only single-purpose and with Change Control I can do anything I need
>>> for this.
>> Ah, didn't think about the file not being readable. :-)
>>
>> You NEVER want to run Apache as root. If someone hacks your site, they
>> have access to EVERYTHING.
>>
>>> I'd still be interested in the pros and cons of using external grep
>>> versus some internal preg_grep (and how to make that one work ... I've
>>> been stumped so far).
>>> Thanks! phil
>> Depends on what you're going to do. If all you're doing is searching an
>> array, just use preg_grep(). Or, you can read a small file into an
>> array with file() and use preg_grep().
>>
>> If you need to do more than one file or a large file, then an external
>> grep is probably better.
>>
>> But if I'm searching, I generally have what I'm looking for in a database.
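
For the small-file case quoted above, file() plus preg_grep() can be sketched roughly like this. It is only a minimal sketch: grep_small_file() is a made-up helper name, and it assumes PHP 5 or later (for FILE_IGNORE_NEW_LINES) and a file small enough to fit comfortably under memory_limit.

```php
<?php
// Return the lines of a small file that match a search term, ignoring case.
// file() loads the whole file into an array, so this suits small files only.
// grep_small_file() is a hypothetical helper, not something from the thread.
function grep_small_file($path, $needle)
{
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return array();
    }
    // preg_quote() makes the search term literal; "/" is the delimiter here.
    $pattern = '/' . preg_quote($needle, '/') . '/i';
    // preg_grep() keeps the original array keys; array_values() renumbers them.
    return array_values(preg_grep($pattern, $lines));
}
```

For multi-gigabyte logs this approach would hit memory_limit long before it finished, which is why an external grep (or a database) is the better fit there.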
>
> Well, what I'm searching is sendmail logs on a central syslog for a
> cluster of sendmail servers. I think someday I should look into a
> syslog->DB gateway, but currently what I have is 4 months (rotated)
> archived syslog of about 3.5 million lines per file. Instead of having a
> CustSvc guy place an Ops request to have someone grep logs, I want to
> let them do it themselves with a web page.
>
> Sounds like you think that the php calling grep (or zgrep in this
> case) is the best solution ... and given what I've seen of php
> performance supporting tens and hundreds of thousands of connections, I
> think I agree. This is pretty low volume, but I trust *nix to balance the
> loads with grep commands running with a pipe back to php more than I
> think I trust php to do it all internally.
>

No, I think a database is the best solution. Once a day I'd parse the
logs and add them to the database. And if older entries need to be
deleted, no problem.

3.5M lines is nothing to a good RDB.
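
A rough sketch of that daily import, under several assumptions: the log lines use the classic "Mon DD HH:MM:SS host program[pid]: message" syslog layout, a table named maillog with these four columns already exists, and PDO is available (so a newer PHP than 4.3.9). The function names and the table layout are made up for illustration.

```php
<?php
// Split one syslog line into its fields; returns null if it does not match.
// Assumes the classic "Mon DD HH:MM:SS host program[pid]: message" layout.
function parse_syslog_line($line)
{
    $pattern = '/^(\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) (\S+) ([^\s:\[]+)(?:\[(\d+)\])?: (.*)$/';
    if (!preg_match($pattern, rtrim($line, "\r\n"), $m)) {
        return null;
    }
    return array(
        'logged_at' => $m[1],
        'host'      => $m[2],
        'program'   => $m[3],
        'message'   => $m[5],
    );
}

// Load one log file into the (hypothetical) maillog table in one transaction,
// reading line by line so memory use stays flat regardless of file size.
function import_log(PDO $db, $path)
{
    $insert = $db->prepare(
        'INSERT INTO maillog (logged_at, host, program, message) VALUES (?, ?, ?, ?)'
    );
    $fh = fopen($path, 'r');
    if ($fh === false) {
        return 0;
    }
    $count = 0;
    $db->beginTransaction();
    while (($line = fgets($fh)) !== false) {
        $row = parse_syslog_line($line);
        if ($row !== null) {
            $insert->execute(array_values($row));
            $count++;
        }
    }
    $db->commit();
    fclose($fh);
    return $count;
}
```

Run from cron once a day against the rotated file, the search page then becomes an indexed SELECT instead of a scan of every log.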

> The server is internal only, but I understand your comment about
> apache as root. What I did was change the maillog files to 755, and
> add the 'apache' user to the 'root' group in *nix. Works good.
>

Remember that the majority of hacks come from INSIDE - not outside.
Disgruntled employees, etc.

> Thanks for everything. Here is my current code, lemme know if you see
> anything I've done grossly wrong...
>
> <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
> <HTML>
> <HEAD>
> <TITLE>Search Mail Logs</TITLE>
> </HEAD>
> <BODY>
>
> <?php
> // since we use POST method, get global value into 'test' string
> $test=$_POST['sstr'];
>
> // check if submit button pressed and some search value entered
> if (isset($_POST['submit']) && ! empty($test) )
> {
> // set file and path to search
> $file="/var/log/" . $_POST['d_file'];
>
You didn't verify the file name. *NEVER* trust user input. It could be
something like "../../var/log/syslog".
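
One way to do that check is a whitelist, sketched below. The allowed names are simply the option values from your form; resolve_log_file() and $allowed_logs are made-up names for illustration, not anything PHP provides.

```php
<?php
// Hypothetical whitelist: the only log names the form is allowed to request.
// The names mirror the <option> values in the posted form.
$allowed_logs = array('maillog', 'maillog.1.gz', 'maillog.2.gz');

// Return the full path for a whitelisted name, or null for anything else.
function resolve_log_file($name, $allowed, $dir = '/var/log/')
{
    // Strict comparison so only an exact string match passes.
    if (!is_string($name) || !in_array($name, $allowed, true)) {
        return null;
    }
    return $dir . $name;
}

// Usage inside the form handler:
// $file = resolve_log_file($_POST['d_file'], $allowed_logs);
// if ($file === null) { print "Invalid file selection.<br>"; exit; }
```

Because the submitted value is only ever compared against known names, anything containing "../" never reaches the filesystem.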

> // check the file is readable
> if (! is_readable ($file)) {
> print "ERROR accessing file $file, please contact the
> admin.<br>"; exit;}
>
> // build the search command
> $cmdstr="zgrep -i $test $file";
> echo "<p><A href=\"$_SERVER[PHP_SELF]\">Return To The Search
> Form</A></p>\n";
> echo "Search results for $test in $file ... <br>";
> flush();
>
> // search the file, display the result
> $fp = popen($cmdstr . ' 2>&1', 'r'); // open proc pointer
> echo "<pre>";
> while ($buffer = fgets($fp, 4096))
> {
> echo("$buffer");
> }
> echo "</pre>";
> echo "<p>END</p>\n";
> pclose($fp);
> flush();
> exit;
> }
>
> ?>
>

I don't see anything else seriously wrong. But as I indicated before,
it isn't the way I'd do it.
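
That said, the search term in $cmdstr is worth tightening: $test goes to the shell exactly as typed, so a value such as "x; cat /etc/passwd" would run a second command. A minimal sketch using escapeshellarg(), with build_search_command() as a made-up helper name:

```php
<?php
// Build the zgrep command with both arguments quoted for the shell.
// escapeshellarg() wraps a value in single quotes and escapes embedded ones,
// so shell metacharacters in the search term are passed through literally.
// build_search_command() is a hypothetical helper for illustration.
function build_search_command($term, $file)
{
    return 'zgrep -i ' . escapeshellarg($term) . ' ' . escapeshellarg($file);
}

// Usage with popen(), also escaping the output for the HTML page:
// $fp = popen(build_search_command($test, $file) . ' 2>&1', 'r');
// while (($buffer = fgets($fp, 4096)) !== false) {
//     echo htmlspecialchars($buffer);
// }
// pclose($fp);
```

A search term that begins with a dash would still be read by grep as an option; this sketch does not handle that case. The htmlspecialchars() call is there because log lines often contain "<" and ">" (mail addresses, for one), which the browser would otherwise treat as markup.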
>
> <H1>Mail Log Search</h1>
> <p>
> <form action="<?php echo $_SERVER['PHP_SELF']; ?>" method="POST">
> File to search:
> <!-- yes, next step is to dynamically fill the list from the available log
> files -->
> <select name="d_file">
> <option value="maillog">current maillog</option>
> <option value="maillog.1.gz">yesterday maillog</option>
> <option value="maillog.2.gz">previous day maillog</option>
> </select>
> <br>value to search:
> <input type="text" name="sstr" size="30" maxlength="40">
> <input type="submit" name="submit" value="Search!">
> </form>
> </p>
>
> </BODY>
> </HTML>
>
>


--
==================
Remove the "x" from my email address
Jerry Stuckle
JDS Computer Training Corp.
jstucklex@attglobal.net
==================
