Posted by Phil on 11/01/07 21:26
On Nov 1, 2:02 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
 > Phil wrote:
 > > On Nov 1, 5:30 am, Jerry Stuckle <jstuck...@attglobal.net> wrote:
 > >> Phil wrote:
 > >>> On Oct 31, 9:58 pm, Jerry Stuckle <jstuck...@attglobal.net> wrote:
 > >>>> Phil wrote:
 > >>>>> I should point out that I know this has something to do with the file-
 > >>>>> size. With a small file it works OK. With a large file it fails to the
 > >>>>> browser. In this case, a large file is up to 3.5GB of uncompressed
 > >>>>> ASCII text. There may be up to 100 files to search - I can do that
 > >>>>> part I think, once I get it to actually search the file! :-) Any help
 > >>>>> GREATLY appreciated.
 > >>>>> Probably I should mention that I am not a full-time developer ... I'm
 > >>>>> quite good with shell scripts, REGEX and procedural languages ... I'm
 > >>>>> a hacker at best with PHP :-) I'm more of a systems person.
 > >>>>> On Oct 31, 5:45 pm, Phil <phillip.corch...@gmail.com> wrote:
 > >>>>>> I cannot figure why this works fine at the command-line of the linux
 > >>>>>> server, but will not output anything to the browser, and no errors in
 > >>>>>> the error_log (syslog). The goal here is to have a page to enter a
 > >>>>>> search term and grep or zgrep pattern matches in the file (will be a
 > >>>>>> log file). Probably there is a better way to do this using php, but
 > >>>>>> this is what I was able to come up with using my limited php-
 > >>>>>> knowledge. Can anyone help me debug this?
 > >>>> (Top posting fixed)
 > >>>> Well, some code would help.  But are you possibly running out of memory
 > >>>> and/or execution time? Those are the two main things which cause
 > >>>> problems with large amounts of data but not small.
 > >>>> Anything in your PHP error log?
 > >>> there is squat from logs :(
 > >>> This really seems to be an issue of file size.
 > >>> Even this simple read/echo for a file of only 9200 lines it fails, say
 > >>> nothing of my other
 > >>> files of over 2million lines.
 > >>> I have max_execution set to 3000 sec, so i don't think that is it. I
 > >>> don't see too much in
 > >>> php.ini that controls memory handling. "memory limit" is 32MB and post
 > >>> limit is 8MB (default).
 > >>> This is Apache/2.0.52 and PHP 4.3.9 by the way on a dual dual-core P4
 > >>> 3Ghz with 4GB RAM and CentOS 4.4 (32bit) with a 2.6.9-42.0.3.ELsmp
 > >>> kernel
 > >>>         $file="/var/log/httpd/access_log";
 > >>>         $handle = @fopen("$file", "r");
 > >>>         if ($handle) {
 > >>>                 echo "<pre>";
 > >>>                 while (!feof($handle)) {
 > >>>                         $buffer = fgets($handle, 2048);
 > >>>                         print_r ($buffer);
 > >>>                         ob_flush();
 > >>>                 }
 > >>>                 echo "</pre>";
 > >>>                 fclose($handle);
 > >>>         }
 > >>> Sorry for Top Post. See GP for original code snip.
 > >> OK, in your php.ini file, ensure you have
 >
 > >> error_reporting=E_ALL
 > >> display_errors=On
 >
 > >> for testing purposes.
 >
 > >> If this is a production system, I don't recommend having
 > >> display_errors=On.  Rather, I use log_errors=On and set a log file.
 >
 > >> P.S. Don't mind the trolls here.  Someone is starved for attention.
 >
 > > I already have those options set.
 > > Actually, I partly found the answer, and it was a beginners
 > > mistake ... file permissions, as the logs are root-owned and apache/
 > > php runs as apache - a little 'is_readable' helped me out there. Now I
 > > have to decide if i want to have apache run as root, or modify system
 > > log files. It is 'production' system, but it's an enterprise internal
 > > only single-purpose and with Change Control I can do anything I need
 > > for this.
 >
 > Ah, didn't think about the file not being readable. :-)
 >
 > You NEVER want to run Apache as root.  If someone hacks your site, they
 > have access to EVERYTHING.
 >
 > > I'd still be interested in quid-pro-quo on using external grep versus
 > > some internal preg_grep (and how to make that one work ... I've been
 > > stumped so far).
 >
 > > Thanks! phil
 >
 > Depends on what you're going to do.  If all you're doing is searching an
 > array, just use preg_grep().  Or, you can read a small file into an
 > array with file() and use preg_grep().
 >
 > If you need to do more than one file or a large file, then an external
 > grep is probably better.
 >
 > But if I'm searching, I generally have what I'm looking for in a database.
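
 For reference, the file() plus preg_grep() approach described above for
 small files might look roughly like the sketch below. The function name
 search_small_file() and the sample data are made up for illustration, and
 it assumes PHP 5 or later (FILE_IGNORE_NEW_LINES and file_put_contents()
 are not in PHP 4). It loads the whole file into memory, so it is not
 meant for multi-GB logs.

```php
<?php
// Hypothetical helper (not from the thread): case-insensitive search of a small file.
function search_small_file($path, $needle)
{
    // file() loads every line into an array, so this only suits small files.
    $lines = file($path, FILE_IGNORE_NEW_LINES);
    if ($lines === false) {
        return array();
    }
    // preg_quote() makes the user's text match literally rather than as a regex.
    $pattern = '/' . preg_quote($needle, '/') . '/i';
    // preg_grep() preserves the original keys, so reindex with array_values().
    return array_values(preg_grep($pattern, $lines));
}

// Usage with a throwaway sample file.
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp, "from=<a@example.com>\nstat=Sent\nfrom=<B@EXAMPLE.COM>\n");
$hits = search_small_file($tmp, 'example.com');
unlink($tmp);
print_r($hits);
```

 Quoting the search text with preg_quote() is a design choice here: it
 treats the input as a plain string. Drop it if users should be able to
 type real regular expressions.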
 
 Well, what I'm searching is sendmail logs on a central syslog for a
 cluster of sendmail servers. I think someday I should look into a
 syslog->DB gateway, but currently what I have is 4 months of (rotated)
 archived syslog, at about 3.5 million lines per file. Instead of having a
 CustSvc guy place an Ops request to have someone grep logs, I want to
 let them do it themselves with a web page.
 
 Sounds like you think that PHP calling grep (or zgrep in this
 case) is the best solution ... and given what I've seen of PHP
 performance supporting tens and hundreds of thousands of connections,
 I think I agree. This is pretty low volume, but I trust *nix to balance
 the load of grep commands piping back to PHP more than I trust PHP to
 do it all internally.
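
 For comparison, the "all internally" alternative could be sketched as a
 line-by-line scan like the one below. This is only an illustration:
 search_stream() and its $maxHits cap are hypothetical names, it assumes
 PHP 5 or later for stripos(), and it reads plain text only, so the
 rotated .gz files would still need zgrep or the zlib functions. No claim
 is made here about how its speed compares to an external grep.

```php
<?php
// Hypothetical helper: stream a file line by line and collect matching lines.
function search_stream($path, $needle, $maxHits = 1000)
{
    $hits = array();
    $fh = @fopen($path, 'r');
    if (!$fh) {
        return $hits; // unreadable or missing file: no results
    }
    // Only one line is held in memory at a time, unlike file().
    while (($line = fgets($fh)) !== false) {
        // stripos() is a case-insensitive substring test, similar to grep -i -F.
        if (stripos($line, $needle) !== false) {
            $hits[] = rtrim($line, "\r\n");
            if (count($hits) >= $maxHits) {
                break; // cap the output so one broad search cannot flood the page
            }
        }
    }
    fclose($fh);
    return $hits;
}

// Usage with a throwaway sample file.
$tmp = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($tmp, "to=<joe@example.com>, stat=Sent\nrelay=local\nTO=<JOE@EXAMPLE.COM>\n");
$found = search_stream($tmp, 'joe@example.com');
unlink($tmp);
print_r($found);
```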
 
 The server is internal only, but I understand your comment about
 apache as root. What I did was change the maillog files to 755 and
 add the 'apache' user to the 'root' group in *nix. Works well.
 
 Thanks for everything. Here is my current code, lemme know if you see
 anything I've done grossly wrong...
 
 <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
 <HTML>
 <HEAD>
 <TITLE>Search Mail Logs</TITLE>
 </HEAD>
 <BODY>
 
 <?php
 // since we use POST method, get the search value into 'test' (empty if not posted)
 $test = isset($_POST['sstr']) ? trim($_POST['sstr']) : '';
 
 // check if submit button pressed and some search value entered
 if (isset($_POST['submit']) && ! empty($test) )
 {
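 // set file and path to search; basename() keeps the request inside /var/log
 $d_file = isset($_POST['d_file']) ? basename($_POST['d_file']) : '';
 $file="/var/log/" . $d_file;

 // check that a file was chosen and that it is readable
 if ($d_file == '' || ! is_readable ($file)) {
 print "ERROR accessing file " . htmlspecialchars($file) . ", please contact the admin.<br>";
 exit;}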
 
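 // build the search command; escapeshellarg() stops the shell from
 // interpreting the user's input, and -e marks the next argument as the pattern
 $cmdstr="zgrep -i -e " . escapeshellarg($test) . " " . escapeshellarg($file);
 echo "<p><A href=\"" . htmlspecialchars($_SERVER['PHP_SELF']) . "\">Return To The Search Form</A></p>\n";
 echo "Search results for " . htmlspecialchars($test) . " in " . htmlspecialchars($file) . " ... <br>";
 flush();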
 
 // search the file, display the result
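 $fp = popen($cmdstr . ' 2>&1', 'r');  // open proc pointer
 if (! $fp) {
 print "ERROR running the search, please contact the admin.<br>"; exit;}
 echo "<pre>";
 // compare against false so a line containing only "0" does not end the loop
 while (($buffer = fgets($fp, 4096)) !== false)
 {
 // escape log text so characters like < and & display literally
 echo htmlspecialchars($buffer);
 }
 echo "</pre>";
 echo "<p>END</p>\n";
 pclose($fp);
 flush();
 exit;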
 }
 
 ?>
 
 
 <H1>Mail Log Search</H1>
 <p>
 <form action="<?php echo htmlspecialchars($_SERVER['PHP_SELF']); ?>" method="POST">
 File to search:
 <!-- yes, next step is to dynamically fill the list from the available log files -->
 <select name="d_file">
 <option value="maillog">current maillog</option>
 <option value="maillog.1.gz">yesterday maillog</option>
 <option value="maillog.2.gz">previous day maillog</option>
 </select>
 <br>value to search:
 <input type="text" name="sstr" size="30" maxlength="40">
 <input type="submit" name="submit" value="Search!">
 </form>
 </p>
 
 </BODY>
 </HTML>
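
 The "next step" noted in the form (filling the select list from the
 log files that actually exist) might be sketched as follows. This is a
 rough illustration, not tested against the real server: maillog_options()
 is a made-up name, the maillog* naming pattern is taken from the option
 values in the form, and the demo runs against a temporary directory
 rather than /var/log.

```php
<?php
// Hypothetical helper: build <option> tags for every readable maillog* file in $dir.
function maillog_options($dir)
{
    $files = glob(rtrim($dir, '/') . '/maillog*');
    if ($files === false) {
        $files = array();
    }
    // Natural sort so maillog.2.gz is listed before maillog.10.gz.
    natsort($files);
    $html = '';
    foreach ($files as $path) {
        if (!is_readable($path)) {
            continue; // skip files the web server user cannot open
        }
        $name = htmlspecialchars(basename($path));
        $html .= "<option value=\"$name\">$name</option>\n";
    }
    return $html;
}

// Demo against a throwaway directory standing in for /var/log.
$dir = sys_get_temp_dir() . '/maillog_demo_' . uniqid();
mkdir($dir);
foreach (array('maillog', 'maillog.1.gz', 'maillog.2.gz', 'maillog.10.gz') as $f) {
    touch("$dir/$f");
}
$options = maillog_options($dir);
echo $options;

// Clean up the demo files.
array_map('unlink', glob("$dir/*"));
rmdir($dir);
```

 In the form, the three hard-coded option lines would then be replaced
 by something like <?php echo maillog_options('/var/log'); ?> between the
 select tags.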