Re: processing raw logs faster

Posted by Colin McKinnon on 10/10/67 11:25

Mladen Gogala wrote:

> On Wed, 31 Aug 2005 17:27:53 -0700, changereality@gmail.com wrote:
>
>>
>> What would you suggest?
>
> You can get a decent database from http://www.postgresql.org. It's kinda
> better suited for heavy OLTP processing than MySQL. Not as good as Oracle
> RDBMS, but definitely getting there.
>

No - he wanted it to go faster.

changereality@gmail.com wrote:

> $list = array();
> $buffer = fgets($handle, 20000);
>
> if (! preg_match("/^\s*?#/", $buffer) ){

Here's your first problem. Regexes are slow compared with plain string
functions. My Perl REs are a bit rusty, but that pattern looks a bit suspect
anyway (the lazy *? buys nothing in front of a literal #). Try coding it
without REs.
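
A rough sketch of the regex-free check, assuming the aim is simply to skip lines whose first non-blank character is '#'. The function name is_comment_line is made up for illustration, and ltrim() strips a slightly different set of characters than \s, so treat it as an approximation of the original test:

```php
<?php
// Hypothetical helper: true when the first non-whitespace character is '#'.
function is_comment_line(string $buffer): bool
{
    // ltrim() drops leading spaces, tabs and newlines; strncmp() then
    // compares only the first byte, so an empty line is never a comment.
    return strncmp(ltrim($buffer), '#', 1) === 0;
}

// Sample inputs: a directive line, an indented comment, a data line, an empty line.
$results = array(
    is_comment_line("#Fields: date time"),
    is_comment_line("  \t# comment"),
    is_comment_line("2005-08-31 17:27:53 GET /index.php"),
    is_comment_line(""),
);
```

In the loop this would replace the preg_match() call: if (!is_comment_line($buffer)) { ... }.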

> $stmt = "INSERT INTO logs ( `hit_date` , `hit_time` , `s-sitename` ,
> `s-computername` , ".
> "`s-ip` , `cs-method` , `cs-uri-stem` , `cs-uri-query` ,
> `s-port` , `cs-username` , `c-ip` , ".
> "`cs-version` , `User-Agent` , `Cookie` , `Referer` ,
> `cs-host` , `sc-status` , `sc-substatus` , ".
> "`sc-win32-status` , `sc-bytes` , `cs-bytes` , `time_taken` )
> ".

Join the strings into one literal. OK, it doesn't help the readability, but
you will get some performance benefit because PHP no longer concatenates
them on every pass. Actually it would be a lot better to move the invariant
parts outside the loop:

$stub="INSERT INTO logs....VALUES(";
while (!feof($handle)) {
....
$stmt=$stub . "'".$line[0]."', '".$line[1]."', '".$line[2]."',

You could try a more functional approach to generating the VALUES clause -
something like:

$stmt = $stub . "'" . implode("','",$line) . "')";
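
Here is a minimal sketch of that idea with the escaping the one-liner leaves out. The helper name build_insert, the three column names and the sample row are assumptions for illustration only; addslashes stands in for whatever escaping function your database driver provides (for example mysqli_real_escape_string):

```php
<?php
// Hypothetical helper: builds one INSERT from a constant stub and a row of fields.
function build_insert(string $stub, array $fields, callable $escape): string
{
    // Escape each field, then join with ',' between the closing and opening quotes.
    $escaped = array_map($escape, $fields);
    return $stub . "'" . implode("','", $escaped) . "')";
}

// The invariant part is built once, outside the read loop.
$stub = "INSERT INTO logs (hit_date, hit_time, c_ip) VALUES(";

// Illustrative row; in the real script this would come from splitting $buffer.
$line = array('2005-08-31', '17:27:53', '10.0.0.1');

// addslashes is only a placeholder for the driver's own escaper.
$stmt = build_insert($stub, $line, 'addslashes');
```

Passing the escaper in as a callable keeps the sketch free of a live database connection; with a real connection you would wrap the driver's function in a closure instead.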

This could be more efficient:

> if( $linecnt >= 10000 ){
> $totalcnt += $linecnt;
> echo "[ ".$totalcnt." ( ". ( time() - $start_time) ." )
> ]\t";
> $linecnt = 0;
> }

Instead, let $linecnt keep counting (no reset, no second counter) and test
it with a modulus:

if (!($linecnt % 10000)) {
echo "[ ".$linecnt." ( ". ( time() - $start_time) ." ) ]\t";
}
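
A small self-contained sketch of the modulus test, assuming the counter is incremented before the check so that zero never triggers a report. The function name progress_due and the 25000-line loop are invented for the demonstration:

```php
<?php
// Hypothetical helper: true on every $interval-th line, never on line 0.
function progress_due(int $linecnt, int $interval = 10000): bool
{
    return $linecnt > 0 && $linecnt % $interval === 0;
}

// Simulate reading 25000 lines and record where a progress line would print.
$reports = array();
for ($linecnt = 1; $linecnt <= 25000; $linecnt++) {
    if (progress_due($linecnt)) {
        $reports[] = $linecnt;
    }
}
```

With a single running counter there is nothing to reset, and $linecnt itself is the total to print.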

You should also get a boost by using INSERT DELAYED (assuming your DBMS and
table type support it: it is a MySQL extension and works with MyISAM tables,
not InnoDB):

$stub="INSERT DELAYED INTO logs....VALUES(";

HTH

C.

 
