Re: Optimizing a string manipulation script

Posted by Cleverbum on 06/12/06 08:43

Alan Little wrote:
> Carved in mystic runes upon the very living rock, the last words of
> <Cleverbum@hotmail.com> of comp.lang.php make plain:
>
> > Alan Little wrote:
> >
> >> Carved in mystic runes upon the very living rock, the last words of
> >> <Cleverbum@hotmail.com> of comp.lang.php make plain:
> >>
> >> > I'm not really accustomed to string manipulation and so I was
> >> > wondering if any of you could be any help in speeding up this script
> >> > intended to change the format of some saved log information into a
> >> > CSV file while removing duplicate records.
> >> > The main problem is that the script currently takes about 20
> >> > seconds to execute, and were it to take much longer it would time
> >> > out.
> >> >
> >> > Below is the script itself, and then some example lines from the
> >> > log file it processes:
> >> >
> >> > <?
> >> > [snip]
> >> > ?>
> >>
> >> Whew!
> >>
> >> How about an example of the output you're trying to achieve? That
> >> might be easier.
> >
> > Here we go then:
>
> Try this:
>
> <?php
> // Pattern for one log line; each parenthesised group becomes a CSV field.
> $patt =
> '!([^:]+:) \[([^:]+:\d\d:\d\d:\d\d) [+-](\d{4})\] '.
> '(\d+\.\d+\.\d+\.\d+) (-) (-) "(\w+) (/[^ ]*) '.
> '(HTTP/\d\.\d)" (\d+) (\d+) "([^"]+)" "([^"]+)"'.
> "\n?".'!';
>
> $log = fopen('log.csv', 'a');
>
> $logfile = file_get_contents('logs.txt');
> // Normalise \r\n and bare \r line endings to \n.
> $logfile = preg_replace("/\r\n?/", "\n", $logfile);
>
> // Collect every matching log line, one set of groups per line.
> preg_match_all($patt, $logfile, $matches, PREG_SET_ORDER);
>
> foreach($matches as $match) {
>     unset($match[0]); // drop the full match, keep only the groups
>     $logline = implode(',', $match);
>     fputs($log, $logline."\n");
> }
>
> fclose($log);
> ?>
>
> I don't know what those two blank log elements are after the IP, so this
> pattern will only work when they're blank.
>
> --
> Alan Little
> Phorm PHP Form Processor
> http://www.phorm.com/

They do seem to stay blank for the entire log, so that should be
fine. I noticed, though, that your version produced smaller files than
mine, and on closer inspection some log lines sent it a little insane.
It seems to have problems when no file size is sent (the size field is
"-" rather than a number), for example on lines with errors or a 304
status. The following log lines are the ones causing problems:

jpgme.co.uk: [26/May/2006:10:12:38 +0100] 130.88.199.23 - - "GET
/addthumbs.php HTTP/1.1" 200 37 "-" "Mozilla/5.0 (X11; U; Linux i686;
en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:14:05 +0100] 130.88.199.23 - - "GET
/bulk.php HTTP/1.1" 200 42792 "-" "Mozilla/5.0 (X11; U; Linux i686;
en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:14:23 +0100] 130.88.199.23 - - "GET
/add.php?folder=./All_work/Flowers HTTP/1.1" 200 21384
"http://www.martinsphotos.co.uk/bulk.php" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:16:54 +0100] 130.88.199.23 - - "GET
/add.php?folder=./All_work/Flowers HTTP/1.1" 200 18309
"http://www.martinsphotos.co.uk/bulk.php" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:16:54 +0100] 130.88.199.23 - - "GET
/images/folder2.png HTTP/1.1" 304 -
"http://www.martinsphotos.co.uk/add.php?folder=./All_work/Flowers"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:16:59 +0100] 130.88.199.23 - - "POST
/do_add.php HTTP/1.1" 200 12746
"http://www.martinsphotos.co.uk/add.php?folder=./All_work/Flowers"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:16:59 +0100] 130.88.199.23 - - "GET
/addone.php?alb=61&folder=./All_work/Flowers&file=DSC04827.jpg
HTTP/1.1" 200 23231 "http://www.martinsphotos.co.uk/do_add.php"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:16:59 +0100] 130.88.199.23 - - "GET
/addone.php?alb=61&folder=./All_work/Flowers&file=DSC04822_1.JPG
HTTP/1.1" 200 23231 "http://www.martinsphotos.co.uk/do_add.php"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:22:52 +0100] 130.88.199.23 - - "GET
/addone.php?alb=61&folder=./All_work/Flowers&file=normal_DSC04962.jpg
HTTP/1.1" 200 23231 "http://www.martinsphotos.co.uk/do_add.php"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:22:52 +0100] 130.88.199.23 - - "GET
/addone.php?alb=61&folder=./All_work/Flowers&file=normal_DSC04980.jpg
HTTP/1.1" 200 23231 "http://www.martinsphotos.co.uk/do_add.php"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:22:52 +0100] 130.88.199.23 - - "GET
/addone.php?alb=61&folder=./All_work/Flowers&file=normal_DSC05000.jpg
HTTP/1.1" 200 20060 "http://www.martinsphotos.co.uk/do_add.php"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:23:00 +0100] 130.88.199.23 - - "GET
/add.php?folder=./All_work/Flowers HTTP/1.1" 200 22952
"http://www.martinsphotos.co.uk/bulk.php" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:23:00 +0100] 130.88.199.23 - - "GET
/images/folder2.png HTTP/1.1" 304 -
"http://www.martinsphotos.co.uk/add.php?folder=./All_work/Flowers"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:25:50 +0100] 130.88.199.23 - - "GET
/add.php?folder=./All_work/Flowers HTTP/1.1" 200 18309
"http://www.martinsphotos.co.uk/bulk.php" "Mozilla/5.0 (X11; U; Linux
i686; en-US; rv:1.7.12) Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"
jpgme.co.uk: [26/May/2006:10:25:50 +0100] 130.88.199.23 - - "GET
/images/folder2.png HTTP/1.1" 304 -
"http://www.martinsphotos.co.uk/add.php?folder=./All_work/Flowers"
"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20060210
Fedora/1.7.12-1.3.3.legacy"
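
One possible way around this, as a rough sketch rather than a tested fix: let the size group accept either digits or a lone "-", so the 304 lines match as well. The pattern below is the quoted one with only that group changed to (\d+|-). The helper name parse_log() and the $sample string are made up for illustration, and the sample is one of the 304 lines above joined back onto a single line, on the assumption that the real log is not wrapped the way this post is.

```php
<?php
// Same pattern as in the quoted script, except the size field is
// (\d+|-) so that a missing size ("-") no longer breaks the match.
$patt =
    '!([^:]+:) \[([^:]+:\d\d:\d\d:\d\d) [+-](\d{4})\] ' .
    '(\d+\.\d+\.\d+\.\d+) (-) (-) "(\w+) (/[^ ]*) ' .
    '(HTTP/\d\.\d)" (\d+) (\d+|-) "([^"]+)" "([^"]+)"' .
    "\n?" . '!';

// Hypothetical helper: returns one array of captured fields per log line.
function parse_log(string $text, string $patt): array
{
    // Normalise \r\n and bare \r line endings to \n.
    $text = preg_replace("/\r\n?/", "\n", $text);

    preg_match_all($patt, $text, $matches, PREG_SET_ORDER);

    $rows = [];
    foreach ($matches as $match) {
        array_shift($match); // drop the full match, keep only the groups
        $rows[] = $match;
    }
    return $rows;
}

// One of the 304 lines from above, unwrapped (assumed real layout).
$sample =
    'jpgme.co.uk: [26/May/2006:10:16:54 +0100] 130.88.199.23 - - ' .
    '"GET /images/folder2.png HTTP/1.1" 304 - ' .
    '"http://www.martinsphotos.co.uk/add.php?folder=./All_work/Flowers" ' .
    '"Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) ' .
    'Gecko/20060210 Fedora/1.7.12-1.3.3.legacy"' . "\n";

$rows = parse_log($sample, $patt);

// Print the line as comma-separated fields, as the quoted script does.
echo implode(',', $rows[0]), "\n";
```

If any field can itself contain a comma, writing each row with fputcsv() instead of implode() would quote it properly; whether that matters depends on what ends up reading the CSV.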
