Posted by Steve on 02/12/06 20:43
> what is the input size limit using preg_match_all? i've got a preg pattern
> that i've been successfully using on a file for some time now. the file size
> got up to a little over 6mb and the preg now just returns an empty string.
> i've got php's memory limit set to 100 mb and execution time out set to 0. i
> also have errors and warnings turned on (i even turned on notices too). the
> script runs without raising errors of any kind and almost instantly returns
> from the preg call when using a large input string...whereas a < 6mb input
> string takes some time to return - as preg is actually doing some
> processing.
Excerpted from the PCRE documentation <http://www.pcre.org/pcre.txt>...
<snip>
LIMITATIONS

There are some size limitations in PCRE but it is hoped that they will
never in practice be relevant.

The maximum length of a compiled pattern is 65539 (sic) bytes if PCRE
is compiled with the default internal linkage size of 2. If you want to
process regular expressions that are truly enormous, you can compile
PCRE with an internal linkage size of 3 or 4 (see the README file in
the source distribution and the pcrebuild documentation for details).
In these cases the limit is substantially larger. However, the speed
of execution will be slower.

All values in repeating quantifiers must be less than 65536. The
maximum number of capturing subpatterns is 65535.

There is no limit to the number of non-capturing subpatterns, but the
maximum depth of nesting of all kinds of parenthesized subpattern,
including capturing subpatterns, assertions, and other types of
subpattern, is 200.

The maximum length of a subject string is the largest positive number
that an integer variable can hold. However, when using the traditional
matching function, PCRE uses recursion to handle subpatterns and
indefinite repetition. This means that the available stack space may
limit the size of a subject string that can be processed by certain
patterns.
</snip>
Short answer: it depends on the pattern you are matching. If you can,
break it down into a number of simpler patterns to see if you can work
around the limitation (if that is indeed the problem).
You are using preg_match_all(), which tries to collect every match in
one gulp. You could make successive calls to preg_match() instead, to
see if you are overflowing the number of matches that can be handled at
once.
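For what it's worth, here is a rough sketch of the successive
preg_match() idea. The function name match_incrementally() is made up
for illustration, and the code assumes a reasonably recent PHP (type
hints, and preg_last_error(), which older versions do not have). It
walks the subject one match at a time using the offset argument and
raises an exception when preg_match() reports a failure rather than
"no match". Treat it as a starting point, not a drop-in fix.

```php
<?php
// Hypothetical helper: collect matches one at a time with preg_match()
// instead of asking preg_match_all() for everything in a single call.
function match_incrementally(string $pattern, string $subject): array
{
    $matches = [];
    $offset  = 0;
    $length  = strlen($subject);

    while ($offset <= $length) {
        $result = preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE, $offset);

        if ($result === false) {
            // false means the match failed with an error, not "no match".
            throw new RuntimeException('PCRE error code: ' . preg_last_error());
        }
        if ($result === 0) {
            break; // no further matches
        }

        // With PREG_OFFSET_CAPTURE each entry is [matched text, byte offset].
        [$text, $pos] = $m[0];
        $matches[] = $text;

        // Continue after this match; step one byte past an empty match
        // so the loop cannot stall at the same position.
        $offset = $pos + max(strlen($text), 1);
    }

    return $matches;
}

$found = match_incrementally('/\d+/', 'a1 b22 c333');
```

If the loop gets through the large file where preg_match_all() came
back empty, that points at the all-at-once collection; if it throws,
the error code should at least say which limit was hit. Note that the
one-byte step after an empty match is only safe for byte-oriented
patterns (no /u modifier).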
---
Steve