Posted by Toby A Inkster on 02/12/07 13:30
cjl wrote:
> I'm wondering if there is a simpler approach...after all, I want the
> user input to be valid php, I just want to limit what they can type to
> a few functions I write ( circle(), line(), etc..) and a few control
> structures.
If the user input is to be valid PHP, the "obvious" solution is eval(),
but this will totally destroy your security. You could use regular
expressions to check for "naughty" functions (like SQL queries, file
system manipulation, TCP sockets, etc), but then you end up:
  (a) playing catch-up with the features of PHP itself. As new
      functions are added to the language, you'll need to evaluate
      how naughty they are, and add them to the block list.
  (b) naively blocking more innocent PHP like:
          print "fopen";
(It's worth mentioning that your block list will also need to include
"naughty" functions from your own code and from any third-party libraries
you use.)
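To make problem (b) concrete, here is a minimal sketch of such a regex
check. The block list and the helper name looks_naughty() are made up for
illustration; a real list would be far longer. The last call shows a
further weakness that I believe applies to any purely textual check: a
function name assembled at run time never appears in the source at all.

```php
<?php
// Hypothetical block list, for illustration only.
$blocked = ['fopen', 'unlink', 'mysql_query', 'fsockopen'];

// Build one case-insensitive pattern that matches any blocked name as a whole word.
$pattern = '/\b(' . implode('|', array_map('preg_quote', $blocked)) . ')\b/i';

// Returns true when the source text mentions a blocked name anywhere.
function looks_naughty(string $source, string $pattern): bool
{
    return preg_match($pattern, $source) === 1;
}

// A genuine call is caught.
var_dump(looks_naughty('$h = fopen("/etc/passwd", "r");', $pattern));

// Harmless code is blocked too, because the name appears inside a string.
var_dump(looks_naughty('print "fopen";', $pattern));

// A name built at run time is not matched by the pattern at all.
var_dump(looks_naughty('$f = "fo" . "pen"; $f("/etc/passwd", "r");', $pattern));
```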
> Maybe I could create an object which includes member functions which
> override all native php functions, and have the user input actually be
> calls to that objects methods, and only pass through the ones that I
> want to allow?
Aye -- that is indeed the dream: the ability to have an eval() function
that works within a single object, such that any function calls are
silently re-written to "$this->function()", any globals to
"$this->variable" and any constants to "self::CONSTANT".
Although PHP doesn't have such a "safe eval" function built in, it
shouldn't be too difficult to build one. As your language follows PHP
syntax rules, you can use PHP's built-in tokeniser:
    $tokens = token_get_all($source);
Then loop through that list, doing three things: find every token of type
T_VARIABLE and re-point it at an object member; find every T_STRING (which,
despite the name, is a "non-$ identifier" token, so could be either a
function call or a constant) and heuristically re-point it at a class
constant or an object method (e.g. UPPERCASE is assumed to be a constant;
MixedOr_lower_case is assumed to be a method); and finally find every
T_EVAL and replace it with T_ECHO. You would then need to loop through the
token list again and re-assemble it as source code before passing it
through an eval() function wrapper within the object.
It sounds complicated, but it is simpler than implementing your own real
parser and interpreter, and could probably be done in fewer than 50 lines
of code. I wouldn't be happy running it on a production system though
without substantial hack-testing!
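A rough sketch of that token-rewriting idea follows. Everything named here
(the Sandbox class, run(), rewrite(), circle(), SCALE, and the "v_" prefix
on variables) is invented for illustration, and it has had none of the
hack-testing mentioned above, so treat it as a starting point only. It
assumes PHP 7.1 or later. Two details are my own additions: token_get_all()
only tokenises PHP after an opening tag, so one is prepended and then
dropped; and namespaced names such as \fopen are rejected outright, because
PHP 8 reports them as single T_NAME_* tokens rather than T_STRING.

```php
<?php
// Hypothetical sandbox: user code is rewritten so that every call, variable
// and constant resolves against this object, then passed to eval().
class Sandbox
{
    const SCALE = 2;

    public $log = [];       // records what the allowed functions were asked to draw
    private $store = [];    // user variables live here, reached via __get/__set

    public function __get($name) { return $this->store[$name] ?? null; }
    public function __set($name, $value) { $this->store[$name] = $value; }

    // One allowed function; a real drawing routine would go here.
    public function circle($x, $y, $r) { $this->log[] = "circle($x, $y, $r)"; }

    // Anything that is not a method of this class ends up here and is refused.
    public function __call($name, $args)
    {
        throw new BadMethodCallException("Function not allowed: $name");
    }

    public function rewrite(string $source): string
    {
        $out = '';
        $prev = null; // previous non-whitespace token (id or single character)
        foreach (token_get_all('<?php ' . $source) as $token) {
            if (!is_array($token)) {   // single-character tokens such as ; ( )
                $out .= $token;
                $prev = $token;
                continue;
            }
            [$id, $text] = $token;
            if ($id === T_NS_SEPARATOR || strpos(token_name($id), 'T_NAME_') === 0) {
                throw new InvalidArgumentException('Namespaced names are not allowed');
            }
            $isMember = ($prev === T_OBJECT_OPERATOR || $prev === T_DOUBLE_COLON);
            $isLiteral = in_array(strtolower($text), ['true', 'false', 'null'], true);
            if ($id === T_OPEN_TAG) {
                $text = '';                                  // drop the tag we prepended
            } elseif ($id === T_VARIABLE) {
                $text = '$this->v_' . substr($text, 1);      // $r becomes $this->v_r
            } elseif ($id === T_STRING && !$isMember && !$isLiteral) {
                // Heuristic from the text: UPPERCASE is a constant, anything else a method.
                $text = preg_match('/^[A-Z][A-Z0-9_]*$/', $text)
                    ? "self::$text"
                    : "\$this->$text";
            } elseif ($id === T_EVAL) {
                $text = 'echo';                              // defuse nested eval()
            }
            $out .= $text;
            if ($id !== T_WHITESPACE) {
                $prev = $id;
            }
        }
        return $out;
    }

    // The eval() wrapper: the rewritten code runs in this object's scope.
    public function run(string $source)
    {
        return eval($this->rewrite($source));
    }
}

$box = new Sandbox();
echo $box->rewrite('$r = 5 * SCALE; circle(10, 20, $r);'), "\n";
$box->run('$r = 5 * SCALE; circle(10, 20, $r);');
print_r($box->log);
```

With this in place a call such as fopen("/etc/passwd", "r") is rewritten to
a method call on the object and rejected by __call(), while print "fopen";
passes through untouched because the name is only inside a string token. The
sketch does nothing about backticks, include/require and the like, so those
would still need handling before it could be trusted.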
> As far as the approach you are suggesting, some googling showed:
> http://greg.chiaraquartet.net/archives/138-PHP_ParserGenerator-and-PHP_LexerGenerator.html
> Which maybe can help me?
Quite possibly -- it does look quite good. If I'd known of its existence
when I started my scripting language, I might not have attempted to write
a scripting language. But I certainly learnt a lot -- especially about OO
PHP -- from doing so, so I don't regret it.
> That is the single best response ever given to a newsgroup post. Thank
> you.
No problem -- I'd guessed that nobody else in this group had been crazy
enough to attempt a scripting language parser and interpreter in PHP, so
if I didn't help you, nobody would!
--
Toby A Inkster BSc (Hons) ARCS
Contact Me ~ http://tobyinkster.co.uk/contact
Geek of ~ HTML/SQL/Perl/PHP/Python*/Apache/Linux
* = I'm getting there!