Posted by Toby A Inkster on 10/21/43 12:00
elmosik wrote:
> Toby A Inkster wrote:
>
>> Regular expressions are no good for any non-trivial parsing task. You
>> need a stateful parser.
>
> Any examples of stateful parser?
I've written a small stateful parser that parses a simple programming
language (called "TrivialEncoderScript") which is used to script the
encryption of some input text to some output text.
The idea of TrivialEncoderScript is that you use it to specify a chain of
encryption techniques: for example, the input text might be encrypted using
Triple-DES with a passphrase "hello", then Base64-encoded to make it
ASCII-safe, and then Rot13-encoded just for fun. In TrivialEncoderScript, you
could express that process like this:
	tripledes "hello";
	base64;
	rot13;
To decode the message, you'd use TrivialEncoder in decryption mode, with
the same script (you don't need to reverse the order of the operations --
the decoder knows to do this by itself).
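
As a rough illustration of that "same script, reversed automatically" idea,
here is a minimal Python sketch. It is not TrivialEncoder's actual code: the
STEPS table and run_chain function are names invented for this example, and
only the two steps that the Python standard library can do directly (base64
and rot13) are included.

```python
import base64
import codecs

# Hypothetical registry: step name -> (encode function, decode function).
# Not taken from TrivialEncoder; just enough to show the principle.
STEPS = {
    "base64": (
        lambda s: base64.b64encode(s.encode("utf-8")).decode("ascii"),
        lambda s: base64.b64decode(s.encode("ascii")).decode("utf-8"),
    ),
    "rot13": (
        lambda s: codecs.encode(s, "rot_13"),
        lambda s: codecs.decode(s, "rot_13"),
    ),
}


def run_chain(names, text, decrypt=False):
    """Apply the named steps in order, or their inverses in reverse order."""
    order = reversed(names) if decrypt else names
    index = 1 if decrypt else 0
    for name in order:
        text = STEPS[name][index](text)
    return text


script = ["base64", "rot13"]          # same list for both directions
secret = run_chain(script, "hello world")
assert run_chain(script, secret, decrypt=True) == "hello world"
```

The caller never writes the reversed script; the decrypt flag walks the same
list backwards and picks the inverse of each step.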
As it doesn't have any control structures, this might seem like quite an
easy scripting language, and something that can be handled by regular
expressions. A level of complexity is added, though, by the "multi"
encoding technique, which allows the encryption to branch in multiple
directions. Branches can be nested, and quoted passphrases may themselves
contain semicolons, parentheses and escaped quotes, so you might end up
with a script like this:
	multi
		(tripledes "la;la")
		(blowfish "la)la"; memfrob)
		(multi
			(rot47; rijndael512 "la(la")
			(cast256)
		)
		(rc2 "la\"la")
	;
	hex;
	morse;
The output of such a script would just look like Morse code, but in
practice, decoding it without knowing the script that produced it would
require cracking several of the world's best encryption algorithms!
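
To show what "stateful" means for a script like the one above, here is a
minimal Python sketch of a tokenizer plus recursive-descent parser. It is
not the TE_Parser class from TrivialEncoder, and the grammar is my guess
from the examples in this post: steps are separated by ";", a step is a name
with an optional quoted argument, and "multi" is followed by parenthesised
branches. The names tokenize, parse_steps and parse are made up for this
example.

```python
def tokenize(src):
    """Split src into (kind, value) tokens, tracking string/escape state."""
    tokens, i, n = [], 0, len(src)
    while i < n:
        ch = src[i]
        if ch.isspace():
            i += 1
        elif ch in "();":
            tokens.append((ch, ch))
            i += 1
        elif ch == '"':
            i += 1
            buf = []
            while i < n and src[i] != '"':
                if src[i] == "\\" and i + 1 < n:
                    i += 1              # drop the backslash, keep next char
                buf.append(src[i])
                i += 1
            if i >= n:
                raise SyntaxError("unterminated string")
            i += 1                      # consume the closing quote
            tokens.append(("str", "".join(buf)))
        else:
            start = i
            while i < n and not src[i].isspace() and src[i] not in '();"':
                i += 1
            tokens.append(("word", src[start:i]))
    return tokens


def parse_steps(tokens, pos, closer):
    """Parse steps until closer: ')' for a branch, None for end of input."""
    steps = []
    while pos < len(tokens) and tokens[pos][0] != ")":
        kind, name = tokens[pos]
        if kind != "word":
            raise SyntaxError(f"expected a step name, got {name!r}")
        pos += 1
        if name == "multi":
            branches = []
            while pos < len(tokens) and tokens[pos][0] == "(":
                branch, pos = parse_steps(tokens, pos + 1, ")")
                branches.append(branch)
            steps.append(("multi", branches))
        else:
            arg = None
            if pos < len(tokens) and tokens[pos][0] == "str":
                arg = tokens[pos][1]
                pos += 1
            steps.append((name, arg))
        if pos < len(tokens) and tokens[pos][0] == ";":
            pos += 1
        elif pos < len(tokens) and tokens[pos][0] != ")":
            raise SyntaxError("expected ';'")
    if closer == ")":
        if pos >= len(tokens):
            raise SyntaxError("missing ')'")
        pos += 1                        # consume the ')'
    elif pos < len(tokens):
        raise SyntaxError("unexpected ')'")
    return steps, pos


def parse(src):
    steps, _ = parse_steps(tokenize(src), 0, None)
    return steps


SCRIPT = r'''
multi
    (tripledes "la;la")
    (blowfish "la)la"; memfrob)
    (multi (rot47; rijndael512 "la(la") (cast256))
    (rc2 "la\"la")
;
hex;
morse;
'''
tree = parse(SCRIPT)
assert tree[0][1][0] == [("tripledes", "la;la")]
assert tree[1:] == [("hex", None), ("morse", None)]
```

The state lives in two places: the tokenizer remembers whether it is inside
a quoted string (so the ";" and ")" in the passphrases are not treated as
syntax), and the parser's call stack remembers how deeply the branches are
nested. Splitting on ";" with a single regular expression loses both.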
Anyway, TrivialEncoder/0.2 is here:
http://tobyinkster.co.uk/blog/2007/08/19/trivial-encoder/
The parser for TEScript is the class TE_Parser in file TE_Machine.class.php.
(Actually, the complicated TEScript above does not work in the current
version. Nothing wrong with the parser -- just a bug in the implementation
of "multi".)
--
Toby A Inkster BSc (Hons) ARCS
[Geek of HTML/SQL/Perl/PHP/Python/Apache/Linux]
[OS: Linux 2.6.17.14-mm-desktop-9mdvsmp, up 18 days, 2:39.]
Gnocchi all'Amatriciana al Forno
http://tobyinkster.co.uk/blog/2008/01/15/gnocchi-allamatriciana/