Design compilers and languages without sweating the *wrong* details.

The Compiler Construction Toolkit is a compiler design prototyping suite. The constituent tools aid in building compiler components and learning about compilers.

This will primarily interest students and hobbyists. Students can verify homework solutions or check assignments for correctness. (If the tools choke, the instructor may have made a mistake, or may be testing you with a left-recursive grammar, bad input, etc.) Hobbyists can skip unnecessary theory and just crank out lexers or parsers from simple specs, as though using Lex/Yacc/Bison, but with a shallower learning curve and from the convenience of their browsers.

The toolkit provides a couple of learning aids, a web-based lexical scanner generator, and some parser construction tools. Here's the rundown in slightly more depth.

The NFA to DFA converter allows you to verify (or skip) manual determinization of non-deterministic finite automata. This uses the subset construction algorithm (Sipser p. 55; Dragon Book 2E p. 153). It's relevant to computation and complexity theory courses, too.
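For intuition, the subset construction fits in a few lines of Ruby (the language the toolkit's scanner output uses). The NFA encoding below — `{state => {symbol => [states]}}` with `:eps` for ε-moves — is my own assumption for illustration, not the tool's actual format.

```ruby
require 'set'

# ε-closure: all states reachable from `states` via :eps moves alone.
def eps_closure(nfa, states)
  closure = Set.new(states)
  stack = states.to_a
  until stack.empty?
    s = stack.pop
    nfa.fetch(s, {}).fetch(:eps, []).each do |t|
      stack << t if closure.add?(t)
    end
  end
  closure
end

# Subset construction: each DFA state is a set of NFA states.
def subset_construction(nfa, start, alphabet)
  dfa = {}
  work = [eps_closure(nfa, [start])]
  until work.empty?
    current = work.pop
    next if dfa.key?(current)
    dfa[current] = {}
    alphabet.each do |sym|
      moved = current.flat_map { |s| nfa.fetch(s, {}).fetch(sym, []) }
      nxt = eps_closure(nfa, moved)
      next if nxt.empty?
      dfa[current][sym] = nxt
      work << nxt
    end
  end
  dfa
end
```

Each DFA state is literally a `Set` of NFA states, which is also how the tool labels the nodes in its DFA graphs.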

The regular expression to finite automata tool runs the Thompson-McNaughton-Yamada construction algorithm described in the Dragon Book 2E (p. 159).
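A compact sketch of the Thompson construction in Ruby, for a sense of how the tool builds its NFAs. To keep it short I assume the regexp is already in postfix form with an explicit `.` for concatenation; that preprocessing step, and the state numbering, are simplifications of mine, not what the tool does internally.

```ruby
# Thompson construction over a postfix regexp: '.' = concat,
# '|' = union, '*' = star, any other character is a literal.
# Each fragment on the stack is a [start, accept] state pair.
def thompson(postfix)
  transitions = Hash.new { |h, k| h[k] = Hash.new { |h2, k2| h2[k2] = [] } }
  state = -1
  new_state = -> { state += 1 }
  stack = []
  postfix.each_char do |c|
    case c
    when '.'                       # concat: wire f1's accept to f2's start
      f2 = stack.pop; f1 = stack.pop
      transitions[f1[1]][:eps] << f2[0]
      stack << [f1[0], f2[1]]
    when '|'                       # union: new start/accept around both
      f2 = stack.pop; f1 = stack.pop
      s = new_state.call; a = new_state.call
      transitions[s][:eps] << f1[0] << f2[0]
      transitions[f1[1]][:eps] << a
      transitions[f2[1]][:eps] << a
      stack << [s, a]
    when '*'                       # star: loop back, plus a skip edge
      f = stack.pop
      s = new_state.call; a = new_state.call
      transitions[s][:eps] << f[0] << a
      transitions[f[1]][:eps] << f[0] << a
      stack << [s, a]
    else                           # literal: two states, one labeled edge
      s = new_state.call; a = new_state.call
      transitions[s][c] << a
      stack << [s, a]
    end
  end
  start, accept = stack.pop
  [transitions, start, accept]
end
```

Every operator adds at most two states and only ε-edges, which is why the resulting NFAs are big but trivially correct — exactly the trade the subset construction then cleans up.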

There's a tool to compute first, follow, & predict sets. It does what it says on the tin.
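To make "what it says on the tin" concrete, here's a fixed-point FIRST-set computation in Ruby. The grammar encoding — each nonterminal maps to a list of productions, `:eps` marks ε, and an ε-rule is an empty production — is my assumption, not the tool's EBNF input format.

```ruby
require 'set'

# FIRST sets by fixed-point iteration. Any symbol that isn't a key
# of the grammar hash is treated as a terminal.
def first_sets(grammar)
  first = Hash.new { |h, k| h[k] = Set.new }
  loop do
    changed = false
    grammar.each do |nt, prods|
      prods.each do |prod|
        all_nullable = true
        prod.each do |sym|
          if grammar.key?(sym)                     # nonterminal
            (first[sym] - [:eps]).each { |t| changed = true if first[nt].add?(t) }
            all_nullable = first[sym].include?(:eps)
          else                                     # terminal
            changed = true if first[nt].add?(sym)
            all_nullable = false
          end
          break unless all_nullable                # stop at first non-nullable symbol
        end
        changed = true if all_nullable && first[nt].add?(:eps)
      end
    end
    break unless changed
  end
  first
end
```

FOLLOW and predict sets fall out of the same loop shape, seeded from FIRST — which is why the tool reports all three together.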

The scanner generator builds lexical analyzers and lets you test and tweak them from the convenience of your browser. It's a life-saver. I also made graph and parser table generators for bottom-up LR(0) parsers and SLR(1) parsers. There's a top-down LL(1) parser table creator too. (GLR, LALR, and LL(*) recursive-descent parsers are yet to come.)

The scanner and parsers are coded with OO in mind, yielding arrays of tuples for ready traversal by other components.

Parser drivers (for undergraduate coursework, anyway) are simple. The "difficulty" (read: tedium) lies in deriving the appropriate tables for the type of parser you're building. These come from things like graphs and/or first, follow, and predict sets. Luckily, my parser tools compute them directly from grammar input written as a terse subset of EBNF.
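To show just how little the driver itself involves once the table exists, here's a minimal table-driven LL(1) loop in Ruby. The table encoding (`[nonterminal, lookahead] => production`) and the toy grammar S → a S | b are mine for illustration, not the tool's output format.

```ruby
# Table-driven LL(1) driver: everything interesting lives in the
# table; the driver is just a stack and a loop.
def ll1_parse(table, start, tokens)
  input = tokens + [:eof]
  stack = [:eof, start]
  until stack.empty?
    top = stack.pop
    if top == input.first                  # terminal (or :eof) matches
      input.shift
    elsif (prod = table[[top, input.first]])
      stack.concat(prod.reverse)           # expand nonterminal, leftmost on top
    else
      return false                         # no table entry: syntax error
    end
  end
  input.empty?
end

# Hypothetical LL(1) table for the grammar S -> 'a' S | 'b'.
TABLE = { ['S', 'a'] => ['a', 'S'], ['S', 'b'] => ['b'] }
```

Everything grammar-specific is in `TABLE` — precisely the part the tools generate for you.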

Note that the parser generators don't yet dump full drivers, just tables. (The misnomer should be temporary!) As time allows, I intend to give them the interactive JavaScript treatment, doing something similar to what I did for the scanner generator.

This saves the souls of compiler course students. Trust me.

Given a set of *name:regexp* token definitions, this tool builds their NFAs, ties them into one big NFA (which is graphed), determinizes that into a DFA (also graphed), shows the scanner code and tables in Ruby, and runs the scanner in JavaScript so that users can verify and tweak their definitions in real time.
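As a rough stand-in for what the generated scanner does at run time, here's a longest-match (maximal-munch) loop in Ruby that drives the token definitions with regexps directly instead of DFA tables. The token definitions are hypothetical examples, not a spec the tool ships with.

```ruby
# Hypothetical name:regexp token definitions.
DEFS = { num: /[0-9]+/, id: /[a-z]+/, ws: /\s+/ }

# Maximal munch: at each position, take the definition with the
# longest match anchored there; ties go to the earlier definition.
def scan(src)
  tokens = []
  pos = 0
  while pos < src.length
    name, len = DEFS.map { |n, re|
      m = re.match(src, pos)
      [n, (m && m.begin(0) == pos) ? m[0].length : 0]
    }.max_by { |_, l| l }
    return nil if len.zero?                 # no rule matches here: lexical error
    tokens << [name, src[pos, len]] unless name == :ws
    pos += len
  end
  tokens
end
```

The generated scanner gets the same longest-match behavior from its DFA tables rather than per-definition regexp probes, but the observable tokenization is the same idea.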

The LL(1) parser tool turns EBNF into the following:

- first, follow, & predict set listings
- parse tables in JSON for easy import

The LR(0) parser tool turns EBNF into the following:

- NFA & DFA of the grammar's items
- pretty parse tables and JSON table dumps

The SLR(1) parser tool's similar to the LR(0) one:

- NFA & DFA of the grammar's items
- pretty parse tables and JSON table dumps