The PCCTS project began as a parser-generator
project for a graduate course on translator-writing systems taught by Hank
Dietz at Purdue University in the Fall of 1988. Under the guidance of Professor Dietz, the
parser generator, ANTLR (originally called YUCC), continued after the termination of the
course and eventually became the subject of Terence Parr's Master's thesis.
Originally, lexical analysis was performed via a simple scanner generator, which was soon
replaced in the Fall of 1989 by Will Cohen's DLG (a DFA-based lexical-analyzer
generator, also an offshoot of the graduate translation course).
The alpha version of ANTLR was totally rewritten, resulting in 1.00B. Version
1.00B was released via an internet newsgroup (comp.compilers) posting in February of 1990
and quickly gathered a large following. 1.00B generated only LL(1) parsers, but allowed
the merged description of lexical and syntactic analysis. It had rudimentary attribute
handling similar to that of YACC and did not incorporate rule parameters or return values;
downward inheritance was very awkward. 1.00B-generated parsers terminated upon the first
syntax error. Lexical classes (modes) were not allowed and DLG did not have an interactive
mode.
Upon starting his Ph.D. at Purdue in the Fall of 1990,
Terence Parr began the second total rewrite of ANTLR. The method by which grammars may be
practically analyzed to generate LL(k) lookahead information was discovered in August of
1990 just before Terence's return to Purdue. Version 1.00 incorporated this algorithm
and included the AST mechanism, lexical classes, error classes, and automatic error
recovery; code quality and portability were higher. In February of 1992, 1.00 was released
via an article in SIGPLAN Notices. Peter Dahl, then a Ph.D. candidate, and Professor Matt
O'Keefe (both at the University of Minnesota) tested this version extensively. Dana
Hoggatt (Micro Data Base Systems, Inc.) also tested 1.00 heavily.
Version 1.06 was released in December 1992 and represented a
large feature enhancement over 1.00. For example, rudimentary semantic predicates were
introduced, error messages were significantly improved for k>1 lookahead, and ANTLR
parsers could indicate that lookahead fetches were to occur only when necessary for the
parse (normally, the lookahead pipe was constantly full). Russell Quong joined
the project in the Spring of 1992 to aid in the semantic predicate design. Beginning and
advanced tutorials were created and released as well. A makefile generator was included
that sets up dependencies and such correctly for ANTLR and DLG. Very few 1.00
incompatibilities were introduced (1.00 was quite different from 1.00B in some areas).
Version 1.10 was released on August 31, 1993 after
Terence's release from Purdue and incorporated bug fixes, a few feature enhancements,
and a major new capability: an arbitrary lookahead operator (syntactic predicate),
(a)?b. This feature was co-designed with Professor Russell Quong, also at
Purdue. To support infinite lookahead, a preprocessor flag, ZZINF_LOOK, was created that
forced the ANTLR() macro to tokenize all input prior to parsing. Hence, at any moment, an
action or predicate could see the entire input sentence. The predicate mechanism of 1.06
was extended to allow multiple predicates to be hoisted; the syntactic context of a
predicate could also be moved along with the predicate.
In February of 1994, SORCERER was released. This tool allowed
the user to parse child-sibling trees by specifying a grammar rather than building a
recursive-descent tree walker by hand. Aaron Sawdey at The University of Minnesota became
a second author of SORCERER after the initial release. On April 1, 1994, PCCTS 1.20 was
released. This was the first version to actively support C++ output. It also included
important fixes regarding semantic predicates and (..)+ subrules. This version also
introduced token classes, the not operator, and token ranges.
On June 19, 1994, SORCERER 1.00B9 was released. Gary Funck of
Intrepid Technology joined the SORCERER team and provided very valuable suggestions
regarding the transform mode of SORCERER.
On August 8, 1994, PCCTS 1.21 was released. It mainly cleaned
up the C++ output and included a number of bug fixes.
From the 1.21 release forward, the maintenance and support of
all PCCTS tools was picked up by Parr Research Corporation.
A sophisticated error handling mechanism called parser
exception handling was released for version 1.30. 1.31 fixed a few bugs.
Release 1.33 is the version corresponding to the initial book
release.
ANTLR 2.0.0 came out around May 1997 and was partially funded,
so Terence hired John Lilley, a maniac coder and serious ANTLR hacker, to build much of
the initial version. Terence did the grammar analyzer, naturally.
John Mitchell, Jim Coker, Scott Stanchfield, and Monty
Zukowski have donated lots of brain power to ANTLR 2.xx in general.
ANTLR 2.1.0, July 1997, mainly improved parsing performance,
decreased parser memory requirements, and added a lot of cool lexer features including a
case-insensitivity option.
ANTLR 2.2.0, December 1997, saw the introduction of the new http://www.antlr2.org website. This release also
added grammar inheritance, enhanced AST support, and enhanced lexical translation support
(each lexical rule now was considered to return a Token object even when referenced by
another lexical rule).
ANTLR 2.3.0, June 1998, was the first version to have Peter
Wells' C++ code generator.
ANTLR 2.4.0, September 1998, introduced the ParseView parser
debugger by Scott Stanchfield. This version also had a semi-functional -html option
to generate HTML from your grammar for reading purposes. Scott and Terence updated
the file I/O to be JDK 1.1.
ANTLR 2.5.0, November 1998, introduced the filter option for
the lexer that lets ANTLR behave like SED or AWK.
ANTLR 2.6.0, March 1999, introduced token streams.
Chapman Flack, a Purdue graduate student, pounded me at the right moment about streams,
nudging me in the right direction.
MageLang Institute currently provides support and continues
development of ANTLR.
MageLang becomes jGuru.com as we quit doing Java training and start building the jGuru Java developer's website.
2.7.0 released January 19, 2000 had the following enhancements:
2.7.1 released October 1, 2000 had the following enhancements:
2.7.2 released January 19, 2003 was mainly a bug fix release but
also included a C# code generator by Micheal Jordan, Kunle Odutola
and Anthony Oguntimehin. :) I added an antlr.build.Tool 'cause I
hate ANT. This release does UNICODE properly now. Added limited
lexical lookahead hoisting. Sather code generator disappears.
Source changes for Eclipse and NetBeans by Marco van Meegen and
Brian Smith.
2.7.3 released March 22, 2004 was mainly a bug fix release, but
included the parse-tree/derivation code to aid in debugging plus the cool TokenStreamRewriteEngine that makes rewriting or tweaking input files particularly easy.
2.7.4 released May 9, 2004 was mainly a bug fix release for C++ and
C# generators.
2.7.5 released January 28, 2005 was mainly a release for the Python
code generator and provided a number of bug fixes. Wolfgang
Häfelinger and Marq Kole joined the project to handle Python!
Terence is working fiendishly on the 3.0 version of ANTLR, a
complete rewrite with a powerful new parsing engine called LL(*) that
provides the efficiency of fixed lookahead but can throttle up
automatically to arbitrary lookahead when needed. It is extremely
clean in both the source code and the grammar meta-language. The code
generation is extremely flexible (based upon StringTemplate) and makes
retargeting trivial. Ric Klaren built a C code generator in 2 days
without having seen either the new ANTLR or StringTemplate before. :)
Expect an early release program in Spring 2005. Jean Bovet, a
graduate student in CS here at the University of San Francisco, is working
on a GUI IDE.
And had a Sather code generator.