Papers for CMPT-755
Most papers can be downloaded from journal websites that are linked
from the Library website. Others can be found on the web. Otherwise
you can take a trek down to the Library. If you can't track down a
paper, then contact me as I might have a hardcopy. You can also
propose a topic in compiler design or related topics based on a group
of papers along the lines of the topics below.
You are expected to write a survey paper that explains the
ideas from the papers and also how they are connected to topics
studied in class. A exhaustive
guide to writing the survey has been provided.
Finite-state machine determinization
- Gertjan van Noord, The
Treatment of Epsilon Moves in Subset Construction,
Computational Linguistics, 26(1), pp. 61-76, April 2000.
- Ted Leslie, Efficient Approaches to Subset Construction,
masters thesis, Computer Science, University of Waterloo, 1995.
- J. Howard Johnson and Derick Wood, Instruction Computation in
Subset Construction, in Automata Implementation, Darrell Raymond and
Derick Wood and Sheng Yu (eds.), Lecture Notes in Computer Science
1260, pp. 64-71, Springer Verlag, 1997.
Finite-state machine minimization
- The fastest (O(|Q|log|Q|), where |Q| is the number of
states) minimization algorithm:
- John E. Hopcroft, An n log n algorithm for minimizing the
states in a finite automaton, in The Theory of Machines and
Computations, Z. Kohavi (ed.), pp. 189-196, Academic Press, 1971.
- A. V. Aho, J. E. Hopcroft and J. D. Ullman, The Design and
Analysis of Computer Algorithms, Addison-Wesley Publishing Company,
1974.
- David Gries, Describing an Algorithm by Hopcroft, Acta
Informatica, vol. 2, pp. 97-109, 1973.
- George Anton Kiraz and Edmund Grimley-Evans, Multi-tape
Automata for Speech and Language Systems: A Prolog Implementation,
Automata Implementation. Second Internation Workshop on Implementing
Automata, WIA '97, Derick Wood and Sheng Yu (eds.), Springer Lecture
Notes in Computer Science 1436, pp. 87-103, 1997.
- Brzozowski's algorithm
minimizes an automaton by performing reversal and determinization
twice. The subset construction (determinization) is O(2|Q|),
where Q is a set of states:
- J. A. Brzozowski, Canonical regular expressions and minimal
state graphs for definite events, in Mathematical Theory of Automata,
Volume 12 of MRI Symposia Series, pp. 529-561, Polytechnic Press,
Polytechnic Institute of Brooklyn, N.Y., 1962.
- Ted Leslie, Efficient Approaches to Subset Construction,
masters thesis, Computer Science, University of Waterloo, 1995.
- The Hopcroft-Ullman algorithm. Complexity is O(|Q|2),
and it also requires O(|Q|2) memory:
- John E. Hopcroft and Jeffrey D. Ullman, Introduction to
Automata Theory, Languages, and Computation, Adison-Wesley Publishing
Company, Reading, Massachusets, USA, 1979.
- The Aho-Sethi-Ullman algorithm has the same complexity as the
Hopcroft-Ullman algorithm, i.e. O(|Q|2), but it
needs only O(|Q|) memory:
- Alfred V. Aho and Ravi Sethi and Jeffrey D. Ullman, Compilers.
Principles, Techniques and Tools, Addison Wesley, 1986.
- Bruce W. Watson, A Taxonomy of finite automata minimization
algorithms, Eindhoven University of Technology, The Netherlands,
Computing Science Note, 93/44, 1993
Derivatives of Regular Expressions
- Janusz A. Brzozowski. Derivatives of Regular Expressions. JACM 11(4), Oct 1964, pp. 481-494
- G. Berry and R. Sethi. From regular expressions to deterministic automata. Theoretical Computer Science, 48:117-126, 1986. (INRIA Technical Report, RR-0649, Mar 1987) [ps, pdf]
Comparison between Thompson and Glushkov-McNaughton-Yamada automata construction
- S. C. Kleene, Representations of Events in Nerve Sets and Finite Automata, In: C. E. Shannon and J. McCarthy, editors, Automata Studies, pages 3-42, Princeton, NJ, 1956. Princeton University Press.
- K. Thompson. Regular expression search algorithm. Communications of the ACM, 11(6):410-422, 1968.
- V. Glushkov. The abstract theory of automata. Russian Math. Surveys, 16:1-53, 1961.
- R. McNaughton and H. Yamada. Regular expressions and state graphs for automata. IEEE Transactions on Electronic Computers, 9:39-47, 1960.
- A. Brüggemann-Klein, Regular expressions into finite automata, Theoret. Comput. Sci. 120:197-213, 1993.
Antimirov automata
- V. Antimirov, Partial derivatives of regular expressions and finite automaton constructions, Theoret. Comput. Sci. 155:291-319, 1996.
- J.-M. Champarnaud and D. Ziadi, New finite automaton constructions based on canonical derivatives, Proc. of CIAA 2000, LNCS 2088, Springer 2001, 94-104.
- J.-M. Champarnaud and D. Ziadi, Computing the equation automaton of a regular expression in O(s2) space and time, Proc. of 12th Comb. Pattern Matching Conference: CPM 2001, LNCS 2089, Springer 2001, 157-168.
Efficient ε-free NFA construction
- C. Hagenah and A. Muscholl, Computing ε-free NFA from regular expresssions in O(n log2(n)) time, Theor. Inform. Appl. 34(4):257-277, 2000.
- J. Hromkovic and S. Seibert and T. Wilke, Translating regular expressions into small ε-free nondeterministic finite automata, J. Comput. Sys. Sci. 62(4):565-588, 2001.
- L. Ilie and S. Yu, Constructing NFAs by optimal use of positions in regular expressions, In Proc. of 13th Comb. Pattern Matching Conference: CPM 2002.
- C. Allauzen and M. Mohri, A Unified Construction of the Glushkov, Follow and Antimirov Automata, NYU Technical Report (TR 2006-880). [pdf]
Incremental DFA Minimization
- Bruce W. Watson, An incremental DFA minimization algorithm,
Finite State Methods in Natural Language Processing 2001, ESSLLI
Workshop, August 20-24, Helsinki, Finland, 2001.
- Bruce B. Watson, Jan Daciuk, An efficient incremental DFA
minimization algorithm, Natural Language Engineering, Volume 9, Issue
01, March 2003.
Compression of Automata
- Minimization of binary finite automata. C. L. Lucchesi, T. Kowaltowski, and J. Stolfi. Journal of the Brazilian Computer Society, vol. 1 no. 3, 5--11. April 1995. [PS]
- Tomasz Kowaltowski, Cláudio L. Lucchesi, and Jorge Stolfi,
Minimization of Binary Automata, First South American String Processing Workshop, Belo Horizonte, Brasil, 1993.
- Franklin Mark Liang, Word Hy-phen-a-tion by Computer,
Ph.D. thesis, Stanford, 1983.
- Robert Endre Tarjan and Andrew Chi-Chih Yao, Storing a Sparse
Table, Communications of the ACM, 22(11), pp. 606-611, November,
1979.
- André Kempe, Reduction of Intermediate Alphabets in
Finite-State Transducer Cascades, in: Proc. TALN 2000, Lausanne,
Switzerland, pp. 207-215, 2000.
Transforming LR(k) Grammars to LR(1)
Approximations of Context-Free Languages
- Fernando C. N. Pereira and Rebecca N. Wright, Finite-State Approximation of Phrase-Structure Grammars, in Finite-State Language Processing, Emmanuel Roche and Yves Schabes (eds.), pp. 149-173, MIT Press, Cambridge, 1997.
- Mark-Jan Nederhof, Regular approximation of CFLs: a grammatical view, In H. Bunt and A. Nijholt, editors, Advances in Probabilistic and other Parsing Technologies, chapter 12, pages 221-241. Kluwer Academic Publishers, 2000.
- Mark-Jan Nederhof, Context-free Parsing through regular approximation, Finite-state Methods in Natural Language Processing, pp. 13-24, Bilkent University, Ankara, Turkey, 1998.
- Mark-Jan Nederhof, Practical Experiments with Regular Approximation of Context-Free Languages, Computational Linguistics, 26(1), April, 2000.
- Mehryar Mohri, Mark-Jan Nederhof, Regular Approximation of Context-Free Grammars through Transformation. Robustness in Language and Speech Technology, J.-C. Junqua and G. van Noord (eds.), Kluwer Academic Publishers, pp. 153-163, 2001.
- Edmund Grimley Evans, Approximating Context-Free Grammars with a Finite-State Calculus, 35th Annual Meeting of the Association for Computational Linguistics and 8th Conference of the European Chapter of the Association for Computational Linguistics, pp. 452-459, Madrid, 1997.
Containment Problems for deterministic pda's
- R.E. Stearns, A Regularity Test for Pushdown Machines, Information and Control, vol. 11, pp. 323-340, 1967.
- J.S. Ullian, Partial Algorithm Problems for Context Free Languages, Information and Control, vol. 11, pp. 80-101, 1967.
- L.G. Valiant, Regularity and Related Problems for Deterministic Pushdown Automata, Journal of the ACM, 22(1), pp. 1-10, 1975.
- Ibarra, O., and Rosier, L. (1981). On the decidability of equivalence for deterministic pushdown transducers, Information Processing Letters 13, 89-93.
Regular Closure of Deterministic Languages
- E. Bertsch and M.-J. Nederhof. Regular closure of deterministic languages. SIAM Journal on Computing, 29(1):81-102, 1999. [acm]
- M.-J. Nederhof and E. Bertsch. An innovative finite state concept for recognition and parsing of context free languages. In A. Kornai, editor, Extended finite state models of language, chapter 18, pages 226-243. Cambridge University Press, 1999. [pdf]
Size/Lookahead Tradeoff for LR(k) and LL(k) Grammars
Error Correction for CFGs and Regular Languages
- Robert A. Wagner, Order-n correction for regular languages. Communications of the ACM. Volume 17, Issue 5 (May 1974) pp. 265 - 268
- Aho, A. V., and Peterson, T. G. A minimum-distance error-correcting parser for context-free languages. SIAM J. Comput 1(4), 305-312. 1972.
- Lyon, G. 1974. Syntax-directed least-errors analysis for context-free
languages: a practical approach. Commun. ACM 17, 1, 3-14.
- Irons, E.T. An error-correcting parse
algorithm. CACM 6(11) Nov. 1963, pp. 669-673.
- R. Teitelbaum. Context-free error analysis by evaluation of algebraic power series. In 5th STOC, pp. 196-199, 1973.
- Robert A. Wagner, Joel I. Seiferas. Correcting Counter-Automaton-Recognizable Languages. SIAM J. on Computing. Vol 7. No 3, pp. 357-375. Aug 1978.
Substring Parsing
- M.-J. Nederhof and E. Bertsch. Linear-time suffix parsing for deterministic
languages. Journal of the ACM, 43(3):524-554, 1996.
- Bates, J. and Lavie, A. 1994. Recognizing substring of LR(k) languages in linear time. ACM Trans. Prog. Lang. Syst. 16, 3 (May), pp. 1051-1077.
- Cormack, G.V. 1989. An LR substring parser for non-correcting syntax error recovery. SIGPLAN Notices 24, 7. 161-169.
- Lang, B. 1991. Parsing Incomplete Sentences. In Proc. of the 12th Intl. Conf. on Computational Linguistics, vol 1 (Budapest, Hungary, Aug) pp. 365-371.
Non-recursive CFGs
- M.-J. Nederhof and G. Satta. The language intersection problem for non-recursive context-free grammars, Information and Computation, 192(2):172-184, 2004. [ps]
- M.-J. Nederhof and G. Satta. Parsing non-recursive context-free grammars. In 40th Annual Meeting of the Association for Computational Linguistics, Proceedings of the Conference, pages 112-119, Philadelphia, Pennsylvania, USA, July 2002. [pdf]
- M.-J. Nederhof and G. Satta. IDL-expressions: A formalism for representing and parsing finite languages in natural language processing. Journal of Artificial Intelligence Research, 21:287-317, 2004. [link]
- M.-J. Nederhof and G. Satta. IDL-expressions: a compact representation for finite languages in generation systems. In Proceedings of FG Trento, The 7th Conference on Formal Grammar, pages 125-136, Trento, Italy, 2002. [ps]
General Error Recovery in Parsers
- Burke, M. G., and Fisher, G. A. 1987. A practical method for LR and LL syntactic error diagnosis and recovery. ACM Trans. Program.
Lang. Syst. 9, 2, 164-197.
- Graham, S. L., and Rhodes, S. P. 1975. Practical syntactic error
recovery. Commun. ACM 18, 11, 639-649.
- Feycock, S., and Lazarus, P. 1976. Syntax-directed correction of syntax errors. Softw. Pract. Exper. 6, 2, 207-219.
- Hammond, K., and Rayward-Smith, V. J. 1984. A survey on syntactic error recovery and repair. Comput. Lang. (Elmsford, NY) 9, 1, 51-67.
- Mauney, J., and Fischer, C. N. 1988. Determining the extent of lookahead in syntactic error repair. ACM Trans. Program. Lang. Syst. 10, 3, 456-469.
- Richter, H. 1985. Noncorrecting syntax error recovery. ACM Trans. Prog. Lang. Syst. 7, 3 (July) 478-489.
Handling Syntax Errors in LR Parsers
- Mickunas, M. D., and Modry, J. A. 1978. Automatic error recovery for LR parsers. Commun. ACM 21, 6, 459-465.
- Sippu, S., and Soisalon-Soininen, E. 1982. Practical error recovery in LR parsing. Conf. Record Ninth Annual ACM Symposium on
Principles of Programming Languages, ACM, New York, NY, 177-184.
- Sippu, S., and Soisalon-Soininen, E. 1983. A syntax-error handling technique and its experimental analysis. ACM Trans. Program. Lang. Syst. 5, 4, 656-679.
- Sippu, S., and Soisalon-Soininen, E. 1980. A scheme for LR(k) parsing with error recovery, Part 1: LR(k) parsing, Part 2: Error Recovery, Part 3: Error Correction. Intern. J. Computer Math. 8: 27-42 (Part 1), 107-119 (Part 2), 189-206 (Part 3).
- McKenzie, B., Yeatman, C., and De Vere, L. Error repair in shift-reduce parsers. ACM Transactions on Programming Languages and Systems, 17(4):672-689, July 1995.
- E. Bertsch and M.-J. Nederhof. On failure of the pruning technique in ``Error Repair in Shift-Reduce Parsers''. ACM Transactions on Programming Languages and Systems, 21(1):1-10, January 1999. [pdf]
- Graham, S. L., Haley, C. B., and Joy, W. N. 1979. Practical LR error recovery. SIGPLAN Notices 14, 8, 168-175.
Handling Syntax Errors in LL Parsers
- Fischer, C.N. and Mauney, J. 1992. A simple, fast, and effective LL(1) error repair algorithm. Acta Inf. 29, 109-120.
- Fischer, C. N., Milton, D. R., and Quiring, S. B. 1980. Efficient LL(1) error correction and recovery using only insertions. Acta Inf.
13, 141-154.
- Fischer, C. N., Tai, K. C., and Milton, D. R. 1979. Immediate error detection in strong LL(1) parsers. Inf. Process. Lett. 8, 261-266.
- Hammond, K., and Rayward-Smith, V. J. 1984. A survey on syntactic error recovery and repair. Comput. Lang. (Elmsford, NY) 9, 1, 51-67.
- Mauney, J., and Fischer, C. N. 1988. Determining the extent of lookahead in syntactic error repair. ACM Trans. Program. Lang. Syst. 10, 3, 456-469.
Deductive Parsing and Semirings
Bounded-context parsing
- R. W. Floyd. Bounded context syntactic analysis. CACM(7) 2, pp. 62-67. 1964.
- J. H. Williams. Bounded context parsable grammars. Inf. Control(28) 4, pp. 314-334. 1975.
- David A. Workman. SR(s,k) parsers: A class of shift-reduce bounded-context parsers. Journal of Computer and System Sciences. Vol 22, Issue 2, pp. 178-197.
Space Efficient LR Parsing
- F. DeRemer, Simple LR(k) grammars. CACM 14, 7 (1971), 453-460.
- F. DeRemer and T. Pennello, Efficient Computation of LALR(1) Look-Ahead Sets. ACM TOPLAS Vol 4, No 4, (1982), 615-649
- J. C.-H. Park, K. M. Choe and C.-H. Chang, A New Analysis of LALR Formalisms. ACM TOPLAS Vol 7, No 1 (Jan. 1985), 159-175
Unbounded Lookahead in LR Parsing
- LR-Regular (LRR): K. Culik and R. Cohen, LR-regular grammars -- an extension of LR(k) grammars. J. Comp. Sys. Sci. 7, (1973), 66-96.
- Extended LR (XLR): T. P. Baker, Extending Lookahead for LR Parsers. J. Comp. Sys. Sci. 22 (1981), 243-259.
- LAR(m): M. E. Bermudez and K. M. Schimpf, A Practical Arbitrary Look-ahead LR Parsing Technique. ACM SIGPLAN Notices (1986), 136-144.
Non-canonical extensions to LR Parsing
- LR(k,t) grammars: (included in) D. Knuth, On the translation of languages from left to right. Inf. Control 8 (1965), 607-639.
- T. G. Syzmanski and J. H. Williams, Noncanonical extensions of bottom-up parsing techniques. SIAM J. Comp. 5, 2 (June 1976), 321-350.
- K.-C. Tai, Noncanonical SLR(1) Grammars. ACM TOPLAS, Vol 1 No 2 (Oct 1979), 295-320.
- Survey paper: M. D. Hutton, Noncanonical Extensions of LR Parsing Methods. unpublished manuscript (available online).
Generalized LR Parsing
- B. Lang. 1974. Deterministic techniques for efficient non-deterministic parsers. In Jacques Loeckx, editor, Automata, Languages and Programming, 2nd Colloquium, University of Saarbrücken. Lecture Notes in Computer Science, Springer Verlag.
\
- M. Tomita. 1987. An efficient augmented context-free parsing algorithm. Computational Linguistics, 13:31-46.
- M. Tomita. 1985. Efficient Parsing for Natural Language, A Fast Algorithm for Practical Systems. Kluwer Academic Publishers.
- Y. Schabes. 1991. Polynomial Time and Space Shift-Reduce Parsing of Arbitrary Context-free Grammars. Proc. of ACL-1991, 29th Annual Meeting of the Association of Computational Linguistics.
- D. Grune and C.J.H. Jacob. Parsing Techniques: a practical guide. Ellis Horwood, Chichester, 1990.
Earley Parsing for Programming Language Grammars
- J. Earley, An Efficient Context-free Parsing Algorithm. CACM (13) 2, Feb 1970, 94-102.
- P. MacLean and R. N. Horspool, A Faster Earley Parser. Proceedings of Intl. Conference on Compiler Construction (CC'96), Linkoping, Sweden, April 1996, LNCS vol. 1060, Springer-Verlag, pp. 281-293.
- J. Aycock and R. N. Horspool. Practical Earley Parsing. Computer Journal, vol. 45, 6 (2002), pp. 620-630.
- J. Aycock and R. N. Horspool. Directly-Executable Earley Parsing. Proceedings of CC 2001, Compiler Construction Conference, Genoa, Italy, April 2001, LNCS 2027, Springer-Verlag, pp. 229-243.
GHR optimizations for context-free parsing
- S .L. Graham, M. A. Harrison, and W. L. Ruzzo. 1980. An improved context-free recognizer. ACM Transactions on Programming Languages and Systems, 2(3):415-462, July.
- H. Leiss. 1990. On Kilbury's modification of Earley's algorithm. ACM Transactions on Programming Languages and Systems, 12(4):610-640, October.
Code Generation using bottom-up tree parsing
- C. W. Fraser, D. R. Hanson, and T. A. Proebsting. Engineering a simple, efficient code-generator generator. ACM Letters on Programming Languages and Systems, 1(3):213--226, September 1992.
- C. W. Fraser, R. R. Henry, and T. A. Proebsting. BURG---fast optimal instruction selection and tree parsing. ACM SIGPLAN Notices, 27(4):68--76, April 1992.
- T. A. Proebsting. BURS automata generation. ACM Transactions on Programming Languages and Systems, 17(3):461--486, May 1995.
Syntax-directed Translation
- P. M. Lewis and R. E. Stearns, Syntax-Directed Transduction. JACM vol 15, no 3, July 1968.
- A. V. Aho and J. D. Ullman, Properties of Syntax Directed Translations. Journal of Comp. and Sys. Sci. 3, 319-334, 1969.
- A. V. Aho and J. D. Ullman, Syntax Directed Translations and the Pushdown Assembler. Journal of Comp. and Sys. Sci. 3, 37-56, 1969.
- A. V. Aho and J. D. Ullman, Translations on a Context Free Grammar. Inf. and Control 19, 439-475, 1971.
- S. Vere, Translation Equations. CACM, vol 13 no 2, Feb 1970
Tree Transducers for Code Generation
- Comon et al. Chapter 6, Tree Automata Techniques and Applications. unpublished book.
- J.W. Thatcher, Tree automata: an informal survey, Currents in the Theory of Computing (Alfred V.Aho, ed.), chapter 4, pp. 143--172, Prentice-Hall, 1973.
- Gesecg and Steinby, Tree automata, book
Other ideas for topics
- Generalised recursive descent parsing and follow determinism, Adrian Johnstone and Elizabeth Scott, in: Kai Koskimies ed, Proc. 7th Intnl. Conf. Compiler Construction (CC'98), Lecture Notes in Computer Science, 1383, Springer-Verlag, Berlin 16-30 (1998)
- Geller and Harrison, On LR(k) Grammars and Languages, TCS 4:245--276, 1977
- Jan Daciuk's list of FSM algorithms
- A. Bergeron and S. Hamel, Fast Implementations of Automata Computations, Pre-Proc. 5th Int. Conf. on Implementation and
Application of Automata (CIAA), M. Daley, M. G. Eramian, and S. Yu (eds.), pp. 16-25, the University of Western Ontario, London, Canada, July 2000.
- Horspool, R. N., and Whitney, M. 1990. Even faster LR parsing. Softw. Pract. Exper. 20, 6, 515-535.
- D.E. Knuth. Semantics of context-free languages. Mathematical Systems Theory, 2(2):127-145, 1968.
- Tai, K. C. 1980b. Predictors of context-free grammars. SIAM J. Comput. 9, 3, 653-664.
- Brenda S. Baker: Generalized Syntax Directed Translation, Tree Transducers, and Linear Space. SIAM J. on Computing. Vol 7. No 3, Aug 1978. pp. 376-391.
- Bruce W. Watson, Taxonomies and Toolkits of Regular Language Algorithms, Ph.D. thesis, Eindhoven University of Technology, the Netherlands, 1995.
- E. Shamir, A Representation Theorem for Algebraic and Context-Free Power Series. Inf. and Control (11), 1967, p. 239-254.