Anoop Sarkar
Office: ASB 10859
Phone: 291-4933
Email: anoop at cs.sfu.ca
CMPT 882-3: Fall 2002
Monday: 4:00p - 5:50p in SCB 8662
Wednesday: 3:30p - 4:20p in SCB 8662
Office Hrs: by appointment (send me email)
In this course we will study basic algorithms that produce state-of-the-art results on tasks involving natural language text. For each of these tasks, we will compare knowledge-rich approaches, which rely on substantial human supervision, with knowledge-poor techniques, which use parameter re-estimation or bootstrapping algorithms. We will also compare generative models (models that maximize the likelihood of the training data) with discriminative models (models that minimize classification error rate). For more details, see the Course Description for CMPT 882-3.
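As a rough sketch of that last distinction (the notation is mine, not from the course materials), the two model families optimize different objectives over a training set {(x_i, y_i)}:

    \hat{\theta}_{\mathrm{gen}}  = \arg\max_{\theta} \prod_{i} P(x_i, y_i \mid \theta)
    \hat{\theta}_{\mathrm{disc}} = \arg\min_{\theta} \sum_{i} \mathbf{1}[\, y_i \neq \hat{y}(x_i; \theta) \,]

where \hat{y}(x; \theta) is the label the model assigns to input x.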
Notes: Lecture #01 pdf
Unsupervised Word Sense Disambiguation Rivaling Supervised Methods (1995). David Yarowsky. Proceedings of ACL-95. pp. 189-196
Notes: Lecture #02 pdf
Additional Readings:
Data:
Homework:
Sample contexts from the "plant" dataset (word_TAG tokens, one context window per occurrence of the ambiguous word plant), for example:
    to_TO pay_VB to_TO make_VB the_DT plant_NN safer_JJR ._.
    at_IN the_DT Orange_NNP County_NNP plant_NN ._.
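The data above pairs the ambiguous word plant with its surrounding context. A natural baseline for this kind of task is a decision list over context words; the sketch below is a minimal Perl version of that idea (the sense<TAB>context input format, the bag-of-words features, and the smoothing constant are my assumptions, not the actual assignment).

    #!/usr/bin/perl
    # Sketch: a decision list over context words for the two-sense "plant" task.
    # Input (assumed format): one labeled example per line, "sense<TAB>context",
    # e.g.  living<TAB>the plant and animal kingdoms
    #       factory<TAB>workers at the Orange County plant
    use strict; use warnings;

    my (%count, %sense_total);
    while (<STDIN>) {
        chomp;
        my ($sense, $context) = split /\t/, $_, 2;
        next unless defined $context;
        $sense_total{$sense}++;
        $count{ lc $_ }{$sense}++ for split /\s+/, $context;
    }
    my @senses = sort keys %sense_total;
    die "expected exactly two senses\n" unless @senses == 2;

    # Rank each context word by a smoothed log-likelihood ratio between the
    # two senses (the smoothing constant 0.1 is an arbitrary choice).
    my @rules;
    for my $w (keys %count) {
        my $fa = ($count{$w}{ $senses[0] } || 0) + 0.1;
        my $fb = ($count{$w}{ $senses[1] } || 0) + 0.1;
        my $score = log($fa / $fb);
        push @rules, [ abs($score), $w, $score > 0 ? $senses[0] : $senses[1] ];
    }
    @rules = sort { $b->[0] <=> $a->[0] } @rules;

    # Classify a new context with the highest-ranked matching rule.
    sub classify {
        my %in = map { lc($_) => 1 } split /\s+/, shift;
        for my $r (@rules) {
            return $r->[2] if $in{ $r->[1] };
        }
        # no rule fired: fall back to the more frequent sense
        return $sense_total{ $senses[0] } >= $sense_total{ $senses[1] }
             ? $senses[0] : $senses[1];
    }
    print classify("workers at the nuclear plant"), "\n";

Yarowsky's algorithm would bootstrap such a list from a handful of seed collocations, label only the unlabeled contexts it is confident about, and re-train until the labeling stabilizes, relying on the one-sense-per-discourse and one-sense-per-collocation tendencies.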
Naive (Bayes) at Forty: The Independence Assumption in Information Retrieval (1998). David Lewis. Proceedings of ECML-98 (10th meeting). pp. 4-15.
Notes: Lecture #03 pdf
Additional Readings:
Homework:
The 20Newsgroups and WebKB datasets are available from the CMU Text Learning page. Get table.tar.gz and bayes.tar.gz, and use the data files that come with the package.
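The interface of bayes.tar.gz is not shown on this page; as a reference point for the Lewis reading, here is a minimal multinomial naive Bayes classifier in Perl with add-one smoothing (the label<TAB>text training format is an assumption made for illustration).

    #!/usr/bin/perl
    # Minimal multinomial naive Bayes: train on "label<TAB>text" lines from
    # stdin, then classify the document given on the command line.
    use strict; use warnings;

    my (%doc_count, %word_count, %class_total, %vocab);
    while (<STDIN>) {
        chomp;
        my ($c, $text) = split /\t/, $_, 2;
        next unless defined $text;
        $doc_count{$c}++;
        for my $w (split /\s+/, lc $text) {
            $word_count{$c}{$w}++;
            $class_total{$c}++;
            $vocab{$w} = 1;
        }
    }
    my $ndocs = 0; $ndocs += $_ for values %doc_count;
    my $V = scalar keys %vocab;

    my $doc = lc(join ' ', @ARGV);
    my ($best, $best_score);
    for my $c (keys %doc_count) {
        # log P(c) + sum over words of log P(w|c), with add-one smoothing
        my $score = log($doc_count{$c} / $ndocs);
        for my $w (split /\s+/, $doc) {
            my $cw = $word_count{$c}{$w} || 0;
            $score += log( ($cw + 1) / ($class_total{$c} + $V) );
        }
        ($best, $best_score) = ($c, $score)
            if !defined $best_score || $score > $best_score;
    }
    print "$best\n";

The product over words is exactly the independence assumption the Lewis paper examines, and add-one smoothing keeps unseen words from zeroing out a class. Usage would be along the lines of: perl nb.pl some test document words < train.txt (script and file names assumed).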
Chapters 3 and 4 of Statistical Language Learning. Eugene Charniak. MIT Press. 1993.
Notes: Lecture #04 pdf
Other Sources:
Additional Readings:
Homework:
Run create_model from this directory (you can use the "Plant" dataset as your corpus, for example). Then produce a sample output using trigen.pl (also in the same directory). Read the file readme.txt for further instructions.
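trigen.pl itself is not reproduced here; the following sketch shows the underlying idea: collect trigram counts from a tokenized corpus on stdin, then generate by sampling one word at a time given the previous two. The <s>/</s> boundary markers and the unsmoothed maximum-likelihood counts are my simplifications.

    #!/usr/bin/perl
    # Sketch: collect raw trigram counts and generate one random sentence.
    use strict; use warnings;

    my %tri;   # $tri{"w1 w2"}{w3} = count
    while (<STDIN>) {
        chomp;
        next unless /\S/;
        my @w = ('<s>', '<s>', split(/\s+/), '</s>');
        for my $i (2 .. $#w) {
            $tri{"$w[$i-2] $w[$i-1]"}{ $w[$i] }++;
        }
    }

    # Draw a word proportionally to its count in the given context
    # (i.e., from the unsmoothed maximum-likelihood trigram model).
    sub draw {
        my $dist = shift;
        my $total = 0; $total += $_ for values %$dist;
        my $r = rand($total);
        for my $w (keys %$dist) {
            $r -= $dist->{$w};
            return $w if $r <= 0;
        }
        return (keys %$dist)[0];
    }

    my ($u, $v) = ('<s>', '<s>');
    my @out;
    while (1) {
        my $w = draw($tri{"$u $v"});
        last if $w eq '</s>' || @out > 50;
        push @out, $w;
        ($u, $v) = ($v, $w);
    }
    print "@out\n";

Because the counts are unsmoothed, generation can only follow word sequences actually seen in the corpus.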
Does Baum-Welch Re-estimation help taggers? (1994). David Elworthy. Proceedings of 4th ACL Conf on ANLP, Stuttgart. pp. 53-58.
Notes: Lecture #05 pdf
Additional Readings:
Homework:
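As background for the Elworthy reading above: Baum-Welch re-estimation is built on the forward (and backward) probabilities of an HMM tagger. The following is a minimal Perl sketch of the forward pass on a toy two-tag model; every probability in it is invented for illustration.

    #!/usr/bin/perl
    # Forward algorithm for a toy two-tag HMM; all probabilities are invented.
    use strict; use warnings;

    my @tags  = qw(DT NN);
    my %init  = (DT => 0.7, NN => 0.3);
    my %trans = (DT => { DT => 0.1, NN => 0.9 },
                 NN => { DT => 0.4, NN => 0.6 });
    my %emit  = (DT => { the => 0.8,  plant => 0.01 },
                 NN => { the => 0.01, plant => 0.5  });

    my @words = qw(the plant);
    my @alpha;   # $alpha[$t]{$tag} = P(words 0..t, tag at position t)
    for my $tag (@tags) {
        $alpha[0]{$tag} = $init{$tag} * ($emit{$tag}{ $words[0] } || 1e-6);
    }
    for my $t (1 .. $#words) {
        for my $tag (@tags) {
            my $sum = 0;
            $sum += $alpha[$t-1]{$_} * $trans{$_}{$tag} for @tags;
            $alpha[$t]{$tag} = $sum * ($emit{$tag}{ $words[$t] } || 1e-6);
        }
    }
    my $lik = 0;
    $lik += $alpha[$#words]{$_} for @tags;
    print "P(@words) = $lik\n";
    # Baum-Welch pairs these forward probabilities with a backward pass to
    # re-estimate %trans and %emit from untagged text; Elworthy asks when
    # that re-estimation actually improves tagging accuracy.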
Nymble: a High-Performance Learning Name-finder (1997). Daniel M. Bikel, Scott Miller, Richard Schwartz, Ralph Weischedel. Proceedings of ANLP-97.
Notes: Lecture #06 pdf
Additional Readings:
Other Applications of HMMs:
We will look at the use of the EM algorithm (a generalization of the forward-backward algorithm for HMMs) and apply it to the problem of finding the appropriate interpolation weights between a word-based and a part-of-speech-based language model.
mixture.pl is a simple Perl script that implements this idea.
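mixture.pl itself is not reproduced here; the sketch below implements the same idea under my own assumption about the input format: each line gives the probabilities that the word-based model and the part-of-speech-based model assign to one held-out token, and EM re-estimates the single interpolation weight lambda.

    #!/usr/bin/perl
    # EM for a single interpolation weight between two fixed language models.
    # Input (assumed format): one held-out token per line, "p_word<TAB>p_pos",
    # the probabilities each component model assigns to that token in context.
    use strict; use warnings;

    my @pairs;
    while (<STDIN>) {
        chomp;
        my ($p1, $p2) = split /\t/;
        push @pairs, [ $p1, $p2 ] if defined $p2;
    }
    die "no data\n" unless @pairs;

    my $lambda = 0.5;                      # initial guess
    for my $iter (1 .. 20) {
        my $expected = 0;
        for my $pair (@pairs) {
            my ($p1, $p2) = @$pair;
            my $mix = $lambda * $p1 + (1 - $lambda) * $p2;
            # E-step: posterior probability that the token came from model 1
            $expected += $lambda * $p1 / $mix if $mix > 0;
        }
        # M-step: the new lambda is the average posterior
        $lambda = $expected / @pairs;
        printf "iteration %2d: lambda = %.4f\n", $iter, $lambda;
    }

Each E/M iteration cannot decrease the held-out likelihood of the interpolated model, and with a single weight the log-likelihood is concave in lambda, so the iteration converges to the optimal interpolation weight.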
Additional Readings:
Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part-of-Speech Tagging (1995). Eric Brill. Computational Linguistics, volume 21, number 4, pp. 543-565.
Notes: Lecture #07 pdf
Additional Readings:
Homework:
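As background for the Brill reading above: transformation-based learning repeatedly picks the single rule that most reduces tagging errors on the training corpus. The sketch below scores one common rule template, "change tag A to B when the previous tag is Z", for a single round, starting from a most-frequent-tag baseline; the word_TAG input format matches the sample data earlier on this page, while the template choice and the greedy scoring shown here are my simplifications.

    #!/usr/bin/perl
    # One round of transformation-based learning for tagging (sketch):
    # score rules "change tag A to B when the previous tag is Z" by net
    # error reduction against a gold-tagged corpus of word_TAG tokens.
    use strict; use warnings;

    my (@sents, %freq);
    while (<STDIN>) {
        chomp;
        my @pairs = map { [ split /_/ ] } grep { /_/ } split /\s+/;
        next unless @pairs;
        push @sents, \@pairs;
        $freq{ $_->[0] }{ $_->[1] }++ for @pairs;
    }

    # Baseline tagger: most frequent tag for each word.
    my %best_tag;
    for my $w (keys %freq) {
        ($best_tag{$w}) =
            sort { $freq{$w}{$b} <=> $freq{$w}{$a} } keys %{ $freq{$w} };
    }
    my %all_tags;
    $all_tags{ $_->[1] } = 1 for map { @$_ } @sents;

    # Net gain of each candidate rule against the baseline tagging.
    my %gain;
    for my $s (@sents) {
        my @guess = map { $best_tag{ $_->[0] } } @$s;
        for my $i (1 .. $#{$s}) {
            my ($gold, $cur, $prev) = ($s->[$i][1], $guess[$i], $guess[$i-1]);
            if ($cur ne $gold) {
                $gain{"$cur $gold $prev"}++;     # rule would fix this error
            } else {
                # any rule that rewrites a correct tag here would hurt
                $gain{"$cur $_ $prev"}-- for grep { $_ ne $cur } keys %all_tags;
            }
        }
    }
    my ($best_rule) = sort { $gain{$b} <=> $gain{$a} } keys %gain;
    die "baseline makes no errors\n" unless defined $best_rule;
    my ($from, $to, $prev) = split ' ', $best_rule;
    print "change $from to $to when the previous tag is $prev",
          " (net gain $gain{$best_rule})\n";

Brill's learner would apply the winning rule to the corpus, then repeat the search on the corrected tagging until no rule yields a positive gain.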
A simple introduction to maximum entropy models for natural language processing (1997). Adwait Ratnaparkhi. Technical Report 97-08, Institute for Research in Cognitive Science, University of Pennsylvania.
A Maximum Entropy Model for Part-of-Speech Tagging (1996). Adwait Ratnaparkhi. In Proceedings of the Conference on Empirical Methods in Natural Language Processing. pp. 133-142.
Notes: Lecture #08 pdf
Additional Readings:
Homework:
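Both Ratnaparkhi papers use conditional log-linear (maximum entropy) models in which p(tag | context) is proportional to the exponential of a weighted sum of binary features. The sketch below only evaluates that formula with invented weights; training the weights (Ratnaparkhi uses Generalized Iterative Scaling) is not shown.

    #!/usr/bin/perl
    # Evaluate p(tag | context) under a conditional log-linear (maxent) model.
    # The feature weights below are invented for illustration only.
    use strict; use warnings;

    my @tags = qw(NN VB);
    my %lambda = (                      # weight for "predicate AND tag"
        'word=plant NN'  =>  1.2,
        'word=plant VB'  => -0.3,
        'prev_tag=DT NN' =>  0.8,
        'prev_tag=DT VB' => -0.9,
        'suffix=ant NN'  =>  0.4,
    );
    # Contextual predicates active for the word currently being tagged.
    my @active = ('word=plant', 'prev_tag=DT', 'suffix=ant');

    my %score;
    my $Z = 0;
    for my $t (@tags) {
        my $s = 0;
        $s += ($lambda{"$_ $t"} || 0) for @active;
        $score{$t} = exp($s);           # unnormalized weight exp(sum of lambdas)
        $Z += $score{$t};
    }
    printf "p(%s | context) = %.3f\n", $_, $score{$_} / $Z for @tags;

Training picks the weights so that each feature's expected count under the model matches its count in the training data, which is the maximum entropy constraint.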
Automatic Extraction of Subcategorization from Corpora (1997). Ted Briscoe and John Carroll. In Proceedings of the 5th Conference on Applied Natural Language Processing (ANLP-97).
Notes: Lecture #09 pdf
Additional Readings:
Homework:
Minimally supervised morphological analysis by multimodal alignment (2000). Yarowsky, D. and R. Wicentowski. In Proceedings of ACL-2000, pages 207-216.
Unsupervised Learning of the Morphology of a Natural Language (2001). John Goldsmith. Computational Linguistics, Volume 27, Number 2.
Notes: Lecture #10 pdf
Additional Readings:
Supertagging: An Approach to Almost Parsing (1999). Srinivas Bangalore and Aravind K. Joshi. Computational Linguistics, volume 25, number 2, pages 237-265.
Notes: Lecture #11 pdf
Additional Readings:
Coping with syntactic ambiguity or how to put the block in the box on the table (1982). Kenneth Church and Ramesh Patil. Computational Linguistics 8:139-49.
Prepositional Phrase Attachment through a Backed-Off Model (1995). Michael Collins and James Brooks. Proceedings of the Third Workshop on Very Large Corpora (WVLC-95).
Notes: Lecture #12 pdf
Additional Readings:
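Collins and Brooks estimate p(noun attachment | v, n1, p, n2) from counts of attachment quadruples, backing off to lower-order counts that still contain the preposition when the full context is unseen. The sketch below follows that scheme; the one-quadruple-per-line input format and the code itself are my reconstruction, not the paper's implementation.

    #!/usr/bin/perl
    # Backed-off estimate of p(noun attachment | v, n1, p, n2), in the spirit
    # of Collins & Brooks (1995). Training input (assumed format): one case
    # per line, "attach v n1 p n2", where attach is N (noun) or V (verb).
    use strict; use warnings;

    my (%num, %den);   # noun-attachment count / total count per context
    while (<STDIN>) {
        my ($a, $v, $n1, $p, $n2) = split;
        next unless defined $n2;
        for my $c ("$v $n1 $p $n2",                         # quadruple
                   "$v $n1 $p", "$v $p $n2", "$n1 $p $n2",  # triples keeping p
                   "$v $p", "$n1 $p", "$p $n2",             # pairs keeping p
                   $p) {                                    # preposition alone
            $den{$c}++;
            $num{$c}++ if $a eq 'N';
        }
    }

    sub p_noun_attach {
        my ($v, $n1, $p, $n2) = @_;
        for my $level ([ "$v $n1 $p $n2" ],
                       [ "$v $n1 $p", "$v $p $n2", "$n1 $p $n2" ],
                       [ "$v $p", "$n1 $p", "$p $n2" ],
                       [ $p ]) {
            my ($n, $d) = (0, 0);
            for my $c (@$level) {
                $n += $num{$c} || 0;
                $d += $den{$c} || 0;
            }
            return $n / $d if $d > 0;   # use the most specific level with counts
        }
        return 1.0;                     # default: attach to the noun
    }
    printf "p(noun attach) = %.3f\n", p_noun_attach(qw(joined board as director));

The final default of noun attachment reflects that noun attachment is the more frequent outcome when even the preposition is unseen.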
Structural Ambiguity and Lexical Relations (1993). Donald Hindle and Mats Rooth. Computational Linguistics. Volume 19, Number 1, March 1993, Special Issue on Using Large Corpora: I.
Statistical Models for Unsupervised Prepositional Phrase Attachment (1998). Adwait Ratnaparkhi. In Proceedings of COLING-ACL 1998.
Notes: Lecture #13 pdf
Additional Readings:
Head-Driven Statistical Models for Natural Language Parsing. Michael Collins. PhD Dissertation, University of Pennsylvania, 1999. Read chapters 2 and 3, pages 31-102.
Statistical parsing with an automatically-extracted tree adjoining grammar (2000). David Chiang. In Proceedings of ACL 2000, Hong Kong, October 2000, pages 456-463.
Notes: Lecture #14 pdf and additional slides (from Michael Collins' thesis presentation)
Additional Readings:
Code:
Inside-Outside Reestimation from partially bracketed corpora. Fernando Pereira and Yves Schabes. In 30th Annual Meeting of the Association for Computational Linguistics, pages 128-135, Newark, Delaware, 1992.
Applications of stochastic context-free grammars using the Inside-Outside algorithm. K. Lari and S. J. Young. Computer Speech and Language, 4:35-56, 1990.
Additional Readings:
Optional:
Combining Labeled and Unlabeled Data with Co-training. Avrim Blum and Tom Mitchell. In Proc. of the Workshop on Computational Learning Theory (COLT98). 1998.
Analyzing the Effectiveness and Applicability of Co-training. Kamal Nigam and Rayid Ghani. In Ninth International Conference on Information and Knowledge Management (CIKM-2000), pp. 86-93. 2000.
Additional Readings:
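Blum and Mitchell assume each example has two redundant views (for WebKB, the words on a page and the words on the links pointing to it), each sufficient on its own to classify. The skeleton below shows only the control flow of the co-training loop; train() and label_pool() are placeholders standing in for the per-view naive Bayes classifiers used in the paper, and the pool sizes and labels are invented.

    #!/usr/bin/perl
    # Skeleton of the Blum & Mitchell co-training loop. train() and label_pool()
    # are placeholders; in the paper each view gets its own naive Bayes model.
    use strict; use warnings;

    my @labeled;     # elements: { view1 => "...", view2 => "...", label => "..." }
    my @unlabeled;   # elements: { view1 => "...", view2 => "..." }
    my $grow = 5;    # examples each classifier labels per round (invented)

    # Placeholder: train a classifier on one view of the labeled data.
    sub train { my ($view, @data) = @_; return { view => $view }; }

    # Placeholder: return [example, predicted label] pairs for the $k pool
    # examples this view's classifier is most confident about.
    sub label_pool {
        my ($model, $k, @pool) = @_;
        $k = @pool if $k > @pool;
        return map { [ $pool[$_], 'course' ] } 0 .. $k - 1;
    }

    for my $round (1 .. 10) {
        last unless @unlabeled;
        my $c1 = train('view1', @labeled);
        my $c2 = train('view2', @labeled);
        # Each classifier confidently labels a few pool examples; those examples
        # join the shared labeled set, so each view effectively teaches the other.
        for my $pick (label_pool($c1, $grow, @unlabeled),
                      label_pool($c2, $grow, @unlabeled)) {
            my ($ex, $label) = @$pick;
            next unless grep { $_ == $ex } @unlabeled;   # skip if already taken
            push @labeled, { %$ex, label => $label };
            @unlabeled = grep { $_ != $ex } @unlabeled;
        }
    }

The paper's analysis assumes the two views are individually sufficient and conditionally independent given the class; the Nigam and Ghani paper examines how co-training behaves when those assumptions are relaxed.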
Committee-Based Sample Selection for Probabilistic Classifiers. Shlomo Argamon-Engelson and Ido Dagan. In Journal of Artificial Intelligence Research, 1999.
On Minimizing Training Corpus for Parser Acquisition. Rebecca Hwa. In Proc. of Workshop on Computational Natural Language Learning. 2001.
Additional Readings:
A short introduction to boosting. Y. Freund and R. Schapire. Journal of the Japanese Society for Artificial Intelligence, 14(5), pages 771-780, 1999.
Boosting Applied to Tagging and PP Attachment. Steven Abney, Robert E. Schapire, and Yoram Singer. Proceedings of the 1999 Joint SIGDAT Conference on Empirical Methods in Natural Language Processing and Very Large Corpora, pp. 38-45. 1999.
Additional Readings:
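AdaBoost maintains a distribution over training examples, repeatedly fits a weak learner to the weighted data, and increases the weight of the examples that weak learner got wrong. The sketch below uses single binary features as weak hypotheses, loosely in the spirit of the simple feature-based weak learners in the Abney, Schapire and Singer paper; the toy data, features, and round limit are invented.

    #!/usr/bin/perl
    # AdaBoost sketch: each weak hypothesis is a single binary feature that
    # predicts +1 when present and -1 otherwise. Data and features invented.
    use strict; use warnings;

    # Each example: [ { feature_name => 1, ... }, gold label (+1 or -1) ]
    my @examples = (
        [ { ends_in_ing => 1 },                   +1 ],
        [ { ends_in_ing => 1, prev_is_the => 1 }, -1 ],
        [ { prev_is_the => 1 },                   -1 ],
        [ { ends_in_ing => 1 },                   +1 ],
    );
    my @features = qw(ends_in_ing prev_is_the);

    my @w = map { 1 / @examples } @examples;   # uniform initial weights
    my @ensemble;                              # [feature, alpha] pairs

    for my $round (1 .. 5) {
        # Pick the feature (weak hypothesis) with the lowest weighted error.
        my ($best_f, $best_err);
        for my $f (@features) {
            my $err = 0;
            for my $i (0 .. $#examples) {
                my ($x, $y) = @{ $examples[$i] };
                my $pred = $x->{$f} ? 1 : -1;
                $err += $w[$i] if $pred != $y;
            }
            ($best_f, $best_err) = ($f, $err)
                if !defined $best_err || $err < $best_err;
        }
        last if $best_err == 0 || $best_err >= 0.5;  # no useful weak learner left
        my $alpha = 0.5 * log((1 - $best_err) / $best_err);
        push @ensemble, [ $best_f, $alpha ];
        printf "round %d: feature=%s weighted error=%.3f alpha=%.3f\n",
               $round, $best_f, $best_err, $alpha;

        # Re-weight: misclassified examples gain weight, then normalize.
        my $Z = 0;
        for my $i (0 .. $#examples) {
            my ($x, $y) = @{ $examples[$i] };
            my $pred = $x->{$best_f} ? 1 : -1;
            $w[$i] *= exp(-$alpha * $y * $pred);
            $Z += $w[$i];
        }
        $_ /= $Z for @w;
    }

The final classifier is the sign of the alpha-weighted vote of the chosen features, i.e. of the rules accumulated in @ensemble.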
Run perl perlsh.pl and copy/paste code from the tutorial into the session.