Mathematical Linguistics
數理語言學
Spring 2013             Thursday 14:10-17:00             文學院412

Course code: 1306564

 

UPDATED 2013/5/10

 

Software links: see the Software section at the end of this page.

Me

James Myers (麥傑)
Office:
文學院247
Tel: 31506
Email: Lngmyers at the usual place
Web: http://www.ccunix.ccu.edu.tw/~lngmyers/
Office hours: Thursday 10 am to 12 noon, or by appointment

 

Goals

"Mathematical linguistics" means different things to different people. This version of the course focuses on how formal language theory (and other mathematical or computational models) can help linguists better understand real human grammar. By the end of the semester, you should feel much more comfortable thinking in a formally precise way about grammar, corpus analysis, language acquisition, psycholinguistic processing, and language evolution.

 

Grading

10% Class participation
30% Leading discussion
20% Exercises
40% Term paper

 

What the class is like

        Rather than passively listening to lectures from a textbook (which doesn't exist for this version of the course anyway), we will read and discuss classic and recent papers together. There are usually a lot of pages to read each week and the papers are often quite technical, so focus on the big picture instead of getting stuck on little details. When reading, always ask yourself three basic questions: What is the paper's main claim? How convincingly does the paper argue for the main claim? What is the most interesting aspect of the paper for you?

        Class participation means that you discuss: you read, think, talk, and respond to others' ideas. Don't be afraid to ask for clarification - that's also part of the discussion.

        Every week somebody will lead the discussion on the week's readings, using a handout as a guide. The discussion leader should NOT summarize the paper or ask simple comprehension questions, but instead ask open-ended questions to help people focus on the crucial points in the readings, and to inspire people to get involved and express what they think.

        In order to get hands-on experience with some of the technical methods we will read about, there will be one set of take-home exercises, due by email before class on 5/16 (see the schedule below). The exercises will be distributed at different times during the semester, so you'll have more than one week to finish the whole set.

        On 5/16, about a month before the end of the semester, you will propose an original research topic of your own, applying the mathematical or computational models discussed in class to a grammatical issue. This may involve using somebody else's program, or writing your own. On the last day of class (6/13), you'll give an informal, ungraded presentation about your research. The paper (about 20 pages, in English) is due a week later (6/20), as a PDF file, sent via email, plus your program (if any). I'll grade the paper in the usual way (style, logic, theory).


Schedule (Some of these readings may change!)

* marks a date when something is due

 

2/21    How can math help explain grammar?

2/28    NO CLASS [228 Peace Memorial Day]

3/7     Formal languages: Basics
        Readings: Myers (2013a); Hausser (1999); Jurafsky & Martin (2009)
        Leader: Myers

3/14    Formal languages: Phonology & morphology
        Readings: Hammond (1993); Culy (1985); Bird & Ellison (1994)
        Leader: 楊振宗

3/15    Preview of phonological acquisition models (Mary Beckman's talk)

3/21    Formal languages: Syntax
        Also: free discussion of Mary Beckman's talk (no discussion leader)
        Readings: Higginbotham (1984); Pullum (1984); Stabler (2010)
        Leader: 楊振宗

3/28    Formal languages: Corpus analysis
        Readings: Corley et al. (2001); Meurers (2004)
        Leader: 江欣尠

4/4     NO CLASS [Qingming Festival]

4/11    Formal learning theory
        Readings: Heinz (2012); Osherson et al. (1989)
        Leader: 吳怡欣

4/18    Connectionism: Basics
        Readings: Myers (2013b); Medler (1998)
        Leader: Myers

4/25    Connectionism: Limitations
        Readings: Pinker & Ullman (2002); Bowers et al. (2009)
        Leader: 江欣尠

5/2     Optimality and harmony
        Readings: Potts et al. (2010); Myers (2012)
        Leader: Myers

5/9     Bayesian learning
        Readings: Myers (2013c); Perfors et al. (2010)
        Leader: 吳怡欣

*5/16   Discuss paper topics
        Exercises due [1, 2, 3, 4] by email before class

5/23    Processing
        Readings: Idsardi (2006); DeVault & Stone (2009)
        Leader: Myers

?/?     One-on-one project discussions (during office hours or by appointment)

5/30    NO CLASS [Colin Phillips talk]

6/6     NO CLASS [IACL Workshops]

*6/13   Presentations [last class]

*6/20   TERM PAPERS DUE (by 5 pm, by email)

Optional, for-fun, no-time-to-discuss readings:

Evolution: Fitch & Friederici (2012), Griffiths & Kalish (2007)

Processing phonology: Hayes & Wilson (2008)

Processing syntax: Fong (2012)


Readings

Bird, S., & Ellison, T. M. (1994). One-level phonology: Autosegmental representations and rules as finite automata. Computational Linguistics, 20 (1), 55-90.

Bowers, J. S., Damian, M. F., & Davis, C. J. (2009). A fundamental limitation of the conjunctive codes learned in PDP models of cognition: Comment on Botvinick and Plaut (2006). Psychological Review, 116 (4), 986-997.

Corley, S., Corley, M., Keller, F., Crocker, M. W., & Trewin, S. (2001). Finding syntactic structure in unparsed corpora: The Gsearch corpus query system. Computers and the Humanities, 35 (2), 81-94.

Culy, C. (1985). The complexity of the vocabulary of Bambara. Linguistics and Philosophy, 8, 345-351.

DeVault, D., & Stone, M. (2009). Learning to interpret utterances using dialogue history. Proceedings of the 12th Conference of the European Chapter of the ACL, 184-192.

Fitch, W. T., & Friederici, A. D. (2012). Artificial grammar learning meets formal language theory: An overview. Philosophical Transactions of the Royal Society B, 367, 1933-1955.

Fong, S. (2012). Unification and efficient computation in the Minimalist Program. To appear in F. Lowenthal & L. Lebrun (Eds.) Language and recursion. Berlin: Springer.

Griffiths, T. L., & Kalish, M. L. (2007). Language evolution by iterated learning with Bayesian agents. Cognitive Science, 31, 441-480.

Hammond, M. (1993). On the absence of category-changing prefixes in English. Linguistic Inquiry, 24 (3), 562-567.

Hausser, R. (1999). Chapter 8: Language hierarchies and complexity. In Foundations of computational linguistics (pp. 141-162). Berlin: Springer.

Hayes, B., & Wilson, C. (2008). A maximum entropy model of phonotactics and phonotactic learning. Linguistic Inquiry, 39 (3), 379-440.

Heinz, J. (2012). Computational theories of learning and developmental psycholinguistics. Under review for J. Lidz & J. Pater (Eds.) The Cambridge handbook of developmental linguistics. Cambridge, UK: Cambridge University Press.

Higginbotham, J. (1984). English is not a context-free language. Linguistic Inquiry, 15 (2), 225-234.

Idsardi, W. J. (2006). A simple proof that Optimality Theory is computationally intractable. Linguistic Inquiry, 37 (2), 271-275.

Jurafsky, D., & Martin, J. H. (2009). Chapter 2: Regular expressions and automata. In Speech and language processing, second edition (pp. 17-44). Upper Saddle River, NJ: Pearson.

Medler, D. A. (1998). A brief history of connectionism. Neural Computing Surveys, 1 (2), 18-73.

Meurers, W. D. (2004). On the use of electronic corpora for theoretical linguistics: Case studies from the syntax of German. Lingua, 115, 1619-1639.

Myers, J. (2012). Testing phonological grammars with lexical data. In J. Myers (Ed.) In search of grammar: Empirical methods in linguistics (pp. 141-176). Language and Linguistics Monograph Series 48. Taipei, Taiwan: Language and Linguistics.

Myers, J. (2013a). An overview of formal language theory. National Chung Cheng University ms.

Myers, J. (2013b). An overview of connectionism. National Chung Cheng University ms.

Myers, J. (2013c). An overview of statistically inspired learning models. National Chung Cheng University ms.

Osherson, D. N., Stob, M., & Weinstein, S. (1989). Learning theory and natural language. In R. J. Matthews & W. Demopoulos (Eds.) Learnability and linguistic theory (pp. 19-50). Dordrecht: Kluwer Academic Publishers.

Perfors, A., Tenenbaum, J. B., & Wonnacott, E. (2010). Variability, negative evidence, and the acquisition of verb argument constructions. Journal of Child Language, 37, 607-642.

Pinker, S., & Ullman, M. T. (2002). The past and future of the past tense. Trends in Cognitive Sciences, 6 (11), 456-463.

Potts, C., Pater, J., Jesney, K., Bhatt, R., & Becker, M. (2010). Harmonic Grammar with linear programming: From linear systems to linguistic typology. Phonology, 27 (1), 77-117.

Pullum, G. K. (1984). On two recent attempts to show that English is not a CFL. Computational Linguistics, 10 (3-4), 182-186.

Stabler, E. P. (2010). Computational perspectives on minimalism. In C. Boeckx (Ed.), The Oxford handbook of linguistic minimalism (pp. 616-641). Oxford: Oxford University Press.

 

Software (to be updated)

 

Programming languages

 

* Excel VBA: Put programs right into your Excel file

* JavaScript: Language used for MiniCorp

* Perl: Popular programming language for dealing with character strings

* Prolog: Popular language in computational linguistics (used, e.g., by Prof. Wu)

* Python: Popular, simple general-purpose programming language

* R: Programming for statistics, graphics, and string manipulation

* A pro-programming propaganda video

 

Formal language theory

 

* Automaton Simulator

* JFLAP: A free Java program for playing with automata and formal grammars (a toy DFA sketch in Python appears after this list)

* Visual Automata Simulator
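
To get a feel for what these simulators are doing, here is a minimal sketch of a deterministic finite automaton in Python. The states, alphabet, and transition table are invented for illustration; this particular machine recognizes the regular language of strings over {a, b} containing an even number of a's.

# A toy deterministic finite automaton (DFA), the kind of machine the
# simulators above let you build graphically. The states and transitions
# below are made up for illustration: the machine accepts strings over
# {a, b} containing an even number of a's.
TRANSITIONS = {
    ('even', 'a'): 'odd',
    ('even', 'b'): 'even',
    ('odd', 'a'): 'even',
    ('odd', 'b'): 'odd',
}
START, ACCEPTING = 'even', {'even'}

def accepts(string):
    """Run the DFA over the string; accept if it halts in an accepting state."""
    state = START
    for symbol in string:
        state = TRANSITIONS[(state, symbol)]
    return state in ACCEPTING

for s in ['', 'ab', 'aab', 'abab']:
    print(repr(s), accepts(s))

The regular languages are exactly the languages that such machines can recognize, which is why the readings on the Chomsky hierarchy keep returning to them.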

 

Applications of formal language theory

 

* Gsearch: Corpus analysis program described in Corley et al. (2001)

* MiniCorp: Includes a regular expression tool and loglinear modeling (a toy regex search sketch appears after this list)

* The Stanford Parser

* Edward Stabler's list of software for minimalist grammars
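
The core matching step behind corpus query tools like Gsearch and MiniCorp's regex tool can be sketched in a few lines of Python. The tiny "corpus" and the search pattern below are invented for illustration.

import re

# A three-line toy corpus; real tools read tagged or parsed corpora.
corpus = [
    'the cat sat on the mat',
    'a dog chased the cats',
    'cats and dogs disagree',
]

# Match 'cat' or 'cats' as a whole word.
pattern = re.compile(r'\bcats?\b')

for line in corpus:
    for match in pattern.finditer(line):
        print(match.group(), '<-', line)

Real corpus tools add tokenization, annotation, and structured queries on top of this basic matching step.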

 

Connectionism

 

* Nengo Neural Simulator

* Neuroph: Another neural net tool

* Praat: Includes a neural net tool

* R connectionist packages (install them within R):

* neuralnet (feedforward networks trained with backpropagation; a bare-bones backpropagation sketch in Python appears after this list)

* RSNNS (many network types, including recurrent networks)

* tlearn: A simple program that no longer seems to work
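
For a sense of what these tools implement, here is a bare-bones feedforward network trained with backpropagation in Python (using numpy). The task (XOR), the network size, the learning rate, and the number of training steps are all chosen just for illustration; this is a sketch, not a serious simulator.

# A minimal two-layer network trained with backpropagation on XOR.
import numpy as np

rng = np.random.default_rng(0)

# XOR is not linearly separable, so it needs a hidden layer.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

W1 = rng.normal(size=(2, 4))   # input -> hidden weights
b1 = np.zeros((1, 4))
W2 = rng.normal(size=(4, 1))   # hidden -> output weights
b2 = np.zeros((1, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the squared error.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out
    b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * X.T @ d_h
    b1 -= lr * d_h.sum(axis=0, keepdims=True)

print(np.round(out, 2))  # with luck, close to [[0], [1], [1], [0]]

XOR is the classic sanity check here: a network without a hidden layer cannot learn it, which is why even simple simulators support multilayer architectures.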

 

Other learning models

 

* Harmonic Grammar with Linear Programming: Described in Potts et al. (2010) (a toy harmony calculation appears after this list)

* Praat: Includes tools for learning in OT and Harmonic Grammar
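
The evaluation step in Harmonic Grammar is easy to sketch: each candidate's harmony is the negative weighted sum of its constraint violations, and the candidate with the highest harmony wins. The constraints, weights, and candidates below are invented for illustration; Potts et al. (2010) show how the weights themselves can be learned with linear programming.

# Harmonic Grammar evaluation in miniature. Constraint names, weights,
# and candidates are hypothetical.
weights = {'NoCoda': 2.0, 'Max': 1.5}

# Candidate outputs for a hypothetical input /pat/, with violation counts.
candidates = {
    'pat': {'NoCoda': 1, 'Max': 0},   # keeps the coda
    'pa':  {'NoCoda': 0, 'Max': 1},   # deletes the coda consonant
}

def harmony(violations):
    """Harmony = negative weighted sum of constraint violations."""
    return -sum(weights[c] * n for c, n in violations.items())

winner = max(candidates, key=lambda cand: harmony(candidates[cand]))
for cand, viols in candidates.items():
    print(cand, harmony(viols))
print('winner:', winner)

Under these made-up weights, 'pa' (harmony -1.5) beats 'pat' (harmony -2.0).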

 

Language evolution

 

* ALingua (a toy iterated-learning sketch appears below)
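
In the same spirit, iterated learning (as in Griffiths & Kalish 2007) is easy to sketch: each generation learns from the previous generation's productions and then produces data for the next. In the toy version below, the "grammar" is a single probability, and all the numbers are invented for illustration.

# Iterated learning with simple Bayesian learners, in miniature.
import random

def learn(data, prior_a=1, prior_b=1):
    """Posterior mean estimate of P(variant A) under a Beta prior."""
    return (sum(data) + prior_a) / (len(data) + prior_a + prior_b)

def produce(p, n=10):
    """Sample n utterances: 1 = variant A, 0 = variant B."""
    return [1 if random.random() < p else 0 for _ in range(n)]

random.seed(1)
p = 0.9  # the first teacher strongly prefers variant A
for generation in range(10):
    data = produce(p)
    p = learn(data)
    print(f'generation {generation + 1}: P(A) = {p:.2f}')

Griffiths & Kalish's analytical result is that such chains converge to the learners' prior; this toy version lets you watch the drift generation by generation.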