Mathematical Linguistics 數理語言學
Spring 2023
Thursday (四) 14:10-17:00, 文學院 413
Course code (課碼): 1306564
UPDATED 2023/04/27
Me
James Myers (麥傑)
Office: 文學院247
Tel: 31506
Email: Lngmyers at the university address (ccu...)
Office hours: Wednesday 10-12, or by appointment (made at least 24 hours ahead)
Goals
“Mathematical linguistics” means different things to different people. This version of the course has a particular focus on models of learning, as we compare two main approaches: rationalist models (proofs and theorems) like formal learning theory and empiricist models (just try whatever works) like neural network modeling. By the end of the semester, you should feel much more comfortable thinking in a formally precise way about grammar, corpus analysis, language acquisition, and psycholinguistics.
Grading
10% Class participation
30% Leading discussion
40% Exercises
20% Research presentation
What the class is like
Rather than passively listening to lectures from a textbook (which doesn’t exist for this version of the course anyway), we will read and discuss classic and recent papers together. I purposely chose readings with lots of math in them (of course!), so read them in a “top-down” way: focus on each paper’s main claims instead of getting stuck on tiny details, though you should still feel free to ask about anything in class. Since these are math papers, the main claims will involve technical concepts and formulas: don’t skip them! Hopefully the authors were nice enough to make the math figure-out-able just by carefully studying the paper itself, but if you think it’s still unclear, it might be their fault, not yours!
Class participation means that you discuss: you read, think, talk, and respond to others’ ideas. Don’t be afraid to ask for clarification - that’s also part of the discussion.
Every week somebody will lead the discussion on the week’s readings, using a handout as a guide. The discussion leader should NOT lecture us or search the internet for related information, but instead help us understand the reading and its real-world relevance by asking open-ended questions that inspire people to get involved and express what they think. Please be sure to post a PDF file of your questions to the E-Course “discussion” section by 12 noon on the day of class, so everybody has time to download (and maybe print) it before class.
In order to get hands-on experience with some of the technical methods that we will read about, there will be two take-home exercises (due on 4/13 and 5/18). Each exercise will be distributed two weeks before it is due.
On 5/11, about a month before the end of the semester, you will propose an original research project of your own, applying the mathematical models discussed in class. This may involve theoretical analyses, using an existing computer program, and/or writing your own new program. On the last day of class (6/8), you’ll give a presentation about your research findings, which I’ll grade for style, logic, and theory. (If you can’t attend that day, send me your presentation file and I’ll present for you.) There is NO TERM PAPER (yay).
WARNING #1: Plagiarism (pretending that other people’s words and ideas are your own) is a serious crime and will not be tolerated. Homework or other graded things containing plagiarism will receive a score of zero, and you will be reported to the department chair.
WARNING #2: Submit your homework and other graded things on time! Unless you have a really good excuse, you will lose 5 points for each day you are late. So don’t make yourself sick working overnight, but get your stuff done early enough.
Schedule
* marks a date when something is due
Date | Topic/Activity | Readings | Leader
2/16 | What should linguists know about math? | |
2/23 | Information theory basics | Shannon (1948) [§0-10]; Rioul (2018) [§1-14] | Myers
3/2 | Information theory and psycholinguistics | Hale (2016) | Sylvia
3/9 | Formal language theory | Fitch & Friederici (2012) | Sabrina
3/16 | Formal learning theory | Heinz (2016) | 又睿
3/23 | Bayesian learning models | Perfors et al. (2011) | JR
3/30 | Bayesian data analysis [Distribute Exercise 1] | Vasishth et al. (2018) | Myers
4/6 | NO CLASS [inter-school event 校際活動] | |
*4/13 | Neural network basics [Exercise 1 due] | Abdi (1994) | Sam
4/20 | Neural networks vs. real brains | Lillicrap et al. (2020); Schaeffer et al. (2022) | Elaine, Sabrina
4/27 | Neural network models of language in time | Elman (1990); Linzen et al. (2016) | Sylvia, JR
5/4 | Neural network models of written language [Distribute Exercise 2] | Lane et al. (2019); Hannagan et al. (2021) | JR, Sylvia
*5/11 | Discuss your research progress | |
*5/18 | Maximum entropy models [Exercise 2 due] | Hayes (2022) | Elaine
5/25 | Linear discriminative learning | Baayen et al. (2018) | Sabrina
6/1 | Modeling language evolution | Kirby & Tamariz (2022); Lazaridou & Baroni (2020) | 又睿
*6/8 | Presentations [last class] | |
Readings
Abdi, H. (1994). A neural network primer. Journal of Biological Systems, 2(3), 247-281. [Sections 0-4]
Baayen, R. H., Chuang, Y. Y., & Blevins, J. P. (2018). Inflectional morphology with linear mappings. The Mental Lexicon, 13(2), 230-268.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14(2), 179-211.
Fitch, W. T., & Friederici, A. D. (2012). Artificial grammar learning meets formal language theory: an overview. Philosophical Transactions of the Royal Society B: Biological Sciences, 367(1598), 1933-1955.
Hale, J. (2016). Information‐theoretical complexity metrics. Language and Linguistics Compass, 10(9), 397-412.
Hannagan, T., Agrawal, A., Cohen, L., & Dehaene, S. (2021). Emergence of a compositional neural code for written words: Recycling of a convolutional neural network for reading. Proceedings of the National Academy of Sciences, 118(46), e2104779118.
Hayes, B. (2022). Deriving the wug-shaped curve: A criterion for assessing formal theories of linguistic variation. Annual Review of Linguistics, 8, 473-494.
Heinz, J. (2016). Computational theories of learning and developmental psycholinguistics. In J. Lidz, W. Snyder, & J. Pater (Eds.), The Cambridge handbook of developmental linguistics (pp. 633-663). Cambridge, UK: Cambridge University Press.
Kirby, S., & Tamariz, M. (2022). Cumulative cultural evolution, population structure and the origin of combinatoriality in human language. Philosophical Transactions of the Royal Society B, 377(1843), 20200319.
Lane, H., Howard, C., & Hapke, H. (2019). Chapter 7: Getting words in order with convolutional neural networks (CNNs). In Natural language processing in action: Understanding, analyzing, and generating text with Python (pp. 218-246). Shelter Island, NY: Manning.
Lazaridou, A., & Baroni, M. (2020). Emergent multi-agent communication in the deep learning era. arXiv preprint arXiv:2006.02419.
Lillicrap, T. P., Santoro, A., Marris, L., Akerman, C. J., & Hinton, G. (2020). Backpropagation and the brain. Nature Reviews Neuroscience, 21(6), 335-346.
Linzen, T., Dupoux, E., & Goldberg, Y. (2016). Assessing the ability of LSTMs to learn syntax-sensitive dependencies. Transactions of the Association for Computational Linguistics, 4, 521-535.
Mollica, F., & Piantadosi, S. T. (2019). Humans store about 1.5 megabytes of information during language acquisition. Royal Society Open Science, 6(3), 181393.
Perfors, A., Tenenbaum, J. B., Griffiths, T. L., & Xu, F. (2011). A tutorial introduction to Bayesian models of cognitive development. Cognition, 120(3), 302-321.
Rioul, O. (2018). This is it: A primer on Shannon’s entropy and information. In B. Duplantier & V. Rivasseau (Eds.), Information Theory (pp. 49-86). Birkhäuser. [Sections 1-14]
Schaeffer, R., Khona, M., & Fiete, I. (2022). No free lunch from deep learning in neuroscience: A case study through models of the entorhinal-hippocampal circuit. 2nd AI4Science Workshop at the 39th International Conference on Machine Learning.
Shannon, C. E. (1948). A mathematical theory of communication. The Bell System Technical Journal, 27(3), 379-423. [Sections 0-10]
Vasishth, S., Nicenboim, B., Beckman, M. E., Li, F., & Kong, E. J. (2018). Bayesian data analysis in the phonetic sciences: A tutorial introduction. Journal of Phonetics, 71, 147-161.
Interesting links
Programming languages
* Excel: easy to use for many types of calculations <1000s of websites online>
* R: powerful statistics programming language <https://cran.r-project.org/>
* Python: most widely used general programming language <https://www.python.org/>
Information theory
* Shannon entropy calculator <https://www.shannonentropy.netmark.pl/>
* Maxent Grammar Tool <https://linguistics.ucla.edu/people/hayes/MaxentGrammarTool/>
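If you want to see what a Shannon entropy calculator is doing under the hood, here is a minimal Python sketch. It estimates probabilities from raw symbol frequencies in a string, which is just one common convention; the function name is mine, not taken from any of the tools above.

```python
import math
from collections import Counter

def shannon_entropy(text):
    """Entropy in bits per symbol, estimated from symbol frequencies."""
    counts = Counter(text)
    total = len(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

# Four equally frequent symbols give the maximum of 2 bits per symbol.
print(shannon_entropy("abcd"))        # 2.0
# Two equally frequent symbols give 1 bit per symbol.
print(shannon_entropy("abababab"))    # 1.0
```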
Formal language theory
* Automaton Simulator <https://automatonsimulator.com/>
* JFLAP (for downloading) <https://www.jflap.org/>
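To get a feel for what these automaton simulators are modeling, here is a minimal Python sketch of a deterministic finite-state acceptor for the regular language (ab)^n. The artificial grammar learning literature contrasts patterns like this with a^n b^n, which no finite-state device can recognize; the function name and state encoding below are just illustrative.

```python
def accepts_ab_n(string):
    """Deterministic finite-state acceptor for (ab)^n over the alphabet {a, b}."""
    state = 0                      # 0 = expecting 'a', 1 = expecting 'b'
    for symbol in string:
        if state == 0 and symbol == "a":
            state = 1
        elif state == 1 and symbol == "b":
            state = 0
        else:
            return False           # no legal transition: reject
    return state == 0              # accept only if we end where we started

print(accepts_ab_n("ababab"))   # True
print(accepts_ab_n("aabb"))     # False: not of the form (ab)^n
```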
Bayesian models
* Simple Bayes calculator <https://psych.fullerton.edu/mbirnbaum/bayes/bayescalc.htm>
* Stan <https://mc-stan.org/>
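The simple Bayes calculator above just applies Bayes’ rule. Here is a minimal Python sketch for a single binary hypothesis; the numbers in the example are made up purely for illustration.

```python
def posterior(prior, likelihood_h, likelihood_not_h):
    """Bayes' rule for a binary hypothesis H given evidence E:
    P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    evidence = likelihood_h * prior + likelihood_not_h * (1 - prior)
    return likelihood_h * prior / evidence

# Toy example: a rule is 30% likely a priori; the observed form is 90% likely
# if the rule holds and 20% likely otherwise.
print(posterior(prior=0.3, likelihood_h=0.9, likelihood_not_h=0.2))  # ~0.66
```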
Neural networks
* Neural networks videos (among many many others)
- Overview: <https://www.youtube.com/watch?v=pdNYw6qwuNc>
- LSTM vs. transformers: <https://www.youtube.com/watch?v=S27pHKBEp30>
- Non-video but picture-based explanation of the same things:
<https://colah.github.io/posts/2015-08-Understanding-LSTMs/>
<https://jalammar.github.io/illustrated-transformer/>
- Convolution: <https://www.youtube.com/watch?v=-QQML5kf26Q>
* Neural network simulator <https://www.mladdict.com/neural-network-simulator>
* Online demo of a convolutional network learning to read <https://cs.stanford.edu/people/karpathy/convnetjs/demo/mnist.html>
* TensorFlow <https://www.tensorflow.org/>
- Online interface for playing <https://playground.tensorflow.org/>
- Keras: user-friendly interface for programming <https://keras.io/>
- R interface for programming <https://tensorflow.rstudio.com/>
* Software for linear discriminative learning
<https://sfs.uni-tuebingen.de/~hbaayen/software.html>
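Before playing with the simulators above, it may help to see that the core training loop (forward pass, backpropagation of the error, weight update) fits in a few lines. Here is a minimal NumPy sketch of a two-layer network learning XOR; it is only a toy illustration, not the implementation behind any of the tools linked here.

```python
import numpy as np

rng = np.random.default_rng(0)

# XOR: a classic pattern that a single-layer network cannot learn.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# One hidden layer of 8 sigmoid units, trained by plain gradient descent.
W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)
lr = 1.0

for step in range(10000):
    # Forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: gradients of the squared error via the chain rule
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Weight updates
    W2 -= lr * h.T @ d_out; b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;   b1 -= lr * d_h.sum(axis=0)

# Typically close to [[0], [1], [1], [0]]; rerun with another seed if stuck.
print(np.round(out, 2))
```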
Language evolution
* Language Evolution Simulation (simulates word coinage) <https://rmeertens.github.io/language-evolution-simulation/>
* Onset (simulates historical sound change) <https://onset.cadel.me/>
* Color Game: mobile app that was used to study human creation of new languages <https://colorgame.net/>