Advanced Statistics (高級統計學)
Fall 2008, Tuesday 9:10-12:00, College of Humanities (文學院), Room 413
Discussion leader:
James Myers (麥傑)
Office: College of Humanities (文學院), Room 247
Tel: x31506
Email: Lngmyers at ccu dot edu dot tw
Office hours: Monday 3-5 pm, or by appointment
Goals:
In this class we'll explore the new statistical methods (and the new philosophy behind them) that are quickly becoming standard in quantitative linguistics, through lots of hands-on work with R, the widely used free statistics package. All of the data we'll analyze are real, including your own.
Textbook:
Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press. [B08]
Grading:
40% Class participation
30% Minipaper 1 (due 11/18)
30% Minipaper 2 (due 1/13)
This isn't a lecture class, so your participation is crucial. Do each week's reading before class, trying out the examples in the text. Take notes on any problems you run into during the reading (e.g., new concepts or R code that crashes) and we'll discuss them in class. We'll try out the crucial in-text examples together, supplemented occasionally with other data or R code. Then we'll try out the workbook exercises. The answers are in the back of the book, but I'll be eager to see how clearly you can demonstrate and explain them!
The minipapers are short (five pages or so) technical reports describing statistical analyses of your own data (old or new), with just enough background and interpretation to make the linguistic relevance clear. The focus is on demonstrating what you've learned in the preceding half semester.
Schedule
"*" marks when minipapers are due
Week | Topics | Readings in B08
9/16 | Welcome! |
9/23 | A (re)introduction to R | 1
9/30 | Graphical data exploration | 2
10/7 | Probability distributions | 3
10/14 | One-sample tests | 4.1-4.2
10/21 | Two-sample tests | 4.3
10/28 | ANOVA and count data | 4.4-4.6
11/4 | Clustering | 5.1
11/11 | Classification | 5.2
*11/18 | Minipaper 1 due |
11/25 | Introduction to regression | 6.1-6.2.2
12/2 | Evaluating models and generalized linear models | 6.2.3-6.3.2
12/9 | Regression with breakpoints and corpus analysis | 6.4-6.6
12/16 | Introduction to mixed-effects modeling | 7.1-7.2.1
12/23 | More about (nonlinear) mixed-effects modeling | 7.2.2-7.5.1
12/30 | Applications of mixed-effects modeling | 7.5.2-7.5.4
1/6 | General discussion [last class] |
*1/13 | Minipaper 2 due [my mailbox, by 5 pm] |
Other resources on R
Chambers, J. M. (2008). Software for data analysis: Programming with R. Springer.
Crawley, M. J. (2005). Statistics: An introduction using R. Wiley.
Crawley, M. J. (2007). The R book. Wiley.
Dalgaard, P. (2002). Introductory statistics with R. Springer.
Everitt, B. S., & Hothorn, T. (2006). A handbook of statistical analyses using R. Chapman & Hall/CRC.
Johnson, K. (2008). Quantitative methods in linguistics. Blackwell.
Maindonald, J., & Braun, J. (2006). Data analysis and graphics using R: An example-based approach (2nd ed.). Cambridge University Press.
Vasishth, S., & Broe, M. (2008). The foundations of statistics: A simulation-based approach. University of Potsdam & Ohio State University ms. http://www.ling.uni-potsdam.de/~vasishth/SFLS.html
Verzani, J. (2004). Using R for introductory statistics. Chapman & Hall/CRC.
R project: http://www.r-project.org/
R wiki: http://wiki.r-project.org/
R-lang mailing list: http://pidgin.ucsd.edu/mailman/listinfo/r-lang
R Commander: http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/
R code writer (barely started!): http://www.ccunix.ccu.edu.tw/~lngproc/RCodeWriter.htm