Advanced Statistics (高級統計學)

Fall 2008             Tuesday 9:10-12:00          College of Humanities, Room 413

Discussion leader:

 

James Myers (麥傑)

Office: College of Humanities, Room 247

Tel: x31506

Email: Lngmyers at ccu dot edu dot tw

Office hours: Monday 3-5 pm, or by appointment

 

Goals:

 

In this class we'll explore the new statistical methods (and the new philosophy behind them) that are quickly becoming standard in quantitative linguistics, through lots of hands-on work with R, the widely used free statistics package. All of the data we'll analyze are real, including your own.

 

Textbook:

 

Baayen, R. H. (2008). Analyzing linguistic data: A practical introduction to statistics using R. Cambridge University Press. [B08]

 

Grading:

 

40%       Class participation

30%       Minipaper 1 (due 11/18)

30%       Minipaper 2 (due 1/13)

 

       This isn't a lecture class, so your participation is crucial. Do each week's reading before class and try out the examples in the text. Take notes on any problems you run into during the reading (e.g., new concepts or R code that crashes) and we'll discuss them in class. We'll work through the crucial in-text examples together, occasionally supplemented with other data or R code. Then we'll do the workbook exercises. The answers are in the back of the book, but I'll be eager to see how clearly you can demonstrate and explain them!

       The minipapers are short (five pages or so) technical reports describing statistical analyses of your own data (old or new), with just enough background and interpretation to make the linguistic relevance clear. The focus is on demonstrating what you've learned in the preceding half semester.


Schedule

"*" marks when minipapers are due

 

Week     Topics                                            Readings in B08

9/16     Welcome!
9/23     A (re)introduction to R                           1
9/30     Graphical data exploration                        2
10/7     Probability distributions                         3
10/14    One-sample tests                                  4.1-4.2
10/21    Two-sample tests                                  4.3
10/28    ANOVA and count data                              4.4-4.6
11/4     Clustering                                        5.1
11/11    Classification                                    5.2
*11/18   Minipaper 1 due
11/25    Introduction to regression                        6.1-6.2.2
12/2     Evaluating models and generalized linear models   6.2.3-6.3.2
12/9     Regression with breakpoints and corpus analysis   6.4-6.6
12/16    Introduction to mixed-effects modeling            7.1-7.2.1
12/23    More about (nonlinear) mixed-effects modeling     7.2.2-7.5.1
12/30    Applications of mixed-effects modeling            7.5.2-7.5.4
1/6      General discussion [last class]
*1/13    Minipaper 2 due [my mailbox, by 5 pm]

Other resources on R

 

Chambers, J. M. (2008). Software for data analysis: Programming with R. Springer.

Crawley, M. J. (2005). Statistics: An introduction using R. Wiley.

Crawley, M. J. (2007). The R book. Wiley.

Dalgaard, P. (2002). Introductory statistics with R. Springer.

Everitt, B. S., & Hothorn, T. (2006). A handbook of statistical analyses using R. Chapman & Hall/CRC.

Johnson, K. (2008). Quantitative methods in linguistics. Blackwell.

Maindonald, J., & Braun, J. (2006). Data analysis and graphics using R: An example-based approach (2nd ed.). Cambridge University Press.

Vasishth, S., & Broe, M. (2008). The foundations of statistics: A simulation-based approach. University of Potsdam & Ohio State University ms. http://www.ling.uni-potsdam.de/~vasishth/SFLS.html

Verzani, J. (2004). Using R for introductory statistics. Chapman & Hall/CRC.

 

R project: http://www.r-project.org/

R wiki: http://wiki.r-project.org/

R-lang mailing list: http://pidgin.ucsd.edu/mailman/listinfo/r-lang

R Commander: http://socserv.mcmaster.ca/jfox/Misc/Rcmdr/

R code writer (barely started!): http://www.ccunix.ccu.edu.tw/~lngproc/RCodeWriter.htm