Methods of Linguistic Data Collection
語料收集方法
Fall 2016, Thursday 14:10-17:00, College of Humanities, Room 413

Course code: 1309004

 

UPDATED 2016/12/26

 

Research and writing resources

 

James Myers (麥傑)
Office: College of Humanities, Room 247
Tel: 31506
Email: Lngmyers at the university address
Web: http://www.ccunix.ccu.edu.tw/~lngmyers/
Office hours: Tuesday 2 pm - 4 pm, or by appointment (made at least one day ahead)

 

Goals

 

Linguists study many kinds of data, and they collect them with many kinds of methods. This class will survey the main types, from fieldwork through corpus analysis to experimentation, in the hope that you'll be inspired to try something new. Along the way we'll also discuss research ethics, linguistic argumentation, and statistics. The last few weeks of the class are all about your own research and the new methods that you especially want to learn about.

 

Evaluation

 

30% Participation (saying interesting stuff in the discussions)

20% Discussion leading (one to two leaders per week)

20% Exercises (practice using various methods)

20% Presentation of your own research (like a conference talk: 15 minutes for presentation + 12 minutes for discussion)

10% Journal submission (editor's receipt of manuscript; may differ from in-class presentation)

 

Readings

 

Antaki, C., Billig, M. G., Edwards, D., & Potter, J. A. (2003). Discourse analysis means doing analysis: A critique of six analytic shortcomings. Discourse Analysis Online, 1(1). http://extra.shu.ac.uk/daol/articles/v1/n1/a1/antaki2002002-paper.html.

Biemann, C., Bildhauer, F., Evert, S., Goldhahn, D., Quasthoff, U., Schäfer, R., Simon, J., Swiezinski, L., & Zesch, T. (2013). Scalable construction of high-quality web corpora. Journal for Language Technology and Computational Linguistics, 28(2), 23-59.

Biq, Y.-O. (1988). From focus in proposition to focus in speech situation: Cai and jiu in Mandarin Chinese. Journal of Chinese Linguistics, 16(1), 72-107.

Bowern, C. (2010). Fieldwork and the IRB: A snapshot. Language, 86(4), 897-905.

Corbin, J., & Strauss, A. (1990). Grounded theory research: Procedures, canons, and evaluative criteria. Qualitative Sociology, 13(1), 3-21.

Derwing, B. L., de Almeida, R. G., & Hall, A. (2009). Non-chronometric experiments in linguistics. In D. Eddington (Ed.), Experimental and quantitative linguistics. Munich: Lincom.

DiPersio, D. (2014). Linguistic fieldwork and IRB human subjects protocols. Language and Linguistics Compass, 8(11), 505-511.

Fischer, S. (2009). Sign language field methods: Approaches, techniques, and concerns. In J. H-Y. Tai & J. Tsay (Eds.) Taiwan Sign Language and beyond (pp. 1-19). The Taiwan Institute for the Humanities, National Chung Cheng University.

Fisher, M., Goddu, M. K., & Keil, F. C. (2015). Searching for explanations: How the Internet inflates estimates of internal knowledge. Journal of Experimental Psychology: General, 144(3), 674-687.

Fisher, R. (1926). The arrangement of field experiments. Journal of the Ministry of Agriculture of Great Britain, 33, 503-513.

Granger, S. (2002). A bird's-eye view of computer learner corpus research. In S. Granger, J. Hung, & S. Petch-Tyson (Eds.) Computer learner corpora, second language acquisition and foreign language teaching (pp. 3-33). Amsterdam, Netherlands: John Benjamins.

Gries, S. Th. (2005). Syntactic priming: A corpus-based approach. Journal of Psycholinguistic Research, 34(4), 365-399.

Harwood, N. (2009). An interview-based study of the functions of citations in academic writing across two disciplines. Journal of Pragmatics, 41, 497-518.

Hasko, V. (2012). Qualitative corpus analysis. In C. A. Chapelle (Ed.) The encyclopedia of applied linguistics. Wiley.

Herring, S. C., & Paolillo, J. C. (2006). Gender and genre variation in weblogs. Journal of Sociolinguistics, 10(4), 439-459.

Johnston, T. (2010). From archive to corpus: transcription and annotation in the creation of signed language corpora. International Journal of Corpus Linguistics, 15(1), 106-131.

Ke, C.-L., & Cheng, S.-T. (2014). The acoustic characterization of Taiwanese tones: F0 profiles and time-normalization. Journal of Taiwanese Vernacular, 6(2), 86-105.

Kertész, A., & Rákosi, C. (2009). Cyclic vs. circular argumentation in the Conceptual Metaphor Theory. Cognitive Linguistics, 20(4), 703-732.

Keuleers, E., & Balota, D. A. (2015). Megastudies, crowdsourcing, and large datasets in psycholinguistics: An overview of recent developments. The Quarterly Journal of Experimental Psychology, 68(8), 1457-1468.

Myers, J. (2009). The design and analysis of small-scale syntactic judgment experiments. Lingua, 119, 425-444.

Myers, J. (2016a). Data analysis software: Excel and R. Chapter 2 in Yet another statistics for linguists book. National Chung Cheng University ms.

Myers, J. (2016b). Comparing two continuous variables: t tests and beyond. Chapter 6 in Yet another statistics for linguists book. National Chung Cheng University ms.

Myers, J. (2016c). Comparing category sizes: Chi-squared and related tests. Chapter 7 in Yet another statistics for linguists book. National Chung Cheng University ms.

Myers, J., & Tsay, J. (2015). Trochaic feet in spontaneous spoken Southern Min. In H. Tao, Y.-H. Lee, D. Su, K. Tsurumi, W. Wang, & Y. Yang (Eds.), Proceedings of the 27th North American Conference on Chinese Linguistics, Vol. 2, 368-387. Los Angeles: UCLA.

Neidle, C., Kegl, J., MacLaughlin, D., Bahan, B., & Lee, R. G. (2001). Methodological considerations. Chapter 2 of The syntax of American Sign Language: Functional categories and hierarchical structure (pp. 7-25). Cambridge, MA: MIT Press.

Peterson, G. E., & Barney, H. L. (1952). Control methods used in a study of the vowels. The Journal of the Acoustical Society of America, 24(2), 175-184.

Piper, H. B. (1957). Omnilingual. Astounding Science Fiction, 58(6), 8-46.

Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59-63.

Rau, D. V., Yang, M.-C., Chang, A. H.-H., & Dong, M.-N. (2009). Online dictionary and ontology building for Austronesian languages in Taiwan. Journal of Language Documentation and Conservation, 3(2), 192-212.

Ruan, J.-C., Hsu, C.-W., Myers, J., & Tsay, J. S. (2012). Development and testing of transcription software for a Southern Min spoken corpus. International Journal of Computational Linguistics and Chinese Language Processing, 17(1), 1-26.

Singleton, J. L., Martin, A. J., & Morgan, G. (2015). Ethics, Deaf-friendly research, and good practice when studying sign languages. In E. Orfanidou, B. Woll, & G. Morgan (Eds.), Research methods in sign language studies: A practical guide (pp. 7-20). John Wiley & Sons.

Sternberg, S. (1967). Two operations in character recognition: Some evidence from reaction-time measurements. Perception & Psychophysics, 2(2), 45-53.

Supalla, T. (2006). Sign language archeology: Integrating historical linguistics with fieldwork on young sign languages. In R. M. de Quadros (Ed.) Proceedings of The 9th Theoretical Issues in Sign Language Research Conference (pp. 575-583). Petrópolis, Brazil: Editora Arara Azul.

Taft, M., Zhu, X.-P., & Peng, D.-L. (1999). Positional specificity of radicals in Chinese character recognition. Journal of Memory and Language, 40, 498-519.

Warner, N. (2014). Sharing of data as it relates to human subjects issues and data management plans. Language and Linguistics Compass, 8(11), 512-518.

 

Schedule [there may be changes along the way]

 

Readings and exercises must be done prior to class. Readings and exercises appear in the schedule on the day they are due. Discussion leaders should actively guide the discussion, using a handout of questions that encourage us to explore the ideas in the readings. Don't lecture us: The more you encourage other people to express their thoughts, the better!

 

*Marks deadlines relating to your own research

Week | Topic | Reading | Leaders | Exercise
9/15 | NO CLASS [Mid-Autumn Festival] | | |
9/22 | Why worry about methods? | Piper (1957) | Myers |
9/29 | Linguistic argumentation | Fisher et al. (2015); Kertész & Rákosi (2009) | Myers |
10/6 | Ethics in data collection | Academia Sinica (2016); Bowern (2010); DiPersio (2014); Singleton et al. (2015); Warner (2014) | 蔡佩雯; 吳珮文; 沈建名; 王學為; 吳珮文 |
10/13 | NO CLASS (James in HK) [1] | | |
10/20 | Fieldwork | Neidle et al. (2001); Fischer (2009); Rau et al. (2009) | 王學為; 王學為; 沈建名 |
10/27 | Creating corpora | Biemann et al. (2013); Johnston (2010); Ruan et al. (2012) | 阮維東 |
11/3 | Qualitative corpus analysis | Antaki et al. (2003); Hasko (2012); Granger (2002) | 沈建名; 阮維東; 蔡佩雯 | Exercise 1: Fieldwork
11/10 | Statistics | Myers (2016a); Myers (2016b); Myers (2016c) | 王涵德 |
11/17 | Quantitative corpus analysis | Gries (2005); Herring & Paolillo (2006); Myers & Tsay (2015) | 蔡佩雯; 沈建名; 王涵德 |
11/24 | Experimental design | Fisher (1926); Keuleers & Balota (2015); Myers (2009) | 王學為; 吳珮文; 吳珮文 | Exercise 2: Statistics
12/1 | Experimental procedures; *Last day to distribute your selected reading to the class | Derwing et al. (2009); Peterson & Barney (1952); Poldrack (2006); Sternberg (1967) | 蔡佩雯; 沈建名; 蔡佩雯; 吳珮文 |
12/8 | Your choice | Harwood (2009); Taft et al. (1999) | 阮維東; 蔡佩雯 | Exercise 3: Experiments
12/15 | *NO CLASS (James in Vietnam) [2] | | |
12/22 | Your choice | Corbin & Strauss (1990); Supalla (2006) | 吳珮文; 王學為 |
12/29 | Your choice | Biq (1988); Ke & Cheng (2014) | 王涵德; 沈建名 |
1/5 | *Presentations [last class] | | |
1/12 | *Journal receipt due | | |

[1] To make up this class, you must visit me in my office to discuss the class, your own research, and/or your career plans, any time this semester.

[2] To make up this class, you have to do the following between 12/9 and 1/3: (1) Email everybody a 100-word English abstract of your own research; (2) email brief English comments or questions about the abstracts to everybody (so we can all see them). I will also email comments on all the abstracts, but this activity is ungraded.