Psycholinguistics
James Myers
April 9, 2004

The mental lexicon

OVERVIEW:
  1. Lexical access
  2. Morphological structure
  3. Models of lexical semantics

=============================================================

1. Lexical access

Psychologists study processes; thus the study of the lexicon is the study of
how people access (reach) it.

1.1 There are two kinds of access (perception/comprehension vs. production),
so maybe there are two lexicons...?

For example, Caplan (1993:120-1) points out that:

  There are brain-damaged patients who are normal in comprehension tasks and
  normal in picture-naming (production) tasks -- BUT -- they make SEMANTIC
  errors in repetition (which involves both perception and production):

    "Eisenhower" (US president in 1950's) --> "Khrushchev" (USSR leader in 1950's)

  This suggests they have a normal comprehension lexicon and a normal
  production lexicon, but the link between them is damaged. This in turn
  implies that there are two lexicons! (... or does it...?)

1.2 The effect of modality: reading versus listening

The lexicons for reading and listening are probably very different as well.
For example, when reading a single Chinese character, readers probably access
it all at once, but when listening to a spoken syllable, listeners may instead
access it in pieces, over the time they hear the syllable.

1.2.1 For example, Gaygen & Luce (1998) found that subjects were sensitive to
whether a word was more familiar from reading or from listening.

1.2.2 A specific theory about accessing spoken words: the COHORT model
(e.g. Marslen-Wilson 1987): as you hear a spoken word from beginning to end,
you activate all words in memory that are consistent with what you've heard so
far. This "cohort" of competing words shrinks until only the correct one is
left -- which may occur before you have heard the word all the way to the end.

For example, if you are listening to the word "elephant", the cohorts are:

  HEAR:      el...          ...e...        ...ph...

  COHORT:    elk,           elevator,      elephant!
             Elvis,         elephant
             elevator,
             elephant...

But the evidence for this mainly comes from off-line GATING experiments, where
subjects are played pieces of words, e.g. "el", then "ele", etc., and they are
asked to guess what the word is. It is not at all surprising that they end up
naming words in "cohorts", but does this really describe what happens in
on-line word access?

1.3 Now let's discuss two important, well-established and experimentally
useful effects regarding lexical access, and then a third point illustrating
these two:

  Frequency effects
  Priming effects
  Is word access modular?

1.3.1 Frequency effects

The lexical frequency of a word is how often it is encountered (heard, read,
written, and/or spoken).

1.3.1.1 Frequency in Chinese is particularly tricky:

What counts as a "word" in Chinese? Which of these examples are single words,
and which are two words combined in a phrase?

  (1) 開水     (6) 桌子
  (2) 開刀     (7) 王子
  (3) 開關     (8) 原子
  (4) 開始     (9) 老人
  (5) 開門    (10) 老闆

Chinese also shows very large differences between frequencies for written and
spoken language (a problem, since most frequency counts are calculated from
written language). For example, 和 is the usual word for "and" in speech,
while 與 is a very common word for "and" in writing.

1.3.1.2 In spite of such problems, people have pretty reliable intuitions
about the frequency of words. Guess which of each pair of words below is more
common, and you'll probably be right (you can check your answers on the web at
the Academia Sinica corpus site: http://www.sinica.edu.tw/SinicaCorpus/).

  聽？    廳？
  聽到？  看到？
  月？    年？

1.3.1.3 Frequency effects are similar to other perceptual judgments: they
follow a logarithmic function (Rubenstein & Pollack, 1963). At low levels,
small differences in loudness, brightness, etc. are easily perceived; at high
levels, small differences are harder to perceive. Likewise, small differences
in frequency are easier to perceive for low-frequency words than for
high-frequency words.
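The logarithmic relationship in 1.3.1.3 can be sketched in Python. This is only a minimal illustration under stated assumptions: the counts per million are invented, and the helper name log_gap is hypothetical. It compares two pairs of words whose raw frequencies differ by the same amount (10 occurrences per million), one pair rare and one pair common.

```python
import math


def log_gap(freq_a: float, freq_b: float) -> float:
    """Distance between two word frequencies on a log10 scale."""
    return abs(math.log10(freq_b) - math.log10(freq_a))


# Hypothetical counts per million words (invented for illustration only).
low_gap = log_gap(2, 12)          # two rare words, raw difference = 10
high_gap = log_gap(1000, 1010)    # two common words, raw difference = 10

print(f"low-frequency pair:  {low_gap:.3f}")
print(f"high-frequency pair: {high_gap:.3f}")
```

On a log scale the rare pair ends up far apart (about 0.78) while the common pair is nearly indistinguishable (about 0.004), which is the pattern described above for low- versus high-frequency words.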
1.3.1.4 Frequency effects in the NAMING task:

  Subjects see a written word or a picture and must "name" it as quickly as
  possible.

  Key measurement: reaction time (RT), also called response time or response
  latency.

  RT's can be affected by other factors, of course, e.g. the length of the
  word (high frequency "banana" vs. low frequency "id").

1.3.1.5 Frequency effects in the LEXICAL DECISION task:

  Subjects are given a list containing both real words and fake words
  (nonwords, pseudowords, nonce forms), presented either visually or
  auditorily; subjects must decide as quickly as possible whether a given item
  is a real word or not.

  Subjects are faster to decide if a visually presented string of characters
  is a real word or not if the real words are higher frequency (assuming that
  length, etc., is controlled).

  [Note: some have argued (e.g., Hung, Tzeng, & Ho, 1999) that the fuzzy
  status of the Chinese "word" makes lexical decision tasks questionable for
  Chinese.]

  English example: Try the "experiment" described in Carroll, p. 120. (It
  might not work if you don't read English a lot!)

  Chinese example: Time yourself for each list below. Do you find a frequency
  effect? (from data used in Ahrens 1998)

  Higher frequency words (mixed with fake words):

    Ū¥õ  ¬Q¤Ñ  ¤¤¤ß  ±M·~  ¹D¯S  «p´ú  ¥Ë´µ
    ¨ý¼M  ¯Þ²~  µ¹ÁÚ  §ä¨ì  °ª¿³  ÆF¸T  ³ø®×

  Lower frequency words (mixed with fake words):

    »¡²z  ¬ü­Ì  Án±a  ¤åªZ  ®È¯u  Â÷¥@  ¦Û¦h
    °Ê²Î  »´ÆF  ÂåªÌ  ¹ê¥R  ²§¦P  ¥Ö½L  »G©É

  Notice, by the way, how hard it is to make "fake words" in Chinese! Many of
  the nonwords are in fact impossible, e.g. 美們 violates the morphology of
  Chinese. Moreover, Chinese subjects often aren't sure what you mean by
  "word" anyway. It's easier to tell them to make a meaningful/nonmeaningful
  judgment.

1.3.1.6 Frequency is an arbitrary, unpredictable fact about a word, and so
must be "stored" somehow in the lexicon. Frequency can therefore be used as a
diagnostic of lexicality: if X shows frequency effects, then X must be stored
in the lexicon.
Example: Sereno and Jongman (1997)

  Question: Do English speakers generate regular inflection by rule, or is
  regular inflection included in the lexicon? For example, is the word "dogs"
  in the lexicon, or only "dog", so that people have to use a rule to add the
  /s/ when speaking or writing, or subtract the /s/ when listening or reading?
  (Connectionists say "no rules"; others, like Steven Pinker, say "yes
  rules".)

  Task: Visual lexical decision task.

  Materials: Two kinds of nouns were chosen: nouns whose singular form is more
  frequent than the plural form (e.g. river-rivers); and nouns whose plural is
  more frequent than the singular (e.g. window-windows).

  Procedure: Half of the subjects got a list of singular nouns (plus an equal
  number of fake words). The other half got a list of plural nouns (plus fake
  words).

  Prediction: If regular plurals are stored in the lexicon, then there should
  be a frequency effect. Specifically, subjects in the SINGULAR condition
  should be faster with "river" than with "window", while subjects in the
  PLURAL condition should be faster with "windows" than with "rivers".

  Results: The prediction was supported. This implies that English users do
  NOT use a rule to add or subtract plural /s/.

1.3.1.7 Models of lexical access and frequency effects:

  One obvious but silly model (Forster 1976): words are "listed" in memory
  from highest frequency to lowest, so it takes longer to search for lower
  frequency words.

  ACTIVATION models (includes the old "logogen" model and current
  connectionist models): each time you encounter a word, the "activation" of
  its lexical entry is increased, and this activation helps you access it
  later; thus frequent words are easier to access.

Note that both models imply that frequency is an inherent part of storing a
word. Thus they predict that no task will show a lexical effect (e.g., real
words are faster to respond to than fake words) without also showing a
frequency effect. But this isn't true.
Naming tasks using English often find that naming real words is faster than
naming fake words, but there is still no frequency effect (see e.g. Paap et
al. 1987). However, this doesn't seem to be true in Chinese, which implies
that the English effect is due to subjects using pronunciation rules which can
be used independently of the lexicon.

1.3.2 Priming effects

PRIMING: the presentation of one stimulus (the PRIME) affects the speed
(usually speeding up) of the response to another stimulus (the TARGET).

"X primes Y" means "X is a prime for Y" or "the presentation of X speeds up
(or slows down) the response to Y relative to a control".

1.3.2.1 Words prime themselves: subjects are faster to name or make a lexical
decision about a word if they've been presented with that same word recently.

Words are primed by semantically related words: subjects are faster to make a
lexical decision about "BUTTER" if it's preceded by "bread" than if it's
preceded by "nurse."

1.3.2.2 How to model priming: By far the best models use a connectionist or
spreading-activation approach (e.g. Carroll, p. 115 or p. 116):

  This model claims that the lexicon consists of a network of related words.
  If a word is "activated", its activation may spread to words that are linked
  to it: SPREADING ACTIVATION.

  Words prime themselves because repeated exposure to a word increases its
  activation level; thus it will be easier to respond to the next time.

  Words are primed by semantically related words because of spreading
  activation.

1.3.2.3 Thus priming effects can be used to study the mental representation of
words: if you know what A is like, and you know that A primes B, then you know
that B must be represented in the lexicon in a way similar to A.

However, only semantic priming is reliable and consistent. Phonological
priming and morphological priming are much more complex and are not well
understood.
In fact, phonological "priming" can actually cause the response to the target
to slow down ("inhibitory priming"). (Any guesses why this might be...?)

Priming effects can be used to study the time-course of lexical access; e.g.
in processing a sentence, what words get activated at what time? (More on this
shortly.)

1.3.3 Frequency and priming as tools in the study of modularity

Words may be semantically ambiguous (have more than one meaning):

  "letter"    "bug"    飯碗    中餐

Frequency effects: common (primary) meanings are more rapidly accessed than
uncommon (secondary) meanings.

Hogaboam and Perfetti (1975):

  Subjects given sentences ending in semantically ambiguous words, e.g.
  "letter". Preceding context consistent either with the primary meaning (1)
  or with the secondary meaning (2):

    (1) The jealous husband read the letter.
    (2) The antique typewriter was missing a letter.

  Task: Is the last word ambiguous?

  Results: Subjects are faster with (2).
    In (1), context just reinforces primary meaning; accessing secondary
    meaning takes longer.
    In (2), context reminds subjects of the secondary meaning; primary meaning
    is accessed quickly.

Modularity (?) in the processing of lexical ambiguity, using frequency effects
& priming as research tools:

  Do people access all meanings of a word even if the preceding context makes
  it clear that only one meaning is relevant? (NOT: "Does the sentential
  context have NO effect?" Obviously it does eventually -- but is the effect
  immediate and automatic or not?)

  Method: The CROSS-MODAL lexical decision task (also known as the cross-modal
  priming technique): Swinney (1979)

  Sample auditory stimulus:

    "Rumor had it that, for years, the government building has been plagued
    with problems. The man was not surprised when he found several spiders,
    roaches, and other BUGS in the corner of this room."
  Sample visual stimuli for lexical decision:

    ANT    SPY    SEW

  Coordination between the two types of stimuli:

    SEE:   ....................................ANT
    HEAR:  ...spiders, roaches, and other bugs ^ in the corner...

  Results: "bugs" primes both ANT [related to primary meaning] and SPY
  [related to secondary meaning], regardless of preceding sentential context.

  Conclusion: English listeners automatically access all meanings of an
  ambiguous word, independently of sentential context.

  Other experiments by Swinney show that the effect fades several words later
  (e.g. in above example, after a few words only ANT will be primed).

Ahrens (1998) did a Chinese version of this experiment:

    如果有外國客人來訪，我通常不招待他們吃西式飲食，
    而會請他們吃【中餐】，這樣他們才會有新鮮的感覺，
    並且留下比較深刻難忘的經驗。

    HEAR:  ...而會請他們吃中餐...
    SEE:   ("primary" condition)    ....午飯
           ("secondary" condition)  ....東方
           ("control" condition)    ....文法
           ("nonword" condition)    ....美們

  Task: is the visually presented item a 詞 or not?

  Here the prime is the auditory word 【中餐】, and the targets are the various
  visually presented words.

  Results: same as for English: both "primary" and "secondary" targets are
  primed relative to the control.

This effect is theoretically important, since it implies that the processing
of words and the processing of sentences are initially independent of each
other. Not everybody likes this "modularity" hypothesis, though, and it hasn't
been supported by all researchers (e.g. Tabossi and Zardon 1993, who studied
Italian).

2. Morphological structure

2.1 MORPHOLOGICALLY COMPLEX words: words containing more than one morpheme
(also called "polymorphemic" words).

  Examples:
    banana-s, walk-ing, re-apply, un-happy,
    anti-dis-establish-ment-arian-ism
    番茄、外國人、可愛、工業化、小朋友們

  Sometimes morphemes are more difficult to determine:
    re-sult?    sn-eeze? (cf. snot, snore, sneer)
    桌子、蝴蝶、奇異果 (from English "kiwi")

2.2 One of the key questions: Does lexical access involve MORPHOLOGICAL
DECOMPOSITION: breaking such words into their morphemes?

2.3 Many experiments of many kinds address morphological processing. We
already saw one above (Sereno and Jongman 1997). Here are some more (some
involve language production rather than language perception/comprehension).

MacKay (1978): morphological processing in production?

  Subjects given verbs:   govern        decide
  Must derive nouns:      government    decision

  Results: "government" fastest (no phonological changes)
           "decision" slowest (two changes: [aj] -> [I], [d] -> [zh])

  Interpretation: subjects really do make those changes when producing
  morphologically complex words; therefore such words are not stored simply as
  wholes.

  Criticism: subjects are explicitly told to derive morphologically complex
  nouns from verbs! Perhaps people do not do this normally.

Taft (1981): "prefix-stripping" in word recognition?

  Subjects given written words with prefixes:   REMIND
  or with "pseudoprefixes" (i.e. not real prefix):   RELISH

  Lexical decision task.

  Results: RT's are faster for prefixed words.

  Interpretation: Morphological processor automatically "strips off" anything
  that looks like a prefix (e.g. "RE"), then searches for the base in the
  lexicon. With words like REMIND, it will find MIND (real word), but with
  words like RELISH, will not find *LISH, thus wasting time.

  Criticism: although subjects are not explicitly told to strip off prefixes,
  maybe they are implicitly told this by the kind of word list they get.
  Rubin, Becker and Freeman (1979) found that if the word set contains 50%
  prefixed forms, subjects use a "prefix-stripping" strategy, but if there's
  only 10% prefixed forms (more realistic), subjects do not show this effect.

Taft and Zhu (1995): Are morphemes relevant in the recognition of written
Chinese words?

  Materials: "binding characters", i.e.
  characters that only appear in one word (e.g. 蚯蚓) -- because these are
  two-syllable monomorphemic words; and "non-binding, position-specific
  characters", i.e. characters that appear in more than one word, but always
  in first position (e.g. 殉) or always in second position (e.g. 侶).

  Task: Subjects saw individual characters on a computer screen, and had to
  "name" them (i.e. read them aloud).

  Results: Subjects were faster to name first-position binding characters
  (e.g. 蚯) than second-position binding characters (e.g. 蚓), but non-binding
  characters showed no position effect.

  Interpretation: For monomorphemic "binding" words, subjects had to access
  the entire word in order to find the pronunciation of each character, which
  implies that such words are stored as whole units. For polymorphemic
  non-binding words, subjects accessed each character independently.

  Question: Is this just a matter of orthography, or would monomorphemic,
  non-binding words (e.g. 沙發) behave differently? That's for future
  research!

Zhou and Marslen-Wilson (1994): Are morphemes relevant in the recognition of
spoken Chinese words?

  Subjects were given 4 kinds of spoken two-morpheme words, all of which were
  semantically transparent (i.e. the meaning of the whole was closely related
  to the meaning of each part):

    high frequency word, high frequency first morpheme
    high frequency word, low frequency first morpheme
    low frequency word, high frequency first morpheme
    low frequency word, low frequency first morpheme

  Task: lexical decision (thus also equal number of nonwords)

  Results: strong effect of word frequency; no effect of morpheme frequency!

  Interpretation: In strong contrast to results on written Chinese characters,
  spoken Chinese words are NOT morphologically decomposed in comprehension (at
  least not initially -- priming studies like Zhou and Marslen-Wilson (1995)
  imply that morphemes within compounds are actively processed even in spoken
  word recognition).

3. Models of lexical semantics

3.1 Semantic feature theory (commonly assumed by linguists):

Claim: A category is defined by the sharing of features. For example:

  NOUNS:
    man         [MALE][ADULT]
    husband     [MALE][ADULT][MARRIED]
    bachelor    [MALE][ADULT][NOT[MARRIED]]
    horse       [CONCRETE] [ANIMAL] [MAMMAL] ...
    robin       [CONCRETE] [ANIMAL] [BIRD] ...
    tree        [CONCRETE] [PLANT] ...
    dream       [ABSTRACT] ...

  VERBS:
    die         [BECOME] [[NOT] [ALIVE]]
    kill x      [CAUSE] [[BECOME] [[NOT] [ALIVE (x)]]]
    murder x    [CAUSE] [[BECOME] [[NOT] [ALIVE (x)]]] AND [PERSON (x)]

Problem: Not all psychologically real categories can be accurately defined
with features!

  "fruit" = "edible reproductive body of a seed plant" (thus includes "apple",
  "banana", "coconut", "tomato", "green pepper", "chilli pepper", etc.)

  But Americans do not consider "tomato", "green pepper", and "chilli pepper"
  to be fruit! (instead, the first two are "vegetables" and the third is a
  "spice")

  Note cultural factor: Chinese do consider "tomato" to be a fruit!

  Also, there are "good examples" of fruit, and "bad examples." Thus "coconut"
  is technically a fruit, but it's not a good example.

3.2 Prototype theory (e.g. Rosch 1973):

Claim: A category is defined by the sharing of "family resemblances", i.e.
similarity to some other members in the category. There need not be a single
feature or set of features shared by ALL the members.

PROTOTYPE: the "best" member of the category, as determined by people's
judgments; it is also typically named by the highest frequency word in its
category.

  Example: Category "bird"
    in the USA, "robin" is prototypical.
    "eagle" is pretty good (shares many similarities with "robin")
    "penguin" and "ostrich" are pretty bad
    "bat" is terrible (it's not even in the category)

  What is the prototypical bird for Taiwanese???

A typical prototype effect in a SEMANTIC VERIFICATION task:

  Subjects are presented with sentences like:
    (1) "A robin is a bird."
    (2) "A penguin is a bird."
    (3) "A bat is a bird."
    (4) "A horse is a bird."
  Task: True or false?

  Results: (1) is responded to more quickly than (2), because "robin" is a
  more prototypical exemplar of the "bird" category.

Prototype effects on Chinese classifiers: (Ahrens 1994)

  張 : classifier for flat things
       prototypical example: 紙

  Subjects must name objects shown in pictures; record choice of classifier,
  e.g. 張、條、個、隻

  Results: "correct" classifier is used less frequently for less prototypical
  examples.

    Percent usage of 個 or 隻 instead of 張:

      紙      床      桌子     沙發
      0%     30%     45%     65%

Prototype effects can be modeled in connectionism by assuming there is a
strong connection between a prototype node and a category node; e.g. 張 is
strongly linked to 紙, but only weakly linked to 沙發.

Logical problem with prototype theory: This theory tries to respond to the
problems with the "feature" approach to lexical semantics. And yet prototype
theory uses the concept of "similarity." How is "similarity" defined, except
by using features????

REFERENCES

Ahrens, K. (1994). Classifier production in normals and aphasics. Journal of
    Chinese Linguistics, 22, 203-247.
Ahrens, K. (1998). Lexical ambiguity resolution: languages, tasks, and timing.
    Syntax and Semantics, 31, 11-31.
Caplan, D. (1993). Language: Structure, processing and disorders. Cambridge,
    MA: MIT Press.
Forster, K. I. (1976). Accessing the mental lexicon. In R. J. Wales and E.
    Walker (Eds.) New approaches to language mechanisms (pp. 257-287).
    North-Holland.
Gaygen, D. E., & Luce, P. A. (1998). Effects of modality on subjective
    frequency estimates and processing of spoken and printed words. Perception
    and Psychophysics, 60(3), 465-483.
Hogaboam, T. W., & Perfetti, C. A. (1975). Lexical ambiguity and sentence
    comprehension. Journal of Verbal Learning and Verbal Behavior, 14,
    265-274.
Hung, D. L., Tzeng, O. J. L., & Ho, C.-Y. (1999). Word superiority effect in
    the visual processing of Chinese. In O. J. L. Tzeng (Ed.) Journal of
    Chinese Linguistics Monograph Series No.
    13: The biological bases of language, 61-95.
MacKay, D. G. (1978). Derivational rules and the internal lexicon. Journal of
    Verbal Learning and Verbal Behavior, 17, 61-71.
Marslen-Wilson, W. (1987). Functional parallelism in spoken word recognition.
    Cognition, 25, 71-102.
Paap, K. R., McDonald, J. E., Schvaneveldt, R. W., & Noel, R. W. (1987).
    Frequency and pronounceability in visually presented naming and
    lexical-decision tasks. In M. Coltheart (Ed.) The psychology of reading,
    vol. 12 of Attention and Performance. Erlbaum.
Rosch, E. H. (1973). On the internal structure of perceptual and semantic
    categories. In T. E. Moore (Ed.) Cognitive development and the acquisition
    of language (pp. 111-144). Academic Press.
Rubenstein, H., & Pollack, I. (1963). Word predictability and intelligibility.
    Journal of Verbal Learning and Verbal Behavior, 2, 147-158.
Rubin, G. S., Becker, C. A., & Freeman, R. H. (1979). Morphological structure
    and its effect on visual word recognition. Journal of Verbal Learning and
    Verbal Behavior, 18, 757-767.
Sereno, J. A., & Jongman, A. (1997). Processing of English inflectional
    morphology. Memory and Cognition, 25(4), 425-437.
Swinney, D. A. (1979). Lexical access during sentence comprehension:
    (re)consideration of context effects. Journal of Verbal Learning and
    Verbal Behavior, 18, 645-659.
Tabossi, P., & Zardon, F. (1993). Processing ambiguous words in context.
    Journal of Memory and Language, 32, 359-372.
Taft, M. (1981). Prefix stripping revisited. Journal of Verbal Learning and
    Verbal Behavior, 20, 289-297.
Taft, M., & Zhu, X. (1995). The representation of bound morphemes in the
    lexicon: a Chinese study. In L. B. Feldman (Ed.) Morphological aspects of
    language processing (pp. 293-316). Lawrence Erlbaum.
Zhou, X., & Marslen-Wilson, W. (1994). Words, morphemes and syllables in the
    Chinese mental lexicon. Language and Cognitive Processes, 9(3), 393-422.
Zhou, X., & Marslen-Wilson, W. (1995). Morphological structure in the Chinese
    mental lexicon.
    Language and Cognitive Processes, 10(6), 545-600.