Lexemes and lemmas
Both grammatical and phonological words are kinds of word tokens. In examples (3.1) and (3.2), there are four occurrences of the word form hit, corresponding to two occurrences of the past participle hit, and one occurrence each of the preterite and infinitive. Given that grammatical words are defined as word forms with fixed meanings, their identification depends on assumptions about the parts of speech and morphosyntactic properties in a language.
Insofar as occurrences of grammatical words are tokens of a common word ‘type’, they imply another, more abstract notion of ‘word’. This notion of ‘word type’ usually goes under the name lexeme. Matthews (1972:160) characterizes the lexeme in this sense as “the lexical element ... to which the forms in [a] paradigm as a whole ... can be said to belong”.11 In a later discussion of the same point, Matthews (1991:26) suggests that a lexeme is “a lexical unit and is entered in dictionaries as the fundamental element in the lexicon of language”. Lexemes are thus reminiscent of the lexicographer’s notion of a lemma, which is the citation form of an item or the headword under which it is listed in a dictionary.12 The connection between these notions is reinforced by the fact that lexemes are conventionally represented by the citation form of an item in small caps, i.e., by the lemma of the item. However, whereas a lemma is a distinguished form, a lexeme is normally construed as a set of grammatical words. The fundamental contrast between phonological words (word forms), grammatical words (words) and lexemes is summarized in Figure 3.4.
Figure 3.4 Varieties of ‘words’ (cf. Matthews 1972, 1991)
The fact that lexical meanings do not enter into this opposition reflects the discriminative approach to meaning outlined in Chapter 8.4. The three-way contrast between word forms, grammatical words and lexemes resolves a systematic ambiguity in the use of the term ‘word’ as applied to the form hit, the preterite hit and the lexeme HIT. Not all items give rise to a full ternary split, since different notions of word may coincide in particular cases. There is usually at least a partial correlation between grammatical words and word forms. For closed-class categories, the distinction between lexeme and grammatical word may not be especially relevant or useful, since a preposition or conjunction will usually be associated with a single grammatical word. The same may be true of open-class categories in an isolating language such as Vietnamese if nouns and verbs are represented by single grammatical words.
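The three-way contrast can be made concrete with a small sketch. This is an illustration only, not part of the analysis above: the class name `GrammaticalWord`, the feature labels and the partial paradigms are assumptions chosen for exposition, and the lemma string stands in for the lexeme.

```python
from collections import defaultdict
from typing import NamedTuple


class GrammaticalWord(NamedTuple):
    """A word form paired with morphosyntactic features (hypothetical labels)."""
    lexeme: str    # lemma standing in for the lexeme, e.g. "HIT"
    features: str  # e.g. "preterite"
    form: str      # the word form (phonological word)


# Partial, illustrative paradigms; the feature inventory is an assumption.
WORDS = [
    GrammaticalWord("HIT", "infinitive", "hit"),
    GrammaticalWord("HIT", "preterite", "hit"),
    GrammaticalWord("HIT", "past participle", "hit"),
    GrammaticalWord("HIT", "present participle", "hitting"),
    GrammaticalWord("OF", "preposition", "of"),
]

# Group grammatical words by lexeme.
by_lexeme = defaultdict(list)
for w in WORDS:
    by_lexeme[w.lexeme].append(w)

# Compare the number of grammatical words with the number of distinct forms.
for lexeme, members in by_lexeme.items():
    forms = {w.form for w in members}
    print(f"{lexeme}: {len(members)} grammatical words, {len(forms)} word forms")
```

In this toy data, the single form hit realizes three grammatical words of one lexeme, while the closed-class item has one grammatical word and one form, so the three notions of ‘word’ coincide for it.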
Despite the fact that a lexeme may contain just a single grammatical word, treating lexemes as sets of words provides a coherent interpretation for an intuitive but otherwise formally obscure notion. By characterizing a lexeme as “the lexical unit that grammatical words are forms of”, Matthews (1972, 1991) invites questions about the nature of lexical units. Subsequent refinements tend to preserve the same attractive intuition and formal unclarity. Aronoff (1994) summarizes the properties of lexemes in the following terms:
To recapitulate, a lexeme is a (potential or actual) member of a major lexical category, having both form and meaning but being neither, and existing outside of any particular syntactic context. (Aronoff 1994:11)
in Hockett (1958: 169ff.), who uses the term to designate sequences that always occur as grammatical forms in a context where they are not part of any larger unit which also invariably occurs as a grammatical form.
12 There is also an alternative interpretation of the term ‘lemma’ still in circulation. In psycholinguistic models of speech production, lemmas are often construed as abstract conceptual entries that represent “the nonphonological part of an item’s lexical information” (Levelt 1989).
Stump (2001:277) likewise states that “[t]he notion of lexeme assumed here is that of e.g. Matthews 1972, Aronoff 1994” and repeats the quotation from Aronoff (1994) above. Other works in this tradition, such as Beard (1995), construe ‘lexeme’ in a way that tacitly adopts these definitions.
The characterization offered by Aronoff (1994) does not readily apply to any kind of conventional lexical ‘unit’ or ‘item’. However, it does accurately describe the properties of a set of grammatical words. If each grammatical word is construed as a pairing of features with a form, the set of pairs comprising a lexeme will share ‘a major lexical category’ and ‘meaning’, along with common elements of ‘form’, and a distributional range that is not coextensive with the ‘particular syntactic context’ in which any one of the grammatical words occurs.
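The set-based interpretation can be sketched as follows. The representation is an assumption made for illustration: the attribute names `cat`, `sense` and `vform`, the helper `word`, and the partial paradigm are all hypothetical, and ‘meaning’ is crudely approximated by a gloss.

```python
from typing import FrozenSet, NamedTuple, Tuple


class GrammaticalWord(NamedTuple):
    features: Tuple[Tuple[str, str], ...]  # sorted (attribute, value) pairs
    form: str


def word(form: str, **features: str) -> GrammaticalWord:
    """Build a grammatical word from a form and keyword features."""
    return GrammaticalWord(tuple(sorted(features.items())), form)


# A lexeme is modelled simply as a set of feature-form pairs.
HIT: FrozenSet[GrammaticalWord] = frozenset({
    word("hit", cat="verb", sense="strike", vform="infinitive"),
    word("hit", cat="verb", sense="strike", vform="preterite"),
    word("hit", cat="verb", sense="strike", vform="past participle"),
    word("hitting", cat="verb", sense="strike", vform="present participle"),
})


def shared_features(lexeme: FrozenSet[GrammaticalWord]) -> dict:
    """Return the attribute-value pairs common to every member of the set."""
    common = set.intersection(*(set(w.features) for w in lexeme))
    return dict(common)


print(shared_features(HIT))
```

On this toy encoding, the properties that hold of the lexeme as a whole are exactly those shared by all of its members (here the category and the gloss), while the inflectional feature that varies across members belongs to no single ‘unit’.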
Interpreting lexemes as sets of forms is also compatible with the disambiguating function of ‘lexemic indices’ in realizational accounts such as Stump (2001). The members of a given lexeme X can, if desired, be assigned an index iX. This index will then serve to distinguish grammatical words from different lexemes that happen to be realized by homophonous phonological words:
In general, it is necessary to regard a root X as being indexed for its association with a particular lexeme, since phonologically identical roots associated with distinct lexemes may exhibit distinct morphological behavior; in English, for example the root lie of the lexeme lie1 ‘recline’ must be distinguished from the root lie of the lexeme lie2 ‘prevaricate’, since their paradigms are different (e.g. past tense lay vs. lied). I shall treat this indexing as covert, but shall use a function ‘L-index’ to make overt reference to it when necessary; thus lie ‘recline’ carries a covert index lie1 (so that L-index(lie) = lie1), while lie ‘prevaricate’ carries a covert index lie2 (so that L-index(lie) = lie2). (Stump 2001:43)
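The disambiguating role of a lexemic index can be sketched in a few lines. This is a loose illustration rather than Stump's formalism: the `Root` type, the index labels and the lookup table `PAST` are assumptions, and a simple table lookup stands in for realization rules.

```python
from typing import NamedTuple


class Root(NamedTuple):
    form: str     # phonological shape of the root
    l_index: str  # covert lexemic index (hypothetical labels "LIE1", "LIE2")


# Past-tense forms keyed by lexemic index rather than by root shape.
PAST = {"LIE1": "lay", "LIE2": "lied"}


def L_index(root: Root) -> str:
    """Make the covert index of a root overt."""
    return root.l_index


def past_tense(root: Root) -> str:
    """Select the past-tense form by lexemic index, not by root shape."""
    return PAST[L_index(root)]


lie_recline = Root("lie", "LIE1")
lie_prevaricate = Root("lie", "LIE2")

# Same root shape, different indices, different past-tense forms.
print(lie_recline.form == lie_prevaricate.form)
print(past_tense(lie_recline), past_tense(lie_prevaricate))
```

Because the lookup is keyed by the index, two homophonous roots never compete for the same entry, which is the sense in which the index distinguishes grammatical words from different lexemes.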
On a classical conception in which the main morphological part-whole relation holds between words and larger collections of forms, lexemes will be a significant unit of organization within a speaker’s mental lexicon. Lexemes will intervene between inflectional paradigms and morphological families. However, it is lemmas, not lexemes, that are ‘entered into dictionaries’. The conventional lemma form of an item represents the lexeme, and it is only in cases of irregularity that additional members of the lexeme are listed as well.
-  This usage is more restricted than the notion of ‘lexeme’ as a minimal unit of syntactic analysis in Lyons (1963: 11f.), which also encompasses idiomatic phrases. An even earlier use of ‘lexeme’ is found
-  Similar remarks apply to the use of lexeme indices to guide the application of realization rules in accounts such as Blevins (2003) and Spencer (2003).