Bybee's points of critique
Perspective 1: CA and its goals
The most frequent, but by no means only, implementation of CA uses pFYE as an AM, which (i) downgrades the influence of words that are frequent everywhere and (ii) gives more weight to observed relative frequencies of co-occurrence that are based on high absolute frequencies of co-occurrence. Bybee (2010: 97) criticizes this by stating that the "problem with this line of reasoning is that lexemes do not occur in corpora by pure chance" and that "it is entirely possible that the factors that make a lexeme high frequency in a corpus are precisely the factors that make it a central and defining member of the category of lexemes that occurs in a slot in a construction." Using the Spanish adjective solo 'alone' as an example, she goes on to say that, for solo, "Collostructional Analysis may give the wrong results [my emphasis, STG], because a high overall frequency will give the word solo a lower degree of attraction to the construction according to this formula" (2010: 98).
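The two properties (i) and (ii) can be illustrated with a minimal sketch. It assumes a one-tailed Fisher-Yates exact test for attraction and expresses collostruction strength as -log10(p), a common convention that the passage above does not itself specify. The corpus size, the construction frequency, and all word counts are hypothetical and chosen only for illustration; they are not taken from Bybee's or any other data set.

```python
from math import comb, log10


def p_fye(a, word_total, cx_total, corpus_size):
    """One-tailed Fisher-Yates exact p-value for attraction.

    a            co-occurrences of word and construction (top left cell)
    word_total   all tokens of the word in the corpus
    cx_total     all tokens of the construction in the corpus
    corpus_size  total number of constructions assumed for the corpus
    """
    upper = min(word_total, cx_total)
    # Hypergeometric tail: probability of a or more co-occurrences by chance.
    tail = sum(
        comb(word_total, k) * comb(corpus_size - word_total, cx_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(corpus_size, cx_total)


def strength(a, word_total, cx_total, corpus_size):
    """Collostruction strength as -log10(p); larger means stronger attraction."""
    return -log10(p_fye(a, word_total, cx_total, corpus_size))


N, CX = 100_000, 500  # hypothetical corpus size and construction frequency

# (i) Same co-occurrence frequency (20), different overall word frequency.
rare_word = strength(20, 100, CX, N)
frequent_word = strength(20, 5_000, CX, N)

# (ii) Same relative frequency in the construction (20%), different absolute counts.
small_sample = strength(2, 10, CX, N)
large_sample = strength(20, 100, CX, N)

print(rare_word > frequent_word, large_sample > small_sample)
```

In this toy setting, the word that is frequent everywhere receives the lower strength, which is the behavior Bybee objects to for solo, and the 20-out-of-100 case outranks the 2-out-of-10 case although both have the same relative frequency.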
Perspective 2: CA and its mathematics/computation
Bybee (2010: 98) also takes issue with how the bottom right cell in the 2x2 tables is filled: "Unfortunately, there is some uncertainty about the fourth factor mentioned above – the number of constructions in the corpus. There is no known way to count the number of constructions in a corpus because a given clause may instantiate multiple constructions." Later in the text, however, she mentions that Bybee & Eddington tried different corpus sizes and obtained "similar results" (Bybee 2010: 98).
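One way to probe how much the uncertain bottom right cell matters is to recompute the association measure under several assumed corpus sizes and compare the resulting rankings. The following is a minimal sketch of such a check, again assuming a one-tailed Fisher-Yates exact test; the three collexemes, their counts, and the corpus sizes are hypothetical and do not reproduce Bybee & Eddington's procedure or data.

```python
from math import comb


def p_fye(a, word_total, cx_total, corpus_size):
    """One-tailed Fisher-Yates exact p-value for attraction (hypergeometric tail)."""
    upper = min(word_total, cx_total)
    tail = sum(
        comb(word_total, k) * comb(corpus_size - word_total, cx_total - k)
        for k in range(a, upper + 1)
    )
    return tail / comb(corpus_size, cx_total)


CX = 500  # hypothetical frequency of the construction

# Hypothetical collexemes: (co-occurrences with the construction, overall frequency).
collexemes = {"w1": (20, 100), "w2": (20, 1_000), "w3": (2, 10)}


def ranking(corpus_size):
    """Collexemes ordered from most to least attracted (smallest p first)."""
    return sorted(collexemes, key=lambda w: p_fye(*collexemes[w], CX, corpus_size))


# Changing corpus_size changes only the bottom right cell of each 2x2 table.
rankings = {n: ranking(n) for n in (100_000, 1_000_000, 10_000_000)}

for n, order in rankings.items():
    print(n, order)
```

For these particular counts the order of the collexemes is the same at all three corpus sizes, although the p-values themselves change. This is a property of the toy data, not a general guarantee: with other counts, rankings can shift as the assumed corpus size changes.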
Perspective 3: CA and its results, interpretation, and motivation
The perceived lack of semantics
Bybee criticizes CA for its lack of consideration of semantics. Specifically, she summarizes Bybee & Eddington (2006), who took "the most frequent adjectives occurring with each of four 'become' verbs as the centres of categories, with semantically related adjectives surrounding these central adjectives depending on their semantic similarity, as discussed above" (Bybee 2010: 98); this refers to Bybee & Eddington's (2006) classification of adjectives occurring with, say, quedarse, as semantically related. She then summarizes "[t]hus, our analysis uses both frequency and semantics" whereas "[p]roponents of Collostructional Analysis hope to arrive at a semantic analysis but do not include any semantic factors in their method. Since no semantic considerations go into the analysis, it seems plausible that no semantic analysis can emerge from it" (Bybee 2010: 98).
The perceived lack of semantics and discriminatory power
The above claim is also related to the issue of discriminatory/predictive power. To compare her raw frequency approach to CA, Bybee tests how well each approach discriminates among acceptability judgment data. For two Spanish verbs meaning 'become' (ponerse and quedarse) and twelve adjectives from three semantic groups (high frequency in c with these two verbs, low frequency in c but semantically related to the high-frequency ones, and low frequency in c and semantically unrelated to the high-frequency ones), the co-occurrence frequencies of the verbs and the adjectives, the frequency of the adjectives in the corpus, and the collostruction strengths were determined.
As Bybee mentions, frequency and collostruction strength make the same (correct) predictions regarding acceptability judgments for the high-frequency co-occurrences. However, semantically related low-frequency adjectives garner high acceptability judgments whereas semantically unrelated low-frequency adjectives do not. Bybee does not report any statistical analysis, but eyeballing the data seems to confirm this; she states "[o]f course, the Collostructional Analysis cannot make the distinction between semantically related and semantically unrelated since it works only with numbers and not with meaning" (2010: 100). She goes on to say "[t]hus for determining what lexemes are the best fit or the most central to a construction, a simple frequency analysis with semantic similarity produces the best results."
Finally, Bybee criticizes CA in terms of how "many such analyses" handle low-frequency collexemes, which are "ignored" (2010: 101). This is considered a problem because "low-frequency lexemes often show the productive expansion of the category" and "[w]ithout knowing what the range of low frequency, semantically related lexemes is, one cannot define the semantic category of lexemes that can be used in a construction" (2010: 101).
The absence of cognitive mechanisms underlying CA
From the above claims regarding the relation between frequency, collostruction strength, (semantic similarity), and acceptability judgments, Bybee infers, in agreement with Goldberg's earlier research, that high-frequency lexical items in constructional slots are central to the meaning of a construction. However, she also goes on to claim that
Gries and colleagues argue for their statistical method but do not propose a cognitive mechanism that corresponds to their analysis. By what cognitive mechanism does a language user devalue a lexeme in a construction if it is of high frequency generally? This is the question Collostructional Analysis must address. (2010: 100f.)