COGNITIVE APPROACH TO NATURAL LANGUAGE PROCESSING
In the first series of results, we examine how well corpus-based methods predict the manually obtained, CCP-derived predictability scores. A large amount of explained variance would indicate that predictability could be replaced by automatic methods. As baseline predictors we use pos and freq, which together explain 0.243/0.288 of the variance for the first and second halves of the dataset, respectively. Table 10.2 reports results for each single corpus-based predictor alone and in combination with the baseline, as well as for all combinations of the baseline with the n-gram, topic and neural models from the same corpus.
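As a minimal sketch of the evaluation metric used throughout this section: for a single predictor, the explained variance r2 is simply the squared Pearson correlation between the predictor and the empirical predictability scores. The data below are illustrative toy values, not the chapter's predictors.

```python
# Sketch: explained variance (r^2) of a single predictor against
# empirical predictability scores. Toy data only; the chapter's actual
# predictors (pos, freq, n-gram scores, ...) are not reproduced here.
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return sxy / (sx * sy)

# Hypothetical predictor scores vs. predictability (illustrative values)
predictor = [0.1, 0.4, 0.35, 0.8, 0.7, 0.95]
predictability = [0.05, 0.3, 0.4, 0.75, 0.6, 0.9]

r = pearson_r(predictor, predictability)
r2 = r ** 2  # for a single predictor, r^2 is the explained variance
print(round(r2, 3))
```

For multi-predictor combinations such as baseline + n-gram, the same r2 would come from a multiple linear regression rather than a single correlation.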
Table 10.2. r2 explained variance of predictability, given for two halves of the dataset, for various combinations of baseline and corpus-based predictors
It is apparent that the n-gram model scores best, and the neural model alone also reaches r2 levels that approach the baseline. In contrast, much like our earlier top-three topics approach [BIE 15], the mixture of all topics explains only a relatively small amount of variance. Combining the baseline with the n-gram predictor already comes very close to the combination of all predictors; it may therefore provide the best compromise between parsimony and explained variance. This performance is closely followed by the recurrent neural network (see Figure 10.1).
Figure 10.1. Prediction models, exemplified for the NEWS corpus, on the x-axes and the N = 1,138 predictability scores on the y-axes. A) Prediction by baseline + n-gram (r2 = 0.475), B) by a recurrent neural network (r2 = 0.437) and C) by a model containing all predictors (r2 = 0.478). The three pairwise Fisher's r-to-z tests revealed no significant differences in explained variance (Ps > 0.18). For a color version of this figure, see www.iste.co.uk/sharp/cognitive.zip
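The model comparison in Figure 10.1 can be sketched as follows. This is the standard independent-samples form of Fisher's r-to-z test; the chapter does not specify which variant was used (the models are fitted on the same N = 1,138 scores, so a dependent-correlations test may have been applied instead), so treat this as an approximation.

```python
# Sketch: Fisher's r-to-z test for the difference between two
# correlations, in its standard independent-samples form.
import math

def fisher_r_to_z_test(r1, r2, n1, n2):
    """Return (z, two-sided p) for the difference between two
    correlations r1 (sample size n1) and r2 (sample size n2)."""
    z1 = math.atanh(r1)  # Fisher's r-to-z transform
    z2 = math.atanh(r2)
    se = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / se
    # two-sided p-value from the standard normal CDF
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return z, p

# Correlations recovered from the figure's r^2 values
r_ngram = math.sqrt(0.475)  # baseline + n-gram
r_rnn = math.sqrt(0.437)    # recurrent neural network
z, p = fisher_r_to_z_test(r_ngram, r_rnn, 1138, 1138)
print(round(z, 3), round(p, 3))
```

Under these assumptions the approximation yields p ≈ 0.22, consistent with the non-significant differences (Ps > 0.18) reported in the caption.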
We also fitted a model based on all corpus-based predictors from all corpora, which achieved the overall highest r2 (0.490/0.507). In summary, about half of the empirical predictability variance can be explained by combining positional and frequency features with either a word n-gram language model or a recurrent neural network.
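To make the n-gram predictor concrete, here is a minimal bigram language model with add-one smoothing, scoring a word's conditional probability in context as a predictability proxy. The corpus and tokenization are toy stand-ins, not the NEWS corpus or the model actually used in the chapter.

```python
# Hedged sketch: a word bigram language model as a predictability proxy,
# with add-one (Laplace) smoothing. Toy corpus only.
from collections import Counter

corpus = ("the dog chased the cat . the cat ran away . "
          "the dog barked .").split()

unigrams = Counter(corpus)
bigrams = Counter(zip(corpus, corpus[1:]))
vocab_size = len(unigrams)

def bigram_prob(prev, word):
    """P(word | prev) with add-one smoothing over the toy vocabulary."""
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + vocab_size)

# A continuation that frequently follows "the" gets a higher conditional
# probability, mirroring how n-gram scores track contextual predictability.
print(bigram_prob("the", "dog"), bigram_prob("the", "barked"))
```

A real setup would use higher-order n-grams with a stronger smoothing scheme (e.g. Kneser-Ney) trained on the full corpus; the principle of scoring each target word by its conditional probability in context is the same.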