The Process - Corpus Analysis

This section discusses the various tools that I used for the analysis of my core corpus data, i.e. Wmatrix, Sketch Engine, and NVivo. As indicated in the previous section of this chapter, I compiled two corpora - a broad corpus and the core corpus, which is a subcorpus of the broad corpus. The core corpus was further divided into a multimodal corpus (which included the original adverts and articles) and a monomodal corpus for which I extracted all the text present in the adverts and articles. The construction of this monomodal corpus provided an alternative way into the data and allowed me to look into various linguistic features and issues of lexis and grammar. Tools such as Wmatrix and Sketch Engine, which are only equipped for monomodal data, provide a rich insight into the language data (e.g. by POS-tagging and, in the case of Wmatrix, semantic domain tagging). The patterns and dominant semantic domains identified by the monomodal corpus tools helped inspire some of the coding of the multimodal corpus. Moreover, I returned to the monomodal corpus several times to explore linguistic patterns highlighted during the coding process of the multimodal corpus in NVivo.

Step One: Wmatrix and Sketch Engine

Wmatrix is a software corpus tool, developed by Paul Rayson, which is particularly appealing as it offers semantic tagging in addition to the more standard functions available in other corpus tools, such as frequency lists, concordances, and keyword lists. Adopting the UCREL Semantic Analysis System (USAS), Wmatrix can flag up the dominant semantic domains present in the data and may point to patterns that would otherwise remain undetected. For example, the semantic tagger identified the importance of looking at ‘anonymity’, ‘confidentiality’, and ‘hiddenness’ in advertising for cosmetic procedures, as this proved a popular selling point in the 2001 adverts but decreased in popularity over time. In addition to the anonymous/confidential tag, some of the other results of the initial analysis of the monomodal corpus in Wmatrix inspired or confirmed final coding categories used in NVivo.

Alongside the existing semantic tags used by Wmatrix, I was able to introduce my own semantic tags where I felt the existing ones lacked specificity. For example, as my corpus included many technical, biological, and medical terms, I introduced a new label under tag Y (‘science and technology’) and expanded several tags to include terms that were prevalent in my corpus but were not recognised by the existing tagset (such as medical names for procedures).

An important caveat of the semantic tagger is that it cannot account satisfactorily for a word’s contextual or genre-specific meaning, although attempts at disambiguation have been made (see Rayson et al. 2004). For this reason, Rayson himself (2008: 528) has emphasised the importance of the researcher’s qualitative examination of the tags that the software assigns to a particular text. Following Rayson’s advice, I went through all of the semantic tags that Wmatrix had assigned to my dataset to make sure these tags had been assigned correctly. This protracted procedure turned out to be valuable, as some of the tags that the software assigned seemed to reflect the connotative rather than the denotative meaning of a term. For example, ‘cosmetic surgery’, clearly a key term in my dataset, was assigned the tags S1.2.3+ (selfish) and E4.1+ (happy). As I did not want to include these subjective interpretations, I had to manually compile an updated version of the USAS tagset specific to my corpus data and re-run the programme.
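The manual correction step described above can be sketched in a few lines of Python. This is an illustration only and not Wmatrix's own interface or file format: the data structure, the function name `apply_overrides`, the replacement tag B3, and the tag shown for ‘confidential’ are all assumptions made for the example. Only the two tags originally assigned to ‘cosmetic surgery’ come from the analysis itself.

```python
from typing import Dict, List, Tuple

# One tagged item: (word or multiword unit, list of semantic tags).
TaggedItem = Tuple[str, List[str]]

# Assumed corpus-specific overrides. "B3" is an illustrative replacement tag,
# not necessarily the tag that was used in the study.
OVERRIDES: Dict[str, List[str]] = {"cosmetic surgery": ["B3"]}


def apply_overrides(items: List[TaggedItem],
                    overrides: Dict[str, List[str]]) -> List[TaggedItem]:
    """Replace the automatic tags of every item that appears in `overrides`."""
    return [(text, overrides.get(text.lower(), tags)) for text, tags in items]


# Hypothetical tagger output; the tag for "confidential" is assumed.
tagger_output: List[TaggedItem] = [
    ("cosmetic surgery", ["S1.2.3+", "E4.1+"]),
    ("confidential", ["A10-"]),
]

corrected = apply_overrides(tagger_output, OVERRIDES)
for text, tags in corrected:
    print(text, tags)
```

Items without an entry in the override list keep their automatic tags, so only the terms that were reviewed and judged to be mis-tagged are changed.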

As well as making use of Wmatrix, I found Sketch Engine helpful as it provided me with an insight into the relevant keywords present in adverts for cosmetic procedures and (other) beauty products/services. Moreover, the software provided me with a clear overview of the concordances around these particular keywords. As I coded my data in NVivo, I frequently returned to Sketch Engine to explore certain patterns in lexis that I thought might be relevant to my analysis (e.g. see the discussion of ‘contour’, ‘lift’, and ‘sculpt’ in Chapter 7.2).
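For readers unfamiliar with concordances, the following minimal Python sketch shows what a keyword-in-context (KWIC) view involves. It is a simplified stand-in and not Sketch Engine's implementation: the function `kwic`, the simple word-based tokenisation, and the sample sentence are all invented for illustration.

```python
import re
from typing import List, Tuple


def kwic(text: str, keyword: str, window: int = 4) -> List[Tuple[str, str, str]]:
    """Return (left context, keyword, right context) for each hit of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append((left, token, right))
    return lines


# Invented sample text, not taken from the corpus.
sample = "Lift and contour your face. Our gentle lift will sculpt a natural look."

for left, node, right in kwic(sample, "lift", window=3):
    print(f"{left:>20} | {node} | {right}")
```

Aligning the hits on the keyword in this way makes recurring neighbours easier to spot, which is the kind of lexical pattern explored for ‘contour’, ‘lift’, and ‘sculpt’.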

Corpus Tools for the Analysis of Multimodal Data

As Bateman (2008: 249) has noted, multimodal analyses have been criticised for being impressionistic, informal, and interpretative. Arguably, a degree of interpretation is inevitable in the analysis of graded rather than discrete signification. In an attempt to address these critiques, multimodal research is increasingly adopting corpus methods to aid in data collection and analysis. However, the development of multimodal corpus tools, especially those including non-linear data, has been slow as many tools are not equipped to engage with spatially or temporally organised content. Whereas syntactic parsers and part-of-speech taggers can be applied to monomodal data to aid analysis, this kind of extra information cannot be attached to multimodal data (Hiipala 2016: 214).

Because of the confusion and issues related to the creation of multimodal corpora, the available tools and the nature of the corpora themselves vary widely. The acceptable size of a multimodal corpus in particular has attracted discussion. In his examination of tourist brochures, Hiipala (2016) includes 58 double-paged spreads, whereas Lirola and Chovanec (2012) sampled 20 leaflets promoting cosmetic surgery, of which they discuss only two. The variation in size of multimodal corpora can be explained by the different approaches to the analysis of the data. The application of Bateman’s (2008) Genre and Multimodality model, for instance, yields a rich analysis but is time-consuming: the preparation of Hiipala’s corpus took three and a half years. However, it may not always be necessary, or even desirable, to produce such a rich analysis; as Hiipala (2016: 212-213) himself has noted, the broader (or more selective) analysis of larger multimodal corpora is better at “quantifying graphic elements, examining collocations, or drawing broad text-image relations...” and providing a “broader social and cultural perspective”.

Although few tools have been developed for the study of multimodal documents, several software programmes have either been designed especially for the cataloguing and arrangement of multimodal data - e.g. the Multimodal Analysis Image Software - or have been adapted to be able to organise and analyse both written and visual data (e.g. NVivo). As NVivo constitutes a powerful tool which includes several ways of coding both verbal and visual data, as is explored in the next section, I decided to use the software for the annotation and analysis of my data.

Step Two: NVivo

Computer-assisted qualitative data analysis software programmes such as NVivo are useful “data management packages” (Zamawe 2015), which provide tools for the collection, annotation, and analysis of datasets. As Hoover and Koerber (2011: 68) have argued, NVivo enhances the efficiency, multiplicity, and transparency of qualitative research. Considering the vast array of data collected for this project, the options that NVivo provides to organise these different materials were particularly attractive. Moreover, the possibility of arranging the data by means of hierarchically ordered ‘nodes’ and the various ways of querying the data streamlined the coding and analysis.

Building on themes in the literature and some of the patterns that emerged in the analysis of the monomodal corpus - in addition to several informal examinations of the multimodal data in the familiarisation stage - a preliminary list of coding categories was established in NVivo. As mentioned previously, these coding categories evolved during the process and were reconfigured several times. The main (‘mother’) and sub (‘child’) nodes in particular were adjusted repeatedly, as particular themes were deemed to be part of larger, overarching themes. Examples of the resulting layered coding categories can be seen in Figures 3.3 and 3.4 - here, the ‘celebrities’ and ‘nature’ coding categories are the mother nodes, each containing several child nodes, shown to their right.

Although NVivo was not developed to engage with multimodal data, it provided a relatively straightforward way of coding both the visual and textual elements of the data. However, when the coding stage was completed and the data was queried, it transpired that NVivo is not fully reliable when used for the analysis of multimodal documents.

Figure 3.3 Example of coding category in NVivo.

Note: Celebrities is the mother node here, containing the child nodes textual reference and visual representation. In turn, visual representation contains several types of visual representations of celebrities, namely model, sports hero, actor/ actress, and miscellaneous.

Figure 3.4 Example of coding category in NVivo.

Note: Nature is the mother node here, containing the child nodes natural way or manner, body’s own, naturally ‘you’, natural ingredient or product, natural look or feel, and miscellaneous.

For example, the software at times produced graphs and tables in which the totals did not add up, for reasons that remained unclear. Fortunately, this quickly became apparent, and the charts and statistics included in the following chapters have therefore all been recounted by hand to make sure any errors were filtered out.
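The logic of such a recount can be sketched in Python as follows. The recount for this project was done by hand, so this is purely an illustration of the check: the record format, the document identifiers, the counts, and the function names `recount` and `find_mismatches` are all hypothetical, and only the node names come from Figures 3.3 and 3.4.

```python
from collections import Counter
from typing import Dict, List, Tuple

# Hypothetical coding records: (document id, mother node, child node).
records: List[Tuple[str, str, str]] = [
    ("ad_001", "celebrities", "visual representation"),
    ("ad_001", "nature", "natural look or feel"),
    ("ad_002", "celebrities", "textual reference"),
    ("ad_003", "celebrities", "visual representation"),
]

# Hypothetical totals as they might appear in an exported table.
reported: Dict[str, int] = {"celebrities": 4, "nature": 1}


def recount(coded: List[Tuple[str, str, str]]) -> Counter:
    """Count coded references per mother node directly from the raw records."""
    return Counter(mother for _, mother, _ in coded)


def find_mismatches(totals: Dict[str, int],
                    coded: List[Tuple[str, str, str]]) -> Dict[str, Tuple[int, int]]:
    """Map each node whose reported total differs to (reported, recounted)."""
    actual = recount(coded)
    return {node: (value, actual.get(node, 0))
            for node, value in totals.items()
            if value != actual.get(node, 0)}


mismatches = find_mismatches(reported, records)
print(mismatches)
```

Any node that appears in the result has a reported total that disagrees with the underlying coding records and would need to be corrected before being used in a chart or table.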

As discussed in the introduction to this chapter, this project comprised two stages. Having discussed the first, corpus-related stage, I turn in the next section to the second stage of the research, which aimed to elicit responses to the advertising and editorial materials. The (group) interviews conducted in this stage also provided a better understanding of people’s engagement with the current beauty market.
